Commit Graph

  • e4abb4cbd0
    Fix typo Tino Desjardins 2020-03-29 07:48:13 +0200
  • 0654babb42
    Update cirrus-extract.py HjalmarrSv 2020-03-13 16:04:37 +0100
  • f3a18923bc
    Update README.md HjalmarrSv 2020-03-08 18:52:52 +0100
  • 1f708177a2
    Updated HjalmarrSv 2020-03-08 18:41:43 +0100
  • 69d0e766c3
    Update README.md HjalmarrSv 2020-03-07 15:41:34 +0100
  • 1f89728ca8
    fix for en, new folder structure HjalmarrSv 2020-03-07 15:31:48 +0100
  • be4183cd25
    Merge 1305cf6746 into 16186e290d zmleok 2020-03-05 00:47:54 +0300
  • b52469f76b
    Merge fc48a575f9 into 16186e290d operator-name 2020-03-05 00:47:46 +0300
  • bb34fde0ab
    Merge 45662a5c91 into 16186e290d Albert Villanova del Moral 2020-03-05 00:47:35 +0300
  • 3b4bf17b37
    Merge 124bcd6a42 into 16186e290d bierus 2020-03-05 00:41:42 +0300
  • c738549d10
    Merge 5a65bbdb25 into 16186e290d Ahad Suleymanli 2020-03-03 15:05:50 +0000
  • 4df4c86af7
    Merge fa60ec3f8e into 16186e290d Necmettin Çarkacı 2020-03-03 08:13:26 +0000
  • 86adb42a72
    If / gives problems in an os that uses \ in dir path HjalmarrSv 2020-03-02 10:41:13 +0100
  • 16186e290d
    Update WikiExtractor.py Giuseppe Attardi 2020-03-01 16:29:00 +0100
  • e3dca79742
    Update WikiExtractor.py Giuseppe Attardi 2020-03-01 16:23:39 +0100
  • 07552f782c
    Merge bc781f25aa into 3162bb6c3c HjalmarrSv 2020-03-01 07:48:06 +0800
  • bc781f25aa
    Adding comments HjalmarrSv 2020-03-01 00:31:05 +0100
  • 392fddf103
    Update README.md HjalmarrSv 2020-02-29 20:52:58 +0100
  • 4a4650d14f
    creates article files instead HjalmarrSv 2020-02-29 20:47:18 +0100
  • d0f2529b45
    fix for revid HjalmarrSv 2020-02-28 12:14:35 +0100
  • 3133906f88
    Update comments: Lua wants! HjalmarrSv 2020-02-07 11:09:31 +0100
  • e974947fc4
    too buggy - works when function shortcut, else not HjalmarrSv 2020-02-07 00:32:31 +0100
  • 1b2a695208
    the obvious bugs HjalmarrSv 2020-02-06 22:49:58 +0100
  • 3f672b0905
    expand formatnum functionality HjalmarrSv 2020-02-06 16:26:17 +0100
  • 01668eb045
    use raw, especially where there are \ HjalmarrSv 2020-02-06 11:09:26 +0100
  • 9c7d502f00
    Update comments in code HjalmarrSv 2020-02-05 11:08:56 +0100
  • d90b9e75db
    remove empty ( ) HjalmarrSv 2020-02-04 00:18:41 +0100
  • c4136d03d2
    remove multiples of '/n' HjalmarrSv 2020-02-03 23:52:23 +0100
  • 43a9b9d2a8
    basic formatdate and dateformat parsing HjalmarrSv 2020-02-03 22:02:29 +0100
  • 56d9852950
    improve cleaning HjalmarrSv 2020-02-03 21:13:00 +0100
  • 7a824b8c65
    Additional cleaning needed HjalmarrSv 2020-02-03 00:34:53 +0100
  • 970f94be2d
    set 'options.cleaned = True' HjalmarrSv 2020-02-03 00:22:59 +0100
  • c2353574a2
    this way it actually works HjalmarrSv 2020-02-02 23:07:14 +0100
  • 40bd592e50
    clean '! ...' - lines HjalmarrSv 2020-02-02 22:46:28 +0100
  • 063d3d12f1
    preparing for #dateformat tags HjalmarrSv 2020-02-02 14:32:40 +0100
  • effcc94e83
    clean up templatestyles HjalmarrSv 2020-01-30 00:45:17 +0100
  • fbb1199ad8
    version update HjalmarrSv 2020-01-29 22:02:08 +0100
  • a729546689
    a little better formatnum HjalmarrSv 2020-01-29 21:40:27 +0100
  • 89104740a1
    bug fix HjalmarrSv 2020-01-29 13:10:34 +0100
  • 0950b893e1
    --decimalcomma; decimal separator comma (,) for formatnum HjalmarrSv 2020-01-29 13:06:21 +0100
  • 63278c7d29
    refresh comments HjalmarrSv 2020-01-28 21:31:08 +0100
  • 7ccd3f81f1
    added example HjalmarrSv 2020-01-28 20:15:53 +0100
  • 62d2b0648d
    basic support for 'formatnum' HjalmarrSv 2020-01-28 19:51:12 +0100
  • bbdb4a80a7
    revert on 'small' HjalmarrSv 2020-01-27 22:26:29 +0100
  • fd1c7caeb1
    Update WikiExtractor.py HjalmarrSv 2020-01-27 20:34:39 +0100
  • 34ee0dc88c
    disambiguate command line parameter HjalmarrSv 2020-01-27 20:32:43 +0100
  • 49e3cd8374
    Added {{grammar: ...}} HjalmarrSv 2020-01-27 14:05:26 +0100
  • 6d755bff3d
    Added switch: '__NOGLOBAL__' HjalmarrSv 2020-01-26 00:59:20 +0100
  • 3b8bb7035d
    Make visible what is not supported currently HjalmarrSv 2020-01-25 14:18:58 +0100
  • 8bb7497a83
    bug fix HjalmarrSv 2020-01-25 00:09:34 +0100
  • 00a322ec7e
    Update README.md HjalmarrSv 2020-01-24 10:44:47 +0100
  • 6c64e75411
    --sentences, now at least 2 per article HjalmarrSv 2020-01-23 23:18:58 +0100
  • 24e6e85d95
    fix tab error HjalmarrSv 2020-01-23 20:29:15 +0100
  • 8a78aea725
    --sentences and --raw HjalmarrSv 2020-01-23 13:00:11 +0100
  • 5a65bbdb25
    let number of output directories be specified as an argument ahadsuleymanli 2020-01-23 13:07:36 +0300
  • c4321d01e0
    remove broken sentences HjalmarrSv 2020-01-21 20:38:21 +0100
  • d875198d08
    only one space before caret HjalmarrSv 2020-01-21 00:45:34 +0100
  • e3657c12e4
    Add urlbase to verbosity HjalmarrSv 2020-01-20 13:15:53 +0100
  • e399a97b84
    Remove code making gzip conditional HjalmarrSv 2020-01-20 12:58:19 +0100
  • 7b8a9d1cea
    Example on text only HjalmarrSv 2020-01-20 00:44:16 +0100
  • de50af02e8
    text only option HjalmarrSv 2020-01-20 00:33:23 +0100
  • 7047392852
    Example on cirrus dump HjalmarrSv 2020-01-19 22:52:39 +0100
  • a3cc0363d9
    fixed breaking errors HjalmarrSv 2020-01-19 22:49:10 +0100
  • d831277b57
    bugfixes HjalmarrSv 2020-01-16 21:06:49 +0100
  • 6088107253
    Update README.md HjalmarrSv 2020-01-16 10:41:55 +0100
  • eaf3d3480b
    --raw, --abstract_only HjalmarrSv 2020-01-16 10:31:48 +0100
  • 3255faea56
    currently no install HjalmarrSv 2020-01-12 11:13:45 +0100
  • ff7cf4dc66
    templates_only option and dropNested within and between ref tags HjalmarrSv 2020-01-09 14:36:27 +0100
  • 48dc35116a
    Update README.md HjalmarrSv 2020-01-09 13:48:45 +0100
  • 5ab7485265
    Update README.md HjalmarrSv 2020-01-08 20:36:57 +0100
  • e190f33a7f
    Update README.md HjalmarrSv 2020-01-08 20:24:24 +0100
  • ea73ede54c
    bug fix + a bit of removing of debug comments HjalmarrSv 2020-01-08 20:04:14 +0100
  • 0b9091b7bd
    --verbose and <BR> HjalmarrSv 2020-01-08 12:32:36 +0100
  • a63ab3a9d9
    Rollback of code for termination HjalmarrSv 2020-01-08 00:04:50 +0100
  • 050b586518
    fixed the last introduced bug HjalmarrSv 2020-01-07 22:21:19 +0100
  • 02757c61b3
    Handle broken pipe and keyboard interrupts HjalmarrSv 2020-01-07 22:12:05 +0100
  • fe2b6f2ea0
    cgi.escape with html.escape HjalmarrSv 2020-01-07 20:21:01 +0100
  • 7c37e62661
    discard score tags HjalmarrSv 2020-01-06 22:48:53 +0100
  • aef4bc8be7
    remove mapframe, maplink HjalmarrSv 2020-01-06 20:25:49 +0100
  • 761f67b53d
    max_articles, remove-html-tags in tested example HjalmarrSv 2020-01-06 19:54:18 +0100
  • 968324829a
    bug fixes HjalmarrSv 2020-01-06 19:41:16 +0100
  • 2f80d1ce13
    --max_articles HjalmarrSv 2020-01-06 18:20:11 +0100
  • b2c4a47ba1
    Option to restrict to specific pages by title HjalmarrSv 2020-01-05 13:54:40 +0100
  • 41fdefcfc2
    Update README.md HjalmarrSv 2020-01-05 00:38:10 +0100
  • c6b5d39d06
    Remove inline flags from the middle of a regex HjalmarrSv 2020-01-04 17:21:41 +0100
  • 5ae4bcd3f5
    Force 'utf-8' encoding HjalmarrSv 2020-01-04 17:12:00 +0100
  • 82f6d3cb10
    New example for Bert and json HjalmarrSv 2020-01-04 15:41:09 +0100
  • a4e4fdcfe9
    Augmented key regex to catch plus/minus signs HjalmarrSv 2020-01-04 13:02:35 +0100
  • d7eb4069bb
    broken middoc interwiki link HjalmarrSv 2020-01-04 12:38:53 +0100
  • bfda827dd2
    Template expansion HjalmarrSv 2020-01-04 12:21:18 +0100
  • 58d4352db4
    Concatenate many files to one file HjalmarrSv 2020-01-04 08:43:59 +0100
  • 703249a7e1
    Update README.md HjalmarrSv 2020-01-04 01:28:18 +0100
  • b4420f350f
    Update README.md HjalmarrSv 2020-01-04 01:14:17 +0100
  • 8bed74723e
    Added better example HjalmarrSv 2020-01-04 01:13:21 +0100
  • 33251533f7
    --for-bert HjalmarrSv 2020-01-04 00:46:37 +0100
  • a32e7b10c3
    Fix: separate docs HjalmarrSv 2020-01-04 00:34:59 +0100
  • 67903abe57
    Implement josecannete/wikiextractorforBERT HjalmarrSv 2020-01-04 00:11:25 +0100
  • ea31c41d96
    Added mediawiki links in comments HjalmarrSv 2019-12-31 12:31:53 +0100
  • d2ce1a8587
    spelling HjalmarrSv 2019-12-31 01:50:51 +0100
  • 4f192edeb4
    Update README.md HjalmarrSv 2019-12-29 12:57:18 +0100