Giuseppe Attardi
d8a15dd0ba
Merge pull request #27 from gojomo/multiprocessing
multiprocess speedup; stdin/stdout/single-file options; stable ordering; sparser progress logging
2015-06-20 08:58:10 +02:00
Gordon Mohr
55beb4a426
restore default section-handling
2015-06-19 18:06:34 -07:00
Gordon Mohr
d420d729e7
more summary/timing logging; less bulk/repeat logging
2015-06-19 17:42:21 -07:00
Gordon Mohr
5b647e2249
stable ordering; skip dups; accept compressed template-file
2015-06-19 03:15:45 -07:00
Gordon Mohr
190aae11a1
up processes default to # cores
2015-06-18 14:52:20 -07:00
Gordon Mohr
e3515e2ecf
single-file; stdout; dir/multi-file
2015-06-18 14:49:03 -07:00
Gordon Mohr
5d32701400
messy 1st approach
2015-06-17 18:49:26 -07:00
Giuseppe Attardi
694cd5a7f4
Merge pull request #25 from dragoon/patch-2
multiline tag match fix
2015-06-14 09:12:30 +02:00
Roman Prokofyev
70ee947a8b
multiline tag match fix
need to add re.DOTALL so that multiline tag definitions are also matched
2015-06-12 14:23:14 +02:00
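The effect of re.DOTALL mentioned in the commit above can be illustrated with a minimal sketch. The pattern and the sample text are hypothetical and chosen only for illustration; they are not taken from the WikiExtractor source.

```python
import re

# Hypothetical sample: a tag whose attributes span two lines.
text = '<ref name="a"\n     group="b">body</ref>'

# Hypothetical pattern, for illustration only.
pattern = r'<ref.*?>'

# Without re.DOTALL, "." does not match a newline, so the tag
# definition that continues on the next line is not matched.
without_dotall = re.search(pattern, text)

# With re.DOTALL, "." also matches newlines, so the whole
# multiline tag definition is matched.
with_dotall = re.search(pattern, text, re.DOTALL)

print(without_dotall)
print(repr(with_dotall.group(0)))
```

The non-greedy `.*?` keeps the match from running past the first closing bracket once newlines become matchable.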
Giuseppe Attardi
625d4b69b3
Merge pull request #24 from dragoon/patch-1
Fix regex for <ref> tag when it's not self-closing
2015-06-10 18:40:48 +02:00
Roman Prokofyev
b99aaf19aa
Fix regex for <ref> tag when it's not self-closing
In some articles, the <ref> tag appears like this:
<ref name="Ahmed Rashid/The Telegraph">{{cite
The previous regex breaks when it sees the forward slash ("Rashid/The"). The new regex stops at the earliest occurrence of the closing bracket, so there is no need to pre-filter characters.
2015-06-10 15:40:27 +02:00
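The difference described in the commit above can be sketched as follows. Both patterns are assumptions made for illustration: the commit message does not quote the actual regexes, so `old_style` merely stands in for a pattern that pre-filters characters, and `new_style` for one that stops at the earliest closing bracket.

```python
import re

# Sample opening tag from the commit message.
text = '<ref name="Ahmed Rashid/The Telegraph">{{cite'

# Hypothetical "pre-filtering" pattern: excludes "/" inside the tag,
# so it cannot get past the slash in the attribute value.
old_style = re.compile(r'<ref[^/>]*>')

# Hypothetical "earliest closing bracket" pattern: a non-greedy match
# that accepts any characters up to the first ">".
new_style = re.compile(r'<ref.*?>')

old_match = old_style.search(text)
new_match = new_style.search(text)

print(old_match)
print(new_match.group(0))
```

With the non-greedy form, a slash inside an attribute value is treated like any other character, and the match still ends at the first `>`.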
Giuseppe Attardi
c15d93c40a
See ChangeLog.
2015-06-03 00:06:35 +02:00
Giuseppe Attardi
147e36df5b
See ChangeLog.
2015-06-03 00:01:45 +02:00
Giuseppe Attardi
5b0d88a16c
See ChangeLog.
2015-05-29 20:52:27 +02:00
Giuseppe Attardi
f041f9143f
Merge branch 'master' of https://github.com/attardi/wikiextractor
2015-05-06 16:09:12 +02:00
Giuseppe Attardi
d5cca5da43
See ChangeLog.
2015-05-06 16:08:27 +02:00
Giuseppe Attardi
45b4658e72
Update README.md
2015-04-26 08:57:25 +02:00
Giuseppe Attardi
b44b750056
Fixes for Chinese.
2015-04-26 08:47:53 +02:00
Giuseppe Attardi
f56f44caee
Handle chinese characters in #expr.
2015-04-25 11:52:01 +02:00
Giuseppe Attardi
3524141cef
Set UTF-8 as default.
2015-04-23 12:35:39 +02:00
Giuseppe Attardi
af68c87b3b
See ChangeLog.
2015-04-22 17:07:08 +02:00
Giuseppe Attardi
55ac23ebe6
Fix to replaceInternalLinks.
2015-04-22 13:41:39 +02:00
Giuseppe Attardi
b949bdad5a
Merge branch 'master' of https://github.com/attardi/wikiextractor
Edit done on site.
2015-04-22 12:43:24 +02:00
Giuseppe Attardi
363dd47666
See ChangeLog.
2015-04-22 12:42:42 +02:00
Giuseppe Attardi
4ecf09470d
Update README.md
2015-04-21 17:27:34 +02:00
Giuseppe Attardi
094e601327
Update README.md
2015-04-21 17:22:41 +02:00
Giuseppe Attardi
4e07a6c149
See ChangeLog.
2015-04-20 21:19:05 +02:00
Giuseppe Attardi
2f35bcf9e0
See ChangeLog.
2015-04-20 07:14:29 +02:00
Giuseppe Attardi
cae955eb91
See ChangeLog.
2015-04-20 06:56:29 +02:00
Giuseppe Attardi
4c90d10860
See ChangeLog.
2015-04-20 06:19:32 +02:00
Giuseppe Attardi
66caade3ad
See ChangeLog.
2015-04-19 13:17:48 +02:00
Giuseppe Attardi
858670beeb
See ChangeLog.
2015-04-19 11:32:39 +02:00
Giuseppe Attardi
1485e2b5bc
See ChangeLog.
2015-04-19 00:18:48 +02:00
Giuseppe Attardi
54814eb806
See ChangeLog.
2015-04-17 00:37:30 +02:00
Giuseppe Attardi
e3c3db047a
See ChangeLog.
2015-04-16 21:00:35 +02:00
Giuseppe Attardi
81476d4b13
See ChangeLog.
2015-04-15 18:20:32 +02:00
Giuseppe Attardi
c415cccf33
See ChangeLog.
2015-04-15 14:30:55 +02:00
Giuseppe Attardi
828c7e7e0a
See ChangeLog
2015-04-15 10:47:02 +02:00
Giuseppe Attardi
d6491e0763
See ChangeLog.
2015-04-15 03:43:02 +02:00
Giuseppe Attardi
f4e416ba3b
See ChangeLog.
2015-04-15 00:09:51 +02:00
Giuseppe Attardi
3a64bb0dd3
See ChangeLog.
2015-04-14 18:09:46 +02:00
Giuseppe Attardi
c907fd6f3f
Update README.md
2015-04-12 11:18:19 +02:00
Giuseppe Attardi
b36dcfc56a
New options.
2015-04-12 11:05:52 +02:00
Giuseppe Attardi
8ba0aba2f8
Enabled multithread version.
2015-04-12 10:21:35 +02:00
Giuseppe Attardi
e59196f26e
See ChangeLog.
2015-04-11 15:33:20 +02:00
Giuseppe Attardi
74ebbdbd85
see ChangeLog.
2015-04-11 12:29:31 +02:00
Giuseppe Attardi
cc6c077546
fix to loop in parameter expansion.
2015-04-11 03:43:32 +02:00
Giuseppe Attardi
9f49bada4d
Fixed infinite loop in template expansion.
2015-04-09 15:24:34 +02:00
Giuseppe Attardi
bc8066fcca
Update README.md
2015-03-22 13:59:58 +01:00
Giuseppe Attardi
f9121043ec
Update README.md
2015-03-22 13:58:50 +01:00