-
cfe1092f21
fixing issue #62 by eliminating some empty comments that are triggering a bug in the Scrapy library. 'https://github.com/scrapy/scrapy/issues/4477'
Di-ref
2020-07-26 23:24:01 +0200
-
8897b58831
Merge 93e19b6582 into bda7d6a7da
padraic
2020-02-21 13:06:55 +0800
-
93e19b6582
Merge pull request #1 from jaewonha/master
padraic
2020-02-18 12:10:01 +0000
-
5fefe2fe28
Update README.md
padraic
2020-02-15 15:44:51 +0000
-
7be4015f13
likers - press the more button and collect all reactions (currently includes angry reactions)
jaewnoha
2020-02-12 02:57:00 +0900
-
9a403c9f86
Merge d7ba4bc0b2 into bda7d6a7da
laols574
2020-02-04 15:33:24 -0700
-
d7ba4bc0b2
Update fbcrawl.py
laols574
2020-02-04 15:15:42 -0700
-
25cf067cc2
Update comments.py
laols574
2020-02-04 15:11:02 -0700
-
9f0149f135
Update items.py
laols574
2020-02-04 14:45:17 -0700
-
6ba563f5ae
Merge 03e65d5fcc into bda7d6a7da
Eliezer Pearl
2019-07-19 17:56:21 +0000
-
bda7d6a7da
Update README.md
master
Rugantio Costa
2019-07-19 04:12:11 +0200
-
03e65d5fcc
add going, interested, shared fields to eventsitem
Eliezer Pearl
2019-07-18 12:30:51 -0400
-
388da75075
Add going, interested, and shared counts
Eliezer Pearl
2019-07-18 12:28:44 -0400
-
be5d55b6f2
added group schema and improve text
theblackcat102
2019-07-12 03:40:42 +0800
-
14e15587d6
update readme with setup
theblackcat102
2019-07-10 16:44:26 +0800
-
6729c8b220
random shuffle multiple groups
theblackcat102
2019-07-09 12:20:28 +0800
-
d80501751a
use original agent
theblackcat102
2019-07-09 11:53:54 +0800
-
6e13e8ac3e
fix missing imports
theblackcat102
2019-07-08 20:22:35 +0800
-
8182269bd8
fix missing imports
theblackcat102
2019-07-08 19:52:23 +0800
-
ededfe261e
support for multi urls
theblackcat102
2019-07-08 19:49:26 +0800
-
b983c81bbc
remove sensitive details from cmd params
theblackcat102
2019-07-07 13:31:59 +0800
-
bd2282f862
add missing requirements
theblackcat102
2019-07-07 12:45:12 +0800
-
2549bdb081
finish testing postgres pipeline
theblackcat102
2019-07-07 12:43:29 +0800
-
9c46515d99
update dateformat issue
theblackcat102
2019-07-07 01:49:15 +0800
-
830265d5cd
Merge pull request #40 from StefanYohansson/feat/events-crawler
Rugantio Costa
2019-07-05 02:11:04 +0200
-
c31bf61b94
fix runner path
StefanYohansson
2019-07-04 20:43:00 -0300
-
ef352b6030
facebook runner
StefanYohansson
2019-07-04 18:18:44 -0300
-
5c3128cfef
Adding events crawler
StefanYohansson
2019-06-29 22:01:04 -0300
-
be8a9c2f5f
Adding profile crawler
rugantio
2019-05-18 19:50:46 +0200
-
0cda010e64
Adding profile crawler
rugantio
2019-05-18 19:50:33 +0200
-
cdde9d198f
Pr/10 (#12)
RZK
2019-05-13 14:52:18 +0300
-
dad1a19435
Merge branch 'master' of https://github.com/rugantio/fbcrawl
rugantio
2019-05-11 15:23:54 +0200
-
d875e89c52
Adding features: crawl comments from page, crawl posts and comments from groups
rugantio
2019-05-11 15:22:56 +0200
-
f97f385403
Merge branch 'master' into AssetsConsts
RZK
2019-05-07 13:50:56 +0300
-
8bca3f25dd
Merging changes related to source (#7)
RZK
2019-05-07 13:14:35 +0300
-
a0e90aa467
Merge pull request #25 from JoshuaKissoon/master
Rugantio Costa
2019-05-02 13:20:13 +0200
-
f4900d6d29
Fixed an issue where "Shar" was being returned as the number of comments on posts without comments
Joshua Kissoon
2019-05-01 23:10:49 -0400
-
a4e73c3a31
Merge pull request #4 from ahmad-rzk/pr/2
RZK
2019-05-01 10:51:50 +0300
-
5a04bf589f
Merge pull request #3 from ahmad-rzk/pr/2
RZK
2019-05-01 10:46:26 +0300
-
4ea79f71ec
Finalizing PR2
Ahmad Rzk
2019-05-01 10:44:53 +0300
-
0085299026
Merge branch 'master' into pr/2
Ahmad Rzk
2019-05-01 10:08:11 +0300
-
213da30a00
Merge pull request #1 from ahmad-rzk/output_to_TCP_MQ
RZK
2019-05-01 09:12:20 +0300
-
f6e8545236
[fb] restored reaction count "it"; cleanup old parse_date
rugantio
2019-05-01 04:17:29 +0200
-
57060d7d71
[fb] restored reaction count "it"; cleanup old parse_date
rugantio
2019-05-01 04:16:23 +0200
-
85d9b1af42
fix for issue #22 and #24, fix timestamp conversion, cleanup
rugantio
2019-05-01 02:45:17 +0200
-
24268374ea
Bases of ZMQ integration
Ahmad Rzk
2019-04-30 17:20:35 +0300
-
a117cae507
[fbcrawl] fixing date attribute parsing
rugantio
2019-04-29 18:53:42 +0200
-
ea431c029c
[fbcrawl] fixing date attribute parsing
rugantio
2019-04-29 18:53:09 +0200
-
c1483fb58d
Update comments.py
Ahmad Rzk
2019-04-28 15:26:14 +0300
-
88553b60a8
Update fbcrawl.py
Ahmad Rzk
2019-04-28 15:22:19 +0300
-
0ba544831c
Relying on const assets files instead of hardcoded xpaths
Ahmad Rzk
2019-04-28 15:16:50 +0300
-
55dc799374
blocking mitigation
rugantio
2019-04-25 23:41:33 +0200
-
4a379f3af4
added post_id column
rugantio
2019-04-24 17:26:53 +0200
-
8baa108aab
removing date pipeline
rugantio
2019-04-23 08:22:48 +0200
-
c394575137
correct README for the new "date" attribute
rugantio
2019-04-23 07:33:37 +0200
-
efda9a956e
[fb] in items.py refactoring parse_date, introducing "date" attribute
rugantio
2019-04-23 07:31:23 +0200
-
1acf5c2106
[comments.py] Added new source_url column
rugantio
2019-04-23 04:18:44 +0200
-
3d32ab6054
[comments.py] Added new source_url column
rugantio
2019-04-23 04:00:22 +0200
-
462cb0eff1
[comments.py] Added support for groups
rugantio
2019-04-23 03:41:52 +0200
-
2d404a7667
docs for new spider
rugantio
2019-02-18 18:51:52 +0100
-
dc1d0f29c0
refactoring comments spider
rugantio
2019-02-18 07:18:34 +0100
-
069f64f61e
refactoring comments spider
rugantio
2019-02-18 05:09:21 +0100
-
f0cf9599e1
refactoring comments spider
rugantio
2019-02-18 05:08:42 +0100
-
bd41255361
refactoring comments spider
rugantio
2019-02-18 05:07:38 +0100
-
96d3423b8d
Merge branch 'master' of https://github.com/rugantio/fbcrawl
rugantio
2019-02-18 02:14:01 +0100
-
b3d12c4e6b
refactoring comments spider
rugantio
2019-02-18 02:12:52 +0100
-
98642cffd8
Update README.md
Rugantio Costa
2019-02-05 04:49:57 +0100
-
bdeae9f4b5
parse_page refactoring complete
rugantio
2019-02-05 03:48:00 +0100
-
71f80356dc
fixed attribute parsing
rugantio
2019-02-04 20:27:26 +0100
-
811f4e396d
fixed attribute parsing
rugantio
2019-02-04 20:25:54 +0100
-
d28d214993
fixed recursion on pages
rugantio
2019-02-04 19:27:44 +0100
-
dafd01c8bd
fixed recursion on pages
rugantio
2019-02-04 19:26:00 +0100
-
918cd9ce64
added new features, simplified presentation
rugantio
2019-01-31 07:28:08 +0100
-
a9982865d9
improved support for languages en, es, fr, it, pt
rugantio
2019-01-31 06:54:31 +0100
-
fb32a4213e
added experimental support for languages en, es, fr, it, pt
rugantio
2019-01-30 20:34:25 +0100
-
9de51e0ce8
steady recursion implemented
rugantio
2019-01-30 17:30:18 +0100
-
b24fc61dbb
steady recursion implemented
rugantio
2019-01-30 17:21:43 +0100
-
eaaa2a32e3
cleaning up
rugantio
2019-01-29 22:27:55 +0100
-
17430a06a9
fixed datetime parser
rugantio
2019-01-29 22:26:01 +0100
-
b06883dc3c
fixed gitignore
rugantio
2019-01-29 22:08:37 +0100
-
8d343b057c
gitignore added
rugantio
2019-01-29 21:57:48 +0100
-
b8d8444b3f
fix xpath in comment crawler
rugantio
2019-01-26 02:52:26 +0100
-
80a10f176f
fix xpath in comment crawler
rugantio
2019-01-26 02:51:18 +0100
-
0c0f3129cd
Update README.md
Rugantio Costa
2018-12-27 02:20:46 +0100
-
e95c70c844
Update README.md
Rugantio Costa
2018-12-27 01:47:29 +0100
-
c04509499f
changed user-agent and fixed date parsers in items.py
rugantio
2018-12-13 05:33:09 +0100
-
30c04c2fca
changed user-agent and fixed date parsers in items.py
rugantio
2018-12-13 05:31:18 +0100
-
168bc2c510
disabling pipeline
rugantio
2018-11-22 21:22:18 +0100
-
a153b1fa7a
Update README.md
Rugantio Costa
2018-11-18 00:46:28 +0100
-
fdff7b826c
Update README.md
Rugantio Costa
2018-10-22 00:35:56 +0200
-
30ba7dd125
new spider is introduced
Rugantio Costa
2018-10-22 00:33:20 +0200
-
2a4ac3a5e2
new spider for comments
rugantio
2018-10-22 00:24:57 +0200
-
728516617f
Update README.md
Rugantio Costa
2018-09-19 01:39:20 +0200
-
0477d9a780
final
rugantio
2018-08-27 02:23:57 +0200
-
ae2b0698f6
final
rugantio
2018-08-27 02:23:25 +0200
-
cf6313e4b1
final
rugantio
2018-08-27 02:21:51 +0200
-
8359748b81
final
rugantio
2018-08-26 17:09:16 +0200
-
a26cbd969c
final
rugantio
2018-08-26 15:43:06 +0200
-
1f1d6775af
final
rugantio
2018-08-26 15:41:06 +0200
-
a6d39a4a7e
final
rugantio
2018-08-26 14:12:47 +0200