Commit Graph

  • cfe1092f21 fixing issue #62 by eliminating some empty comments that are triggering a bug in the Scrapy library. 'https://github.com/scrapy/scrapy/issues/4477' Di-ref 2020-07-26 23:24:01 +0200
  • 8897b58831 Merge 93e19b6582 into bda7d6a7da padraic 2020-02-21 13:06:55 +0800
  • 93e19b6582 Merge pull request #1 from jaewonha/master padraic 2020-02-18 12:10:01 +0000
  • 5fefe2fe28 Update README.md padraic 2020-02-15 15:44:51 +0000
  • 7be4015f13 likers - press the more button and collect all reactions (currently includes angry reactions) jaewnoha 2020-02-12 02:57:00 +0900
  • 9a403c9f86 Merge d7ba4bc0b2 into bda7d6a7da laols574 2020-02-04 15:33:24 -0700
  • d7ba4bc0b2 Update fbcrawl.py laols574 2020-02-04 15:15:42 -0700
  • 25cf067cc2 Update comments.py laols574 2020-02-04 15:11:02 -0700
  • 9f0149f135 Update items.py laols574 2020-02-04 14:45:17 -0700
  • 6ba563f5ae Merge 03e65d5fcc into bda7d6a7da Eliezer Pearl 2019-07-19 17:56:21 +0000
  • bda7d6a7da Update README.md master Rugantio Costa 2019-07-19 04:12:11 +0200
  • 03e65d5fcc add going, interested, shared fields to eventsitem Eliezer Pearl 2019-07-18 12:30:51 -0400
  • 388da75075 Add going, interested, and shared counts Eliezer Pearl 2019-07-18 12:28:44 -0400
  • be5d55b6f2 added group schema and improve text theblackcat102 2019-07-12 03:40:42 +0800
  • 14e15587d6 update readme with setup theblackcat102 2019-07-10 16:44:26 +0800
  • 6729c8b220 random shuffle multiple groups theblackcat102 2019-07-09 12:20:28 +0800
  • d80501751a use original agent theblackcat102 2019-07-09 11:53:54 +0800
  • 6e13e8ac3e fix missing imports theblackcat102 2019-07-08 20:22:35 +0800
  • 8182269bd8 fix missing imports theblackcat102 2019-07-08 19:52:23 +0800
  • ededfe261e support for multi urls theblackcat102 2019-07-08 19:49:26 +0800
  • b983c81bbc remove sensitive details from cmd params theblackcat102 2019-07-07 13:31:59 +0800
  • bd2282f862 add missing requirements theblackcat102 2019-07-07 12:45:12 +0800
  • 2549bdb081 finish testing postgres pipeline theblackcat102 2019-07-07 12:43:29 +0800
  • 9c46515d99 update dateformat issue theblackcat102 2019-07-07 01:49:15 +0800
  • 830265d5cd Merge pull request #40 from StefanYohansson/feat/events-crawler Rugantio Costa 2019-07-05 02:11:04 +0200
  • c31bf61b94 fix runner path StefanYohansson 2019-07-04 20:43:00 -0300
  • ef352b6030 facebook runner StefanYohansson 2019-07-04 18:18:44 -0300
  • 5c3128cfef Adding events crawler StefanYohansson 2019-06-29 22:01:04 -0300
  • be8a9c2f5f Adding profile crawler rugantio 2019-05-18 19:50:46 +0200
  • 0cda010e64 Adding profile crawler rugantio 2019-05-18 19:50:33 +0200
  • cdde9d198f Pr/10 (#12) RZK 2019-05-13 14:52:18 +0300
  • dad1a19435 Merge branch 'master' of https://github.com/rugantio/fbcrawl rugantio 2019-05-11 15:23:54 +0200
  • d875e89c52 Adding features: crawl comments from page, crawl posts and comments from groups rugantio 2019-05-11 15:22:56 +0200
  • f97f385403 Merge branch 'master' into AssetsConsts RZK 2019-05-07 13:50:56 +0300
  • 8bca3f25dd Merging changes related to source (#7) RZK 2019-05-07 13:14:35 +0300
  • a0e90aa467 Merge pull request #25 from JoshuaKissoon/master Rugantio Costa 2019-05-02 13:20:13 +0200
  • f4900d6d29 Fixed an issue where "Shar" was being returned as the number of comments on posts without comments Joshua Kissoon 2019-05-01 23:10:49 -0400
  • a4e73c3a31 Merge pull request #4 from ahmad-rzk/pr/2 RZK 2019-05-01 10:51:50 +0300
  • 5a04bf589f Merge pull request #3 from ahmad-rzk/pr/2 RZK 2019-05-01 10:46:26 +0300
  • 4ea79f71ec Finalzing PR2 Ahmad Rzk 2019-05-01 10:44:53 +0300
  • 0085299026 Merge branch 'master' into pr/2 Ahmad Rzk 2019-05-01 10:08:11 +0300
  • 213da30a00 Merge pull request #1 from ahmad-rzk/output_to_TCP_MQ RZK 2019-05-01 09:12:20 +0300
  • f6e8545236 [fb] restored reaction count "it"; cleanup old parse_date rugantio 2019-05-01 04:17:29 +0200
  • 57060d7d71 [fb] restored reaction count "it"; cleanup old parse_date rugantio 2019-05-01 04:16:23 +0200
  • 85d9b1af42 fix for issue #22 and #24, fix timestamp conversion, cleanup rugantio 2019-05-01 02:45:17 +0200
  • 24268374ea Basees of ZMQ integration Ahmad Rzk 2019-04-30 17:20:35 +0300
  • a117cae507 [fbcrawl] fixing date attribute parsing rugantio 2019-04-29 18:53:42 +0200
  • ea431c029c [fbcrawl] fixing date attribute parsing rugantio 2019-04-29 18:53:09 +0200
  • c1483fb58d Update comments.py Ahmad Rzk 2019-04-28 15:26:14 +0300
  • 88553b60a8 Update fbcrawl.py Ahmad Rzk 2019-04-28 15:22:19 +0300
  • 0ba544831c Relying on const assets files instead of hardcoded xpaths Ahmad Rzk 2019-04-28 15:16:50 +0300
  • 55dc799374 blocking mitigation rugantio 2019-04-25 23:41:33 +0200
  • 4a379f3af4 added post_id column rugantio 2019-04-24 17:26:53 +0200
  • 8baa108aab removing date pipeline rugantio 2019-04-23 08:22:48 +0200
  • c394575137 correct README for the new "date" attribute rugantio 2019-04-23 07:33:37 +0200
  • efda9a956e [fb] in items.py refactoring parse_date, introducing "date" attribute rugantio 2019-04-23 07:31:23 +0200
  • 1acf5c2106 [comments.py] Added new source_url column rugantio 2019-04-23 04:18:44 +0200
  • 3d32ab6054 [comments.py] Added new source_url column rugantio 2019-04-23 04:00:22 +0200
  • 462cb0eff1 [comments.py] Added support for groups rugantio 2019-04-23 03:41:52 +0200
  • 2d404a7667 docs for new spider rugantio 2019-02-18 18:51:52 +0100
  • dc1d0f29c0 refactoring comments spider rugantio 2019-02-18 07:18:34 +0100
  • 069f64f61e refactoring comments spider rugantio 2019-02-18 05:09:21 +0100
  • f0cf9599e1 refactoring comments spider rugantio 2019-02-18 05:08:42 +0100
  • bd41255361 refactoring comments spider rugantio 2019-02-18 05:07:38 +0100
  • 96d3423b8d Merge branch 'master' of https://github.com/rugantio/fbcrawl rugantio 2019-02-18 02:14:01 +0100
  • b3d12c4e6b refactoring comments spider rugantio 2019-02-18 02:12:52 +0100
  • 98642cffd8 Update README.md Rugantio Costa 2019-02-05 04:49:57 +0100
  • bdeae9f4b5 parse_page refactoring complete rugantio 2019-02-05 03:48:00 +0100
  • 71f80356dc fixed attribute parsing rugantio 2019-02-04 20:27:26 +0100
  • 811f4e396d fixed attribute parsing rugantio 2019-02-04 20:25:54 +0100
  • d28d214993 fixed recursion on pages rugantio 2019-02-04 19:27:44 +0100
  • dafd01c8bd fixed recursion on pages rugantio 2019-02-04 19:26:00 +0100
  • 918cd9ce64 added new features, simplified presentation rugantio 2019-01-31 07:28:08 +0100
  • a9982865d9 improved support for languages en, es, fr, it, pt rugantio 2019-01-31 06:54:31 +0100
  • fb32a4213e added experimental support for languages en, es, fr, it, pt rugantio 2019-01-30 20:34:25 +0100
  • 9de51e0ce8 steady recursion implemented rugantio 2019-01-30 17:30:18 +0100
  • b24fc61dbb steady recursion implemented rugantio 2019-01-30 17:21:43 +0100
  • eaaa2a32e3 cleaning up rugantio 2019-01-29 22:27:55 +0100
  • 17430a06a9 fixed datetime parser rugantio 2019-01-29 22:26:01 +0100
  • b06883dc3c fixed gitignore rugantio 2019-01-29 22:08:37 +0100
  • 8d343b057c gitignore added rugantio 2019-01-29 21:57:48 +0100
  • b8d8444b3f fix xpath in comment crawler rugantio 2019-01-26 02:52:26 +0100
  • 80a10f176f fix xpath in comment crawler rugantio 2019-01-26 02:51:18 +0100
  • 0c0f3129cd Update README.md Rugantio Costa 2018-12-27 02:20:46 +0100
  • e95c70c844 Update README.md Rugantio Costa 2018-12-27 01:47:29 +0100
  • c04509499f changed user-agent and fixed date parsers in items.py rugantio 2018-12-13 05:33:09 +0100
  • 30c04c2fca changed user-agent and fixed date parsers in items.py rugantio 2018-12-13 05:31:18 +0100
  • 168bc2c510 disabling pipeline rugantio 2018-11-22 21:22:18 +0100
  • a153b1fa7a Update README.md Rugantio Costa 2018-11-18 00:46:28 +0100
  • fdff7b826c Update README.md Rugantio Costa 2018-10-22 00:35:56 +0200
  • 30ba7dd125 new spider is introduced Rugantio Costa 2018-10-22 00:33:20 +0200
  • 2a4ac3a5e2 new spider for comments rugantio 2018-10-22 00:24:57 +0200
  • 728516617f Update README.md Rugantio Costa 2018-09-19 01:39:20 +0200
  • 0477d9a780 final rugantio 2018-08-27 02:23:57 +0200
  • ae2b0698f6 final rugantio 2018-08-27 02:23:25 +0200
  • cf6313e4b1 final rugantio 2018-08-27 02:21:51 +0200
  • 8359748b81 final rugantio 2018-08-26 17:09:16 +0200
  • a26cbd969c final rugantio 2018-08-26 15:43:06 +0200
  • 1f1d6775af final rugantio 2018-08-26 15:41:06 +0200
  • a6d39a4a7e final rugantio 2018-08-26 14:12:47 +0200