Commit Graph

138 Commits

Author SHA1 Message Date
Serene-Arc
c5c010bce0 Rename option 2021-06-12 10:35:31 +10:00
Serene-Arc
6eeadc8821 Add option for archiver full context 2021-06-11 15:31:11 +10:00
Serene-Arc
8ba2d0bb55 Add missing return statement 2021-06-10 18:59:22 +10:00
Serene-Arc
8be3efb6e4 Fix bug with Imgur gifs being shortened too much
The rstrip function was used incorrectly: it does not remove a substring but
rather strips any of the characters provided, so here it removed any I,
G, V, or F at the end of the six-character Imgur ID, resulting in a
404 error for the resources in question.
2021-06-08 13:08:39 +10:00
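The rstrip pitfall described in the commit message above can be reproduced with a minimal sketch. The link, the ID "abcdfg", the ".gifv" argument, and the helper name strip_suffix are all assumptions made for illustration; the project's actual code is not shown in this log.

```python
def strip_suffix(text: str, suffix: str) -> str:
    """Remove `suffix` from the end of `text` only if it is present as a whole."""
    if suffix and text.endswith(suffix):
        return text[: -len(suffix)]
    return text


# Hypothetical Imgur-style link; the ID "abcdfg" is made up for illustration.
link = "https://imgur.com/abcdfg.gifv"

# rstrip treats its argument as a set of characters, so it keeps removing
# trailing '.', 'g', 'i', 'f' and 'v' characters and eats into the ID itself.
broken = link.rstrip(".gifv")

# Suffix removal drops only the exact trailing substring and leaves the ID intact.
fixed = strip_suffix(link, ".gifv")

print(broken)
print(fixed)
```

On Python 3.9 and later, the built-in str.removesuffix does the same job as the hypothetical strip_suffix helper.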
Serene
6dcef83666
Add ability to disable modules (#434)
* Fix test name to match standard

* Rename file

* Add ability to disable modules

* Update README

* Fix missing comma

* Fix more missing commas. sigh...

Co-authored-by: Ali Parlakçı <parlakciali@gmail.com>
2021-06-06 13:47:56 +03:00
Serene
434aeb8feb
Add a combined command for the archiver and downloader: clone (#433)
* Simplify downloader function

* Add basic scraper class

* Add "scrape" command

* Rename "scrape" command to "clone"

* Add integration tests for clone command

* Update README

* Fix failing test
2021-06-06 13:29:09 +03:00
Ali Parlakçı
a2f010c40d
Merge pull request #432 from Serene-Arc/enhancement_429
Allow --user to be specified multiple times
2021-06-06 13:25:00 +03:00
Ali Parlakçı
6839c65bd6
Merge pull request #396 from Serene-Arc/bug_fix_385
Add path limit check
2021-06-06 13:24:22 +03:00
Serene-Arc
79fba4ac4a Fix indent 2021-05-31 13:42:41 +10:00
Serene-Arc
9a1e1ebea1 Add path limit fix 2021-05-27 15:27:02 +10:00
Serene-Arc
6b78a23484 Allow --user to be specified multiple times 2021-05-27 15:22:58 +10:00
Serene-Arc
fef2fc864b Update blacklist 2021-05-25 19:33:32 +10:00
Serene-Arc
87959028e5 Add blacklist for web filetypes 2021-05-25 19:20:51 +10:00
Serene-Arc
f47688812d Rename function 2021-05-25 18:51:24 +10:00
Serene-Arc
323b2d2b03 Fix download retries logic 2021-05-25 09:56:22 +10:00
Serene-Arc
e2582ecb3e Catch error with MacOS writing per issue #407 2021-05-23 12:17:14 +10:00
Serene-Arc
47a4951279 Rename variable 2021-05-23 12:13:44 +10:00
Serene-Arc
4395dd4646 Update logging messages to include submission IDs 2021-05-22 11:53:44 +10:00
Serene-Arc
a104a154fc Simplify method structure 2021-05-22 11:53:44 +10:00
Ali Parlakci
da8c64ec51 Read files in chunks instead when hashing (#416) 2021-05-22 08:46:39 +10:00
Ali Parlakci
cf6905db28 Reverts #384 2021-05-21 00:22:44 +03:00
Ali Parlakçı
bfa6e4da5a
Merge pull request #409 from Ailothaen/master
Adding info to threads and comments: distinguished, spoiler, pinned, locked
2021-05-21 00:04:53 +03:00
Ailothaen
827f1ab80e Adding some more info in threads and comments: distinguished, spoiler, locked, sticky 2021-05-20 18:26:50 +02:00
Serene-Arc
830e4f2830 Catch additional error 2021-05-19 10:09:46 +10:00
Serene-Arc
3b28ad24b3 Fix bug with some Imgur extensions 2021-05-19 09:57:27 +10:00
Ali Parlakci
7c401b1461 Merge branch 'reddit_connector_refactor' of https://github.com/Serene-Arc/bulk-downloader-for-reddit into Serene-Arc-reddit_connector_refactor 2021-05-17 13:53:48 +03:00
Serene
c581bef790 Set file creation times to the post creation time (#391) 2021-05-17 13:49:35 +03:00
Serene-Arc
7016603763 Refactor out super class RedditConnector 2021-05-17 11:50:17 +10:00
Ali Parlakci
200916a150 Rename --exclude-id(-file) to --skip-id(-file) 2021-05-17 10:30:55 +10:00
Ali Parlakci
f768a7d61c Rename --skip to --skip-format 2021-05-17 10:30:55 +10:00
alpbetgam
ef37712115 Fix error with old gfycat/redgifs urls 2021-05-16 19:15:36 +10:00
Ali Parlakci
c7a5ec4376 bug(youtube.dl): Fix crash on zero downloads #375 2021-05-16 11:06:13 +10:00
BlipRanger
fca3184950 Bind socket to '0.0.0.0' rather than 'localhost' to allow for more flexible OAuth connection. (#368) 2021-05-12 17:47:33 +03:00
Serene-Arc
7e70175e4c Change logging message to include submission ID 2021-05-10 19:03:20 +10:00
Ali Parlakçı
a2e22e894a
Fix xml archiver encoding bug (#349)
* test_integration: add archiver tests

* archiver.py: fix encoding bug in xml archiver
2021-05-06 16:11:48 +03:00
Ali Parlakçı
283ad164e5 __main__.py: fix typo in -f argument 2021-05-06 12:52:45 +03:00
Serene-Arc
f6d89097f8 Consolidate exception block 2021-05-06 10:43:56 +10:00
Ali Parlakci
e642ad68d4 youtubedl_fallback.py: add a fallback exception and log messages 2021-05-05 16:56:34 +03:00
Ali Parlakci
00defe3b87 youtubedl_fallback: remove logging the expected exception 2021-05-05 16:35:03 +03:00
Serene-Arc
c9cde54a72 Remove VReddit downloader module 2021-05-03 14:10:54 +10:00
Serene-Arc
ab96a3ba97 Remove Streamable downloader module 2021-05-03 14:10:54 +10:00
Serene-Arc
fba70dcf18 Intercept youtube-dl output 2021-05-03 14:10:54 +10:00
Serene-Arc
a8c2136270 Add fallback downloader 2021-05-03 14:10:54 +10:00
Serene-Arc
afa3e2548f Add customisable time formatting 2021-05-03 14:05:05 +10:00
Serene-Arc
eda12e5274 Make downloadfilter apply itself to Resources 2021-05-03 14:02:03 +10:00
Serene-Arc
711f8b0c76 Add exception for r/all in subreddit check 2021-05-02 14:00:23 +10:00
Serene-Arc
14195157de Catch errors for banned or private subreddits 2021-05-01 13:36:38 +10:00
Daniel Clowry
fe95394b3b Match import order, update docs 2021-04-30 09:22:26 +10:00
Daniel Clowry
e6d2980db3 Add Streamable to download factory 2021-04-30 09:22:26 +10:00
Daniel Clowry
2c54cd740a Add Streamable downloader 2021-04-30 09:22:26 +10:00
Serene-Arc
39935c58d9 Remove GifDeliveryNetwork module 2021-04-28 19:08:31 +10:00
Serene-Arc
9931839d14 Remove gifdeliverynetwork from download factory 2021-04-28 19:08:31 +10:00
Serene-Arc
760e59e1f7 Invert inheritance direction 2021-04-28 19:08:31 +10:00
Serene-Arc
3c6e9f6ccf Refactor class 2021-04-28 19:08:31 +10:00
Ali Parlakci
e1a4ac063c (bug) redgifs: fix could not read page source 2021-04-28 19:08:31 +10:00
Serene-Arc
7fcbf623a0 Catch additional errors in site downloaders 2021-04-28 15:20:02 +10:00
Serene-Arc
6a20548269 Catch additional error 2021-04-28 12:03:28 +10:00
Serene-Arc
17499baf61 Add informative error when testing user existence 2021-04-28 12:03:28 +10:00
Serene-Arc
e6551bb797 Return banned users as not existing 2021-04-28 12:03:28 +10:00
Serene-Arc
db46676dec Catch error when logfile accessed concurrently 2021-04-28 12:03:28 +10:00
Serene-Arc
cb41d4749a Add option to specify logfile location 2021-04-28 12:03:28 +10:00
Serene-Arc
a28c2d3c73 Add missing default argument 2021-04-28 12:03:28 +10:00
Serene-Arc
f5d11107a7 Remove unused imports 2021-04-28 12:03:28 +10:00
Serene-Arc
8cdf926211 Rename function 2021-04-28 12:03:28 +10:00
Serene-Arc
7438543f49 Remove unused variable 2021-04-28 12:03:28 +10:00
Serene-Arc
ca495a6677 Add missing typing declaration 2021-04-28 12:03:28 +10:00
Serene-Arc
214c883a10 Simplify regex string slightly 2021-04-28 12:03:28 +10:00
Serene-Arc
6767777944 Catch requests errors in site downloaders 2021-04-28 12:03:28 +10:00
Serene-Arc
d960bc0b7b Use ISO format for timestamps in names 2021-04-28 12:03:28 +10:00
Ali Parlakçı
2eab4052c5
Fix GifDeliveryNetwork link algorithm (#298)
* Catch additional error when parsing site

* Fix GifDeliveryNetwork link algorithm

Co-authored-by: Serene-Arc <serenical@gmail.com>
2021-04-22 10:07:05 +03:00
Ali Parlakci
1c4cfbb580 tests: move outside of the package 2021-04-21 08:29:14 +10:00
Serene
b37ff0714f Fix time filters (#279) 2021-04-18 16:44:52 +03:00
Ali Parlakci
f483f24e15 test_integration.py: fix skipif test_config 2021-04-18 16:44:52 +03:00
Ali Parlakçı
5e81160e5f test_vreddit: remove flaky test (#272) 2021-04-18 16:44:52 +03:00
Ali Parlakci
8eb374eec6 test_vreddit: fix incorrect file hash 2021-04-18 16:44:52 +03:00
Serene
d8752b15fa Add option to skip specified subreddits (#268)
* Rename variables

* Add option to skip specific subreddits

* Update README
2021-04-18 16:44:52 +03:00
Serene-Arc
48dca9e5ee Fix mistaken backreference in some titles
This should resolve #267
2021-04-18 16:44:52 +03:00
Nathan Spicer-Davis
59ab5d8777 Update extension regex to match URI fragments (#264) 2021-04-18 16:44:52 +03:00
Serene-Arc
ab7a0f6a51 Catch errors when resources have no extension
This is related to #266 and will prevent the BDFR from
completely crashing when a file extension is unknown
2021-04-18 16:44:52 +03:00
Serene-Arc
e672e28a12 Fix typing on function 2021-04-18 16:44:52 +03:00
Serene-Arc
4b195f2b53 Remove unneeded logger entry 2021-04-18 16:44:52 +03:00
Serene
62dedb6c95 Fix bug with emojis in the filename (#263) 2021-04-18 16:44:52 +03:00
Serene-Arc
5758aad48b Fix formatting 2021-04-18 16:44:52 +03:00
Serene-Arc
77bdbbac63 Update test hash 2021-04-18 16:44:52 +03:00
Serene-Arc
7d71f8ffab Add regex for all 2xx HTTP codes 2021-04-18 16:44:52 +03:00
Serene-Arc
308853d531 Use standards in HTTP errors 2021-04-18 16:44:52 +03:00
Serene-Arc
e35dd9e5d0 Fix bug in file name formatter 2021-04-18 16:44:52 +03:00
Serene-Arc
bd9f276acc Rename module 2021-04-18 16:44:52 +03:00