Downloads and archives content from reddit

archive downloader gfycat imgur python reddit scraper

Ali Parlakci c7b7361ded Update version		2018-07-25 18:27:58 +03:00
docs	Added verbose mode	2018-07-25 13:40:06 +03:00
src	Stylize	2018-07-25 13:48:30 +03:00
.gitignore	Excludes build folders	2018-07-13 14:11:41 +03:00
LICENSE	Initial commit	2018-07-09 22:58:11 +03:00
README.md	Update changelog	2018-07-25 13:53:33 +03:00
requirements.txt	Added requirements.txt	2018-07-11 23:55:03 +03:00
script.py	Update version	2018-07-25 18:27:58 +03:00
setup.py	Added .exe to executable's extension	2018-07-13 14:12:17 +03:00

README.md

Bulk Downloader for Reddit

This program downloads imgur, gfycat and direct image and video links of saved posts from a reddit account. It is written in Python 3.

PLEASE post any issue you have with the script to the Issues tab. Since I don't have any testers or contributors, I need your feedback.

What it can do

Can get posts from: frontpage, subreddits, multireddits, redditor's submissions, upvoted and saved posts; search results or just plain reddit links
Sorts posts by hot, top, new and so on
Downloads REDDIT images and videos, IMGUR images and albums, GFYCAT links, EROME images and albums, SELF POSTS and any link to a DIRECT IMAGE
Skips the existing ones
Puts post title and OP's name in file's name
Puts every post to its subreddit's folder
Saves a reusable copy of the details of the posts that are found so that they can be downloaded again
Logs failed ones in a file so that you can try to download them later
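
The last few items (one folder per subreddit, skipping existing files, logging failures) can be sketched roughly as below. This is only an illustration and not the script's actual code: the function names, the `fetch` callable and the `failed.txt` log name are all assumptions made for the example.

```python
from pathlib import Path


def target_path(base_dir, subreddit, filename):
    # Each post goes into a folder named after its subreddit.
    return Path(base_dir) / subreddit / filename


def download_all(posts, base_dir, fetch, failed_log="failed.txt"):
    """Download posts given as (subreddit, filename, url) tuples.

    `fetch` is a hypothetical callable that takes a URL and returns bytes.
    Returns three lists: downloaded paths, skipped paths and failures.
    """
    downloaded, skipped, failed = [], [], []
    for subreddit, filename, url in posts:
        target = target_path(base_dir, subreddit, filename)
        if target.exists():
            # Skip files that were already downloaded.
            skipped.append(target)
            continue
        try:
            data = fetch(url)
        except Exception as error:
            # Remember the failure so it can be retried later.
            failed.append((url, str(error)))
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        downloaded.append(target)

    if failed:
        # Append failures to a log file (the name is an assumption).
        log = Path(base_dir) / failed_log
        log.parent.mkdir(parents=True, exist_ok=True)
        with log.open("a", encoding="utf-8") as handle:
            for url, reason in failed:
                handle.write(f"{url}\t{reason}\n")
    return downloaded, skipped, failed
```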

Download the latest release

How it works

For Windows and Linux users, there are executable files that run without installing any third-party program. If you are paranoid like me, you can compile it from the source code instead.
- In Windows, double-click the bulk-downloader-for-reddit file
- In Linux, extract the files to a folder, open a terminal inside it and type ./bulk-downloader-for-reddit
MacOS users have to compile it from source code.

The script also accepts command-line arguments. Run it with --help for further information.

Setting up the script

Because this is not a commercial app, you need to create an imgur developer app for the API to work.

Creating an imgur app

Go to https://api.imgur.com/oauth2/addclient
Enter a name into the Application Name field.
Pick Anonymous usage without user authorization as an Authorization type*
Enter your email into the Email field.
Complete the CAPTCHA
Click the submit button

It should redirect you to a page that shows your imgur_client_id and imgur_client_secret.

* If it says Authorization callback URL: required, first select OAuth 2 authorization without a callback URL, then select Anonymous usage without user authorization.
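
For context, anonymous imgur API requests identify the app by sending the client ID in an `Authorization: Client-ID ...` header. The sketch below only builds such a request and does not send it. The function name is made up for this example, and it is not necessarily how the script itself talks to imgur.

```python
import urllib.request


def build_imgur_request(image_id, client_id):
    """Build (but do not send) an anonymous imgur API request for one image."""
    url = f"https://api.imgur.com/3/image/{image_id}"
    # Anonymous usage: the client ID goes into the Authorization header.
    headers = {"Authorization": f"Client-ID {client_id}"}
    return urllib.request.Request(url, headers=headers)
```

Passing the returned object to `urllib.request.urlopen` would perform the actual call, which needs a valid imgur_client_id.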

FAQ

What do the dots represent when getting posts?

Each dot means that 100 posts are scanned.
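
A minimal sketch of that kind of progress output, assuming a plain counter over the scanned posts (`scan_posts` and its parameters are hypothetical names, not taken from the script):

```python
import sys


def scan_posts(posts, out=sys.stdout, step=100):
    """Iterate over posts and write one dot for every `step` posts scanned."""
    count = 0
    for count, _post in enumerate(posts, start=1):
        if count % step == 0:
            out.write(".")
            out.flush()  # show the dot immediately instead of buffering it
    return count
```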

Getting posts is taking too long.

You can press Ctrl+C to interrupt it and start downloading.

How are downloaded files' names formatted?

Self posts and images that do not belong to an album are formatted as [SUBMITTER NAME]_[POST TITLE]_[REDDIT ID]. You can use the reddit id to go to the post's reddit page by going to reddit.com/[REDDIT ID].
An image in an imgur album is formatted as [ITEM NUMBER]_[IMAGE TITLE]_[IMGUR ID]. Similarly, you can use the imgur id to go to the image's imgur page by going to imgur.com/[IMGUR ID].
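
The two naming patterns can be sketched as follows. The helper names are hypothetical, and replacing characters that are invalid in file names is an assumption added for the example. The README does not say how the script handles them.

```python
import re


def safe(text):
    """Replace characters that are not allowed in file names (an assumption)."""
    return re.sub(r'[\\/:*?"<>|]', "_", text).strip()


def post_filename(submitter, title, reddit_id):
    # [SUBMITTER NAME]_[POST TITLE]_[REDDIT ID]
    return f"{safe(submitter)}_{safe(title)}_{reddit_id}"


def album_item_filename(number, title, imgur_id):
    # [ITEM NUMBER]_[IMAGE TITLE]_[IMGUR ID]
    return f"{number}_{safe(title)}_{imgur_id}"


def post_url(filename):
    """Recover the reddit link from a file name without its extension."""
    reddit_id = filename.rsplit("_", 1)[-1]
    return "https://reddit.com/" + reddit_id
```

Because the id is always the last underscore-separated part, splitting from the right recovers it even when the title itself contains underscores.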

How do I open self post files?

Reddit stores self posts as Markdown, so the script downloads them as they are in order to keep their styling. There is a great Chrome extension for viewing Markdown files with their styling. Install it and open the files with Chrome.

They are also plain text files, so you can view them with any text editor, such as Notepad on Windows, gedit on Linux or TextEdit on MacOS.

How can I change my credentials?

All of the user data is held in the config.json file, which is in a folder named "Bulk Downloader for Reddit" in your Home directory. You can edit it there.
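
If you prefer to change a value programmatically, something like the sketch below should work. It assumes config.json is a flat JSON object and that the keys are named imgur_client_id and imgur_client_secret, as on the imgur page. Check your own file for the actual key names before relying on it.

```python
import json
from pathlib import Path


def config_path(home=None):
    """Return the expected location of config.json under the Home directory."""
    home = Path(home) if home is not None else Path.home()
    return home / "Bulk Downloader for Reddit" / "config.json"


def update_config(path, **changes):
    """Load config.json (if it exists), apply the changes and write it back."""
    path = Path(path)
    data = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    data.update(changes)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=4), encoding="utf-8")
    return data
```

For example, `update_config(config_path(), imgur_client_id="your-new-id")` would replace only that one value and leave the rest of the file untouched.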