
# Bulk Downloader for Reddit v2-beta

This is a tool to download submissions or submission data from Reddit. It can be used to archive data or even crawl Reddit to gather research data. The BDFR is flexible and can be used in scripts if needed through an extensive command-line interface.

Some quick reference commands are:

  - `python3 -m bulkredditdownloader download --subreddit Python -L 10`
  - `python3 -m bulkredditdownloader download --user me --saved --authenticate -L 25 --file-scheme '{POSTID}'`
  - `python3 -m bulkredditdownloader download --subreddit 'Python, all, mindustry' -L 10 --make-hard-links`
  - `python3 -m bulkredditdownloader archive --subreddit all --format yaml -L 500 --folder-scheme ''`

## Usage

The BDFR works by taking submissions from a variety of "sources" from Reddit and then parsing them to download. These sources might be a subreddit, multireddit, a user list, or individual links. These sources are combined and downloaded to disk, according to a naming and organisational scheme defined by the user.

There are two modes to the BDFR: download and archive. Each one has a command that performs similar but distinct functions. The `download` command will download the resource linked in the Reddit submission, such as images, video, etc. The `archive` command will download the submission data itself and store it, such as the submission details, upvotes, text, and statistics, as well as all the comments on that submission. These can then be saved in a data markup language form, such as JSON, XML, or YAML.

Many websites and links are supported for the downloader:

  - Direct links (links leading to a file)
  - Erome
  - Gfycat
  - Gif Delivery Network
  - Imgur
  - Reddit Galleries
  - Reddit Text Posts
  - Reddit Videos
  - Redgifs
  - YouTube

### Options

The following options are common between both the `archive` and `download` commands of the BDFR.

- `directory`
  - This is the directory to which the BDFR will download and place all files
- `--authenticate`
  - This flag will make the BDFR attempt to use an authenticated Reddit session
  - See [Authentication](#authentication) for more details
- `--config`
  - If the path to a configuration file is supplied with this option, the BDFR will use the specified config
  - See [Configuration Files](#configuration) for more details
- `--saved`
  - This option will make the BDFR use the supplied user's saved posts list as a download source
  - This requires an authenticated Reddit instance, using the `--authenticate` flag, as well as `--user` set to `me`
- `--search`
  - This will apply the specified search term to specific lists when scraping submissions
  - A search term can only be applied to subreddits and multireddits, supplied with the `-s` and `-m` flags respectively
- `--submitted`
  - This will use a user's submissions as a source
  - A user must be specified with `--user`
- `--upvoted`
  - This will use a user's upvoted posts as a source of posts to scrape
  - This requires an authenticated Reddit instance, using the `--authenticate` flag, as well as `--user` set to `me`
- `-L, --limit`
  - This is the limit on the number of submissions to retrieve
  - Note that this limit applies to **each source individually** e.g. if a `--limit` of 10 and three subreddits are provided, then 30 total submissions will be scraped
  - If it is not supplied, then the BDFR will default to the maximum allowed by Reddit, roughly 1000 posts. **We cannot bypass this.**
- `-S, --sort`
  - This is the sort type for each applicable submission source supplied to the BDFR
  - This option does not apply to upvoted or saved posts when scraping from these sources
  - The following options are available:
    - `controversial`
    - `hot` (default)
    - `new`
    - `relevance` (only available when using `--search`)
    - `rising`
    - `top`
- `-l, --link`
  - This is a direct link to a submission to download, either as a URL or an ID
  - Can be specified multiple times
- `-m, --multireddit`
  - This is the name of a multireddit to add as a source
  - Can be specified multiple times
    - This can be done by using `-m` multiple times
    - Multireddits can also be supplied as a comma-separated list e.g. `-m 'chess, favourites'`
  - The specified multireddits must all belong to the user specified with the `--user` option
- `-s, --subreddit`
  - This adds a subreddit as a source
  - Can be used multiple times
    - This can be done by using `-s` multiple times
    - Subreddits can also be supplied as a comma-separated list e.g. `-s 'all, python, mindustry'`
- `-t, --time`
  - This is the time filter that will be applied to all applicable sources
  - This option does not apply to upvoted or saved posts when scraping from these sources
  - The following options are available:
    - `all` (default)
    - `hour`
    - `day`
    - `week`
    - `month`
    - `year`
- `-u, --user`
  - This specifies the user to scrape in concert with other options
  - When using `--authenticate`, `--user me` can be used to refer to the authenticated user
- `-v, --verbose`
  - Increases the verbosity of the program
  - Can be specified multiple times
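
The way comma-separated sources and the per-source `--limit` interact can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not the BDFR's own code; `split_sources` and `max_total` are hypothetical helper names.

```python
def split_sources(values):
    """Flatten repeated and comma-separated source arguments into one list."""
    names = []
    for value in values:
        for name in value.split(","):
            name = name.strip()
            if name:
                names.append(name)
    return names


def max_total(sources, limit):
    """Upper bound on submissions scraped, since the limit applies per source."""
    return len(sources) * limit


# Roughly what `-s 'Python, all' -s mindustry -L 10` describes.
subreddits = split_sources(["Python, all", "mindustry"])
print(subreddits)                 # ['Python', 'all', 'mindustry']
print(max_total(subreddits, 10))  # 30
```

The total is an upper bound: a source with fewer submissions than the limit will simply return fewer.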

#### Downloader Options

The following options apply only to the `download` command. This command downloads the files and resources linked to in the submission, or a text submission itself, to the disk in the specified directory.

- `--exclude-id`
  - This will skip the download of any submission with the ID provided
  - Can be specified multiple times
- `--exclude-id-file`
  - This will skip the download of any submission with any of the IDs in the files provided
  - Can be specified multiple times
  - Format is one ID per line
- `--make-hard-links`
  - This flag will create hard links to an existing file when a duplicate is downloaded
  - This will make the file appear in multiple directories while only taking the space of a single instance
- `--no-dupes`
  - With this flag, the BDFR will not redownload files if they already exist somewhere in the root folder tree
  - Duplicates are detected by comparing MD5 hashes
- `--search-existing`
  - This will make the BDFR compile the hashes for every file in `directory` and store them to remove duplicates if `--no-dupes` is also supplied
- `--file-scheme`
  - Sets the scheme for files
  - Default is `{REDDITOR}_{TITLE}_{POSTID}`
  - See [Folder and File Name Schemes](#folder-and-file-name-schemes) for more details
- `--folder-scheme`
  - Sets the scheme for folders
  - Default is `{SUBREDDIT}`
  - See [Folder and File Name Schemes](#folder-and-file-name-schemes) for more details
- `--skip-domain`
  - This adds domains to the download filter i.e. submissions coming from these domains will not be downloaded
  - Can be specified multiple times
- `--skip`
  - This adds file types to the download filter i.e. submissions with one of the supplied file extensions will not be downloaded
  - Can be specified multiple times
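
The interplay of `--search-existing`, `--no-dupes`, and `--make-hard-links` can be sketched as follows. This is a minimal illustration of hash-based duplicate detection under stated assumptions, not the BDFR's implementation: `md5_of`, `index_existing`, and `store` are hypothetical helpers, and the file names are made up.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def md5_of(path):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_existing(root):
    """Hash every file already under root (the idea behind --search-existing)."""
    return {md5_of(path): path for path in Path(root).rglob("*") if path.is_file()}


def store(content, destination, seen, make_hard_links=False):
    """Write content unless its hash is already known (the idea behind --no-dupes)."""
    digest = hashlib.md5(content).hexdigest()
    if digest in seen:
        if make_hard_links:
            # A hard link is a second name for the same data on disk.
            os.link(seen[digest], destination)
            return "linked"
        return "skipped"
    destination.write_bytes(content)
    seen[digest] = destination
    return "written"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pics").mkdir()
    (root / "aww").mkdir()
    (root / "pics" / "cat.png").write_bytes(b"same bytes")
    seen = index_existing(root)
    print(store(b"same bytes", root / "aww" / "cat.png", seen, make_hard_links=True))
    print(store(b"other bytes", root / "aww" / "dog.png", seen))
```

Hard links only work within a single filesystem, so a linked duplicate must live on the same volume as the original.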

#### Archiver Options

The following options are for the `archive` command specifically.

- `--all-comments`
  - When combined with the `--user` option, this will download all the user's comments
- `-f, --format`
  - This specifies the format of the data file saved to disk
  - The following formats are available:
    - `json` (default)
    - `xml`
    - `yaml`
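
As a rough idea of what choosing a format means, the sketch below serialises one made-up record to JSON and to XML using only the Python standard library. The field names (`id`, `title`, `score`) and the helpers `to_json` and `to_xml` are assumptions for illustration and do not reflect the layout the BDFR actually writes. YAML is left out because it needs a third-party library.

```python
import json
import xml.etree.ElementTree as ET


def to_json(record):
    """Serialise a flat record as indented JSON."""
    return json.dumps(record, indent=2)


def to_xml(record, root_tag="submission"):
    """Serialise a flat record as XML, one child element per field."""
    root = ET.Element(root_tag)
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical record; the real archive contains far more detail.
record = {"id": "aaaaaa", "title": "Example post", "score": 42}
print(to_json(record))
print(to_xml(record))
```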

## Authentication and Security

The BDFR uses OAuth2 authentication to connect to Reddit if authentication is required. This is a secure, token-based system for making requests. It also means that the BDFR only has access to specific parts of the authenticated account: by default, only saved posts, upvoted posts, and the identity of the account. Note that authentication is not required unless accessing private data such as upvoted posts, saved posts, and private multireddits.

To authenticate, the BDFR will first look for a token in the configuration file that signals that there's been a previous authentication. If this is not there, then the BDFR will attempt to register itself with your account. This is normal, and if you run the program, it will pause and show a Reddit URL. Click on this URL and it will take you to Reddit, where the permissions being requested will be shown. Read this and **confirm that there are no more permissions than needed to run the program**. You should not grant unneeded permissions; by default, the BDFR only requests permission to read your saved or upvoted submissions and identify as you.

If the permissions look safe, confirm them, and the BDFR will save a token that will allow it to authenticate with Reddit from then on.
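
To make the "check the permissions" step concrete, the sketch below builds an authorisation URL of the general shape Reddit's OAuth2 flow uses and then reads the requested scopes back out of it. It is an illustration only. The BDFR generates the real URL for you, and the client ID, redirect URI, state value, and scope names shown here are placeholders assumed for the example; `build_authorize_url` and `requested_scopes` are hypothetical helpers.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

AUTHORIZE_ENDPOINT = "https://www.reddit.com/api/v1/authorize"


def build_authorize_url(client_id, scopes, redirect_uri, state):
    """Assemble an OAuth2 authorisation URL requesting the given scopes."""
    query = urlencode({
        "client_id": client_id,
        "response_type": "code",
        "state": state,
        "redirect_uri": redirect_uri,
        "duration": "permanent",
        "scope": " ".join(scopes),
    })
    return f"{AUTHORIZE_ENDPOINT}?{query}"


def requested_scopes(url):
    """Extract the scope list from an authorisation URL so it can be reviewed."""
    return parse_qs(urlsplit(url).query)["scope"][0].split()


# All values below are placeholders, not the BDFR's real settings.
url = build_authorize_url(
    client_id="YOUR_CLIENT_ID",
    scopes=["identity", "history", "read"],
    redirect_uri="http://localhost:8080",
    state="random-string",
)
print(requested_scopes(url))  # ['identity', 'history', 'read']
```

If the URL shown by the program requests scopes you do not recognise, do not approve it.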

## Changing Permissions

Most users will not need to do anything extra to use any of the current features. However, if additional features such as scraping messages, PMs, etc. are added in the future, these will require additional scopes. Additionally, advanced users may wish to use the BDFR with their own API key and secret. There is normally no need to do this, but it *is* allowed by the BDFR.

The configuration file for the BDFR contains the API secret and key, as well as the scopes that the BDFR will request when registering itself to a Reddit account via OAuth2. These can all be changed if the user wishes; however, do not do so unless you know what you are doing. The defaults are specifically chosen to have a very low security risk if your token were to be compromised, however unlikely that actually is. Never grant more permissions than you absolutely need.

For more details on the configuration file and the values therein, see [Configuration Files](#configuration).

## Folder and File Name Schemes

The naming and folder schemes for the BDFR are both completely customisable. A number of different fields can be given, which will be replaced with properties from a submission when downloading it. The scheme format takes the form of `{KEY}`, where `KEY` is a string from the list below.

  - `DATE`
  - `FLAIR`
  - `POSTID`
  - `REDDITOR`
  - `SUBREDDIT`
  - `TITLE`
  - `UPVOTES`

Each of these can be enclosed in curly brackets, `{}`, and included in the name. For example, to just title every downloaded post with the unique submission ID, you can use `{POSTID}`. Static strings can also be included, such as the `download_` prefix in `download_{POSTID}`, which will not change from submission to submission. For example, the previous string will result in the following submission file names:

  - `download_aaaaaa.png`
  - `download_bbbbbb.png`

At least one key *must* be included in the file scheme, otherwise an error will be thrown. The folder scheme, however, can be null or a simple static string. In the former case, all files will be placed in the folder specified with the `directory` argument. If the folder scheme is a static string, then all submissions will be placed in a folder of that name. In both cases, submissions will not be separated into different folders.

It is highly recommended that the file name scheme contain the parameter `{POSTID}` as this is **the only parameter guaranteed to be unique**. No combination of other keys is guaranteed to be unique, and a name collision may result in posts being skipped: the BDFR will see a file with the same name and skip the download, assuming that it has already been downloaded.
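
The substitution described in this section, and the collision risk when `{POSTID}` is left out, can be sketched as follows. This is a simplified illustration rather than the BDFR's own formatter; `apply_scheme` is a hypothetical helper and the two submissions are made up.

```python
import re

KEYS = {"DATE", "FLAIR", "POSTID", "REDDITOR", "SUBREDDIT", "TITLE", "UPVOTES"}


def apply_scheme(scheme, submission):
    """Replace each {KEY} in scheme with the matching submission property."""
    def substitute(match):
        key = match.group(1)
        if key not in KEYS:
            raise KeyError(f"unknown scheme key: {key}")
        return str(submission[key])
    return re.sub(r"\{(\w+)\}", substitute, scheme)


# Two different posts by the same author with the same title.
first = {"POSTID": "aaaaaa", "REDDITOR": "alice", "TITLE": "Sunset"}
second = {"POSTID": "bbbbbb", "REDDITOR": "alice", "TITLE": "Sunset"}

print(apply_scheme("download_{POSTID}", first))   # download_aaaaaa
print(apply_scheme("download_{POSTID}", second))  # download_bbbbbb

# Without {POSTID}, both posts map to the same name, so one would be skipped.
same = apply_scheme("{REDDITOR}_{TITLE}", first) == apply_scheme("{REDDITOR}_{TITLE}", second)
print(same)  # True
```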

## Configuration

The configuration files are, by default, stored in the configuration directory for the user. This differs depending on the OS that the BDFR is being run on. For Windows, this will be:

  - `C:\Documents and Settings\<User>\Application Data\Local Settings\BDFR\bulkredditdownloader` or
  - `C:\Documents and Settings\<User>\Application Data\BDFR\bulkredditdownloader`

On macOS, this will be:

  - `~/Library/Application Support/bulkredditdownloader`

Lastly, on a Linux system, this will be:

  - `~/.local/share/bulkredditdownloader`

The logging output for each run of the BDFR will be saved to this directory in the file `log_output.txt`. If you need to submit a bug, it is this file that you will need to submit with the report.

### Configuration File

The `config.cfg` file supplies the BDFR with the configuration to use. At the moment, the following keys **must** be included in the configuration file supplied.

  - `backup_log_count`
  - `client_id`
  - `client_secret`
  - `scopes`

None of these should be modified unless you know what you're doing, as the default values will enable the BDFR to function just fine. A configuration is included in the BDFR when it is installed, and this will be placed in the configuration directory as the default.
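
Before passing a custom file to `--config`, it can be worth checking that all four required keys are present. The sketch below does this with Python's `configparser`. It assumes the file is INI-style with the keys under a `[DEFAULT]` section, which is a guess about the layout and not something this README specifies; the values shown are placeholders and `missing_keys` is a hypothetical helper.

```python
import configparser

REQUIRED_KEYS = ("backup_log_count", "client_id", "client_secret", "scopes")

# Placeholder content; the section name and values are assumptions.
EXAMPLE = """
[DEFAULT]
client_id = YOUR_CLIENT_ID
client_secret = YOUR_CLIENT_SECRET
scopes = identity, history, read
backup_log_count = 3
"""


def missing_keys(text, section="DEFAULT"):
    """Return the required keys that are absent from the given config text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return [key for key in REQUIRED_KEYS if not parser.has_option(section, key)]


print(missing_keys(EXAMPLE))                        # []
print(missing_keys("[DEFAULT]\nclient_id = abc\n"))  # ['backup_log_count', 'client_secret', 'scopes']
```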

Most of these values have to do with OAuth2 configuration and authorisation. The key `backup_log_count`, however, controls log rollover. The logs in the configuration directory can be verbose and, for long runs of the BDFR, can grow quite large. To combat this, the BDFR will overwrite previous logs. This value determines how many previous run logs will be kept. The default is 3, which means that the BDFR will keep at most three past logs plus the current one. Any run past this will overwrite the oldest log file, a process called "rolling over". If you want more records of past runs, increase this number.
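
The rollover behaviour can be reproduced with Python's standard `logging.handlers.RotatingFileHandler`, as in the sketch below. Whether the BDFR uses this exact handler is an assumption; the code only demonstrates the "three past logs plus the current one" behaviour, and `run_once` is a hypothetical helper that stands in for a single run.

```python
import logging
import logging.handlers
import tempfile
from pathlib import Path


def run_once(log_path, message, backup_count=3):
    """Simulate one run: roll the previous log over, then write a fresh one."""
    had_previous_log = log_path.exists()
    handler = logging.handlers.RotatingFileHandler(log_path, backupCount=backup_count)
    if had_previous_log:
        # log_output.txt becomes .1, .1 becomes .2, and the oldest is dropped.
        handler.doRollover()
    logger = logging.getLogger("rollover_sketch")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.info(message)
    logger.removeHandler(handler)
    handler.close()


with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "log_output.txt"
    for run in range(6):
        run_once(log_path, f"run {run}")
    # Six runs, but only the current log and three backups remain.
    print(sorted(path.name for path in Path(tmp).iterdir()))
```

After the six simulated runs, `log_output.txt` holds the latest run and `log_output.txt.3` holds the oldest one still kept; the first two runs have been discarded.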

## Contributing

If you wish to contribute, see [Contributing](docs/CONTRIBUTING.md) for more information.

When reporting any issues or interacting with the developers, please follow the [Code of Conduct](docs/CODE_OF_CONDUCT.md).