The Access Control List (ACL) used when storing items to Google Cloud Storage. For more information on how to set this value, please refer to the column JSON API in Google Cloud documentation.

The Project ID that will be used when storing data on Google Cloud Storage.

A dict containing the item pipelines to use, and their orders. Order values are arbitrary, but it is customary to define them in the 0-1000 range. Lower orders process before higher orders.
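To illustrate how order values determine processing order, here is a minimal sketch. It assumes the dict described above corresponds to Scrapy's ITEM_PIPELINES setting; the pipeline class paths and the processing_order helper are hypothetical and exist only for this example.

```python
# Hypothetical pipeline class paths; only the integer order values matter here.
ITEM_PIPELINES = {
    "myproject.pipelines.ExportPipeline": 800,
    "myproject.pipelines.ValidatePipeline": 100,
    "myproject.pipelines.DedupePipeline": 300,
}


def processing_order(pipelines):
    """Return pipeline paths sorted so that lower order values come first."""
    return [path for path, _ in sorted(pipelines.items(), key=lambda kv: kv[1])]


# Items would pass through the pipelines in this sequence.
for path in processing_order(ITEM_PIPELINES):
    print(path)
```

Leaving gaps between order values (100, 300, 800 rather than 1, 2, 3) makes it easy to slot in another pipeline later without renumbering the rest.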

File name to use for logging output. If None, standard error will be used.

String for formatting log messages. Refer to the Python logging documentation for the whole list of available placeholders.

String for formatting the date and time in log messages. Refer to the Python datetime documentation for the whole list of available directives.

The class to use for formatting log messages for different actions.
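The placeholders and directives mentioned above are those of the Python standard library, so they can be tried out with logging.Formatter directly. The sketch below uses example format strings chosen for illustration; the variable names LOG_FORMAT and LOG_DATEFORMAT are assumed to match the corresponding Scrapy settings, and the logger name "demo" is made up.

```python
import logging

# Example values, chosen for illustration.
LOG_FORMAT = "%(asctime)s [%(name)s] %(levelname)s: %(message)s"  # logging placeholders
LOG_DATEFORMAT = "%Y-%m-%d %H:%M:%S"  # strftime-style directives

formatter = logging.Formatter(fmt=LOG_FORMAT, datefmt=LOG_DATEFORMAT)

# Build a record by hand so the example does not depend on handler configuration.
record = logging.LogRecord(
    name="demo",
    level=logging.INFO,
    pathname="demo.py",
    lineno=1,
    msg="Spider opened",
    args=None,
    exc_info=None,
)

line = formatter.format(record)
print(line)
```

The printed line starts with a timestamp rendered according to the date format, followed by the logger name, level, and message.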

Minimum level to log. Available levels are: CRITICAL, ERROR, WARNING, INFO, DEBUG. For more info see Logging.

If True, all standard output (and error) of your process will be redirected to the log. For example, if you call print('hello'), the text will appear in the Scrapy log.
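The effect of a minimum level can be checked with the standard logging module, whose level names match the list above. This is a minimal sketch, not Scrapy's own code: the logger name "loglevel_demo" is made up, and WARNING is an arbitrary choice of threshold.

```python
import logging

# Level names as listed above, from most to least severe.
LEVELS = ["CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG"]

logger = logging.getLogger("loglevel_demo")
logger.setLevel(logging.WARNING)  # plays the role of the minimum level to log

# Collect the levels at which this logger would actually emit a message.
emitted = [name for name in LEVELS if logger.isEnabledFor(getattr(logging, name))]
print(emitted)
```

Messages below the chosen threshold (INFO and DEBUG in this case) are dropped before they reach any handler.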

If True, the logs will contain only the root path. If False, they display the component responsible for the log output.

The interval (in seconds) between each logging printout of the stats by LogStats.

When memory debugging is enabled, a memory report will be sent to the specified addresses if this setting is not empty; otherwise the report will be written to the log.

The memory usage extension keeps track of the peak memory used by the process and writes it to the stats.

See Memory usage extension. If the memory limit is zero, no check will be performed. If the warning threshold is zero, no warning will be produced.

Module where to create new spiders using the genspider command.

Randomizing the download delay decreases the chance of the crawler being detected (and subsequently blocked) by sites which analyze requests looking for statistically significant similarities in the time between requests.
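A minimal sketch of such a randomized wait follows. It assumes the wait is drawn uniformly between 0.5 and 1.5 times a base delay, which is how wget's --random-wait option behaves; the function name randomized_delay and the 2-second base delay are hypothetical.

```python
import random


def randomized_delay(base_delay, rng=random):
    """Draw a wait time uniformly from [0.5 * base_delay, 1.5 * base_delay]."""
    return rng.uniform(0.5 * base_delay, 1.5 * base_delay)


base = 2.0  # assumed base delay in seconds
samples = [randomized_delay(base) for _ in range(5)]
print([round(s, 2) for s in samples])
```

Because each wait differs, the gaps between consecutive requests no longer form a fixed, easily recognized interval, while the average wait stays at the base delay.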

The randomization policy is the same as the one used by the wget --random-wait option.

The maximum limit for the Twisted Reactor thread pool size. This is a common multi-purpose thread pool used by various Scrapy components: Threaded DNS Resolver, BlockingFeedStorage, and S3FilesStore, just to name a few.

For more information see RobotsTxtMiddleware. While the default value is False for historical reasons, this option is enabled by default in settings.

The parser backend to use for parsing robots.txt files.



