Releases: scrapy/scrapy
2.4.0
Highlights:
- Python 3.5 support has been dropped.
- The file_path method of media pipelines can now access the source item. This allows you to set a download file path based on item data.
- The new item_export_kwargs key of the FEEDS setting lets you define keyword parameters to pass to item exporter classes.
- You can now choose whether feed exports overwrite or append to the output file. For example, when using the crawl or runspider commands, you can use the -O option instead of -o to overwrite the output file.
- Zstd-compressed responses are now supported if zstandard is installed.
- In settings, where the import path of a class is required, it is now possible to pass a class object instead.
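As a rough illustration of the feed export and class-object items above, a settings.py fragment might look like the sketch below. The output path items.json and the ExamplePipeline class are hypothetical, and export_empty_fields is just one keyword that the built-in item exporters accept.

```python
# settings.py (sketch): options described in the 2.4.0 notes.


class ExamplePipeline:
    """Hypothetical item pipeline, defined here only to keep the sketch self-contained."""

    def process_item(self, item, spider):
        return item


FEEDS = {
    # "items.json" is a placeholder output path.
    "items.json": {
        "format": "json",
        # Replace the output file on each run instead of appending to it,
        # comparable to using -O rather than -o on the command line.
        "overwrite": True,
        # Keyword arguments forwarded to the item exporter class.
        "item_export_kwargs": {"export_empty_fields": True},
    },
}

# A class object can be given where an import path string was required before.
ITEM_PIPELINES = {ExamplePipeline: 300}
```

In a real project the pipeline would normally live in its own module and be imported into settings.py rather than defined there.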
2.3.0
Highlights:
- Feed exports now support Google Cloud Storage as a storage backend.
- The new FEED_EXPORT_BATCH_ITEM_COUNT setting lets you deliver output items in batches of up to the specified number of items. It also serves as a workaround for delayed file delivery, which causes Scrapy to only start item delivery after the crawl has finished when using certain storage backends (S3, FTP, and now GCS).
- The base implementation of item loaders has been moved into a separate library, itemloaders, allowing usage from outside Scrapy and a separate release schedule.
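A minimal sketch of batch delivery to Google Cloud Storage, assuming a hypothetical bucket (example-bucket) and project ID (example-project). The batch size of 100 is arbitrary, and uri_template is a helper variable introduced here for illustration, not a Scrapy setting. With batching enabled, the feed URI needs a placeholder such as %(batch_id)d so that each batch gets its own file.

```python
# settings.py (sketch): batched feed export to a hypothetical GCS bucket.

# Start a new output file after every 100 items (arbitrary value).
FEED_EXPORT_BATCH_ITEM_COUNT = 100

# Placeholder Google Cloud project ID.
GCS_PROJECT_ID = "example-project"

# Hypothetical bucket and path; %(batch_id)d keeps each batch in a separate file.
uri_template = "gs://example-bucket/exports/items-%(batch_id)d.json"

FEEDS = {
    uri_template: {"format": "json"},
}

# Illustration only: printf-style expansion of the placeholder for batch 1.
first_batch_uri = uri_template % {"batch_id": 1}
```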
2.2.1
The startproject command no longer makes unintended changes to the permissions of files in the destination folder, such as removing execution permissions.
2.2.0
Highlights:
- Python 3.5.2+ is required now
- dataclass objects and attrs objects are now valid item types
- New TextResponse.json method
- New bytes_received signal that allows canceling response download
- CookiesMiddleware fixes
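To illustrate the dataclass item support mentioned above, here is a small sketch. The Product class and its fields are hypothetical, and the standard-library dataclasses module requires Python 3.7 or later.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Product:
    """Hypothetical item type; the field names are illustrative only."""

    name: str
    price: Optional[float] = None
    tags: List[str] = field(default_factory=list)


# A spider callback could yield Product instances in place of dicts or Item objects.
item = Product(name="Example", price=9.99)

# Plain-dict view of the item, handy for a quick inspection.
item_as_dict = asdict(item)
```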
2.1.0
Highlights:
- New FEEDS setting to export to multiple feeds
- New Response.ip_address attribute
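A minimal sketch of the FEEDS setting exporting the same items to two feeds. The file names and the field list are placeholders chosen for illustration.

```python
# settings.py (sketch): one crawl, two output feeds.
from pathlib import Path

FEEDS = {
    # JSON feed with pretty-printed output.
    "items.json": {"format": "json", "encoding": "utf8", "indent": 4},
    # CSV feed restricted to two hypothetical item fields; keys may be Path objects.
    Path("items.csv"): {"format": "csv", "fields": ["name", "price"]},
}
```

Each key is an output URI or path, and each value holds the options for that feed only, so the two outputs can differ in format and in the fields they contain.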
2.0.1
2.0.0
Highlights:
- Python 2 support has been removed
- Partial coroutine syntax support and experimental asyncio support
- New Response.follow_all method
- FTP support for media pipelines
- New Response.certificate attribute
- IPv6 support through DNS_RESOLVER
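Two of the items above are opt-in through settings. The fragment below is a sketch under the assumption that a project wants both the IPv6-capable resolver and the asyncio reactor. Neither is enabled by default in this release, and asyncio support is described above as experimental.

```python
# settings.py (sketch): opt-in settings related to the 2.0.0 notes.

# Resolver with IPv6 support, selected through DNS_RESOLVER.
DNS_RESOLVER = "scrapy.resolver.CachingHostnameResolver"

# Install the asyncio-based Twisted reactor to try the experimental asyncio support.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
```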