Commit 94f060f: Cover Scrapy 1.8.0 in the release notes (#3952)

Gallaecio committed Oct 29, 2019
1 parent 18b808b commit 94f060f
Showing 3 changed files with 229 additions and 4 deletions.
226 changes: 225 additions & 1 deletion docs/news.rst
@@ -6,6 +6,209 @@ Release notes
.. note:: Scrapy 1.x will be the last series supporting Python 2. Scrapy 2.0,
planned for Q4 2019 or Q1 2020, will support **Python 3 only**.

.. _release-1.8.0:

Scrapy 1.8.0 (2019-10-28)
-------------------------

Highlights:

* Dropped Python 3.4 support and updated minimum requirements; made Python 3.8
support official
* New :meth:`Request.from_curl <scrapy.http.Request.from_curl>` class method
* New :setting:`ROBOTSTXT_PARSER` and :setting:`ROBOTSTXT_USER_AGENT` settings
* New :setting:`DOWNLOADER_CLIENT_TLS_CIPHERS` and
:setting:`DOWNLOADER_CLIENT_TLS_VERBOSE_LOGGING` settings

Backward-incompatible changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Python 3.4 is no longer supported, and some of the minimum requirements of
Scrapy have also changed:

* cssselect_ 0.9.1
* cryptography_ 2.0
* lxml_ 3.5.0
* pyOpenSSL_ 16.2.0
* queuelib_ 1.4.2
* service_identity_ 16.0.0
* six_ 1.10.0
* Twisted_ 17.9.0 (16.0.0 with Python 2)
* zope.interface_ 4.1.3

(:issue:`3892`)

* ``JSONRequest`` is now called :class:`~scrapy.http.JsonRequest` for
consistency with similar classes (:issue:`3929`, :issue:`3982`)

* If you are using a custom context factory
(:setting:`DOWNLOADER_CLIENTCONTEXTFACTORY`), its ``__init__`` method must
accept two new parameters: ``tls_verbose_logging`` and ``tls_ciphers``
(:issue:`2111`, :issue:`3392`, :issue:`3442`, :issue:`3450`)
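
  For example, a context factory that simply forwards everything, including
  the two new parameters, to the base class (a minimal sketch; a real
  factory may of course do more in ``__init__``)::

      from scrapy.core.downloader.contextfactory import ScrapyClientContextFactory

      class MyContextFactory(ScrapyClientContextFactory):
          def __init__(self, *args, **kwargs):
              # tls_verbose_logging and tls_ciphers arrive through kwargs
              super(MyContextFactory, self).__init__(*args, **kwargs)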

* :class:`~scrapy.loader.ItemLoader` now turns the values of its input item
into lists::

>>> item = MyItem()
>>> item['field'] = 'value1'
>>> loader = ItemLoader(item=item)
>>> item['field']
['value1']

This is needed to allow adding values to existing fields
(``loader.add_value('field', 'value2')``).
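
  Continuing the session above, the added value is appended (a sketch
  assuming the default ``Identity`` output processor)::

      >>> loader.add_value('field', 'value2')
      >>> loader.load_item()['field']
      ['value1', 'value2']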

(:issue:`3804`, :issue:`3819`, :issue:`3897`, :issue:`3976`, :issue:`3998`,
:issue:`4036`)

See also :ref:`1.8-deprecation-removals` below.


New features
~~~~~~~~~~~~

* A new :meth:`Request.from_curl <scrapy.http.Request.from_curl>` class
method allows :ref:`creating a request from a cURL command
<requests-from-curl>` (:issue:`2985`, :issue:`3862`)
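
  For example, with a command copied from a browser's developer tools (a
  minimal sketch; the URL and header are illustrative)::

      from scrapy import Request

      request = Request.from_curl(
          "curl 'https://example.org/api' -H 'Accept: application/json'"
      )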

* A new :setting:`ROBOTSTXT_PARSER` setting allows choosing which robots.txt_
parser to use. It includes built-in support for
:ref:`RobotFileParser <python-robotfileparser>`,
:ref:`Protego <protego-parser>` (default), :ref:`Reppy <reppy-parser>`, and
:ref:`Robotexclusionrulesparser <rerp-parser>`, and allows you to
:ref:`implement support for additional parsers
<support-for-new-robots-parser>` (:issue:`754`, :issue:`2669`,
:issue:`3796`, :issue:`3935`, :issue:`3969`, :issue:`4006`)
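
  For example, to keep using the pre-1.8 parser based on
  :ref:`RobotFileParser <python-robotfileparser>` (a settings sketch; the
  class path follows the 1.8 documentation)::

      # settings.py
      ROBOTSTXT_OBEY = True
      ROBOTSTXT_PARSER = "scrapy.robotstxt.PythonRobotParser"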

* A new :setting:`ROBOTSTXT_USER_AGENT` setting allows defining a separate
user agent string to use for robots.txt_ parsing (:issue:`3931`,
:issue:`3966`)
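
  For example (a sketch; both values are illustrative)::

      # settings.py
      USER_AGENT = "MyProject (+https://example.org)"
      ROBOTSTXT_USER_AGENT = "MyProject"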

* :class:`~scrapy.spiders.Rule` no longer requires a :class:`LinkExtractor
<scrapy.linkextractors.lxmlhtml.LxmlLinkExtractor>` parameter
(:issue:`781`, :issue:`4016`)
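
  For example, a :class:`~scrapy.spiders.CrawlSpider` rule that follows
  every extracted link (a sketch; the spider name and URL are
  illustrative)::

      from scrapy.spiders import CrawlSpider, Rule

      class ExampleSpider(CrawlSpider):
          name = "example"
          start_urls = ["https://example.org"]

          # With no LinkExtractor argument, a default one is used
          rules = (Rule(),)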

* Use the new :setting:`DOWNLOADER_CLIENT_TLS_CIPHERS` setting to customize
the TLS/SSL ciphers used by the default HTTP/1.1 downloader (:issue:`3392`,
:issue:`3442`)

* Set the new :setting:`DOWNLOADER_CLIENT_TLS_VERBOSE_LOGGING` setting to
``True`` to enable debug-level messages about TLS connection parameters
after establishing HTTPS connections (:issue:`2111`, :issue:`3450`)
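
  For example, combining this setting with
  :setting:`DOWNLOADER_CLIENT_TLS_CIPHERS` above (a settings sketch; the
  OpenSSL cipher string is illustrative)::

      # settings.py
      DOWNLOADER_CLIENT_TLS_CIPHERS = "ECDHE-RSA-AES128-GCM-SHA256"
      DOWNLOADER_CLIENT_TLS_VERBOSE_LOGGING = True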

* Callbacks that receive keyword arguments
(see :attr:`Request.cb_kwargs <scrapy.http.Request.cb_kwargs>`) can now be
tested using the new :class:`@cb_kwargs
<scrapy.contracts.default.CallbackKeywordArgumentsContract>`
:ref:`spider contract <topics-contracts>` (:issue:`3985`, :issue:`3988`)
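
  For example (a sketch; the URL and argument values are illustrative)::

      # a spider callback; the contract lines live in its docstring
      def parse_item(self, response, item_id):
          """ Parse an item page.

          @url https://example.org/item/1
          @cb_kwargs {"item_id": 1}
          @returns items 1 1
          """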

* When a :class:`@scrapes <scrapy.contracts.default.ScrapesContract>` spider
contract fails, all missing fields are now reported (:issue:`766`,
:issue:`3939`)

* :ref:`Custom log formats <custom-log-formats>` can now drop messages by
having the corresponding methods of the configured :setting:`LOG_FORMATTER`
return ``None`` (:issue:`3984`, :issue:`3987`)
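
  For example, to omit the messages logged when an item is dropped (a
  sketch along the lines of the logging documentation)::

      from scrapy import logformatter

      class PoliteLogFormatter(logformatter.LogFormatter):
          def dropped(self, item, exception, response, spider):
              # returning None omits the "dropped item" log message
              return None

  :setting:`LOG_FORMATTER` must then point to this class.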

* A much improved completion definition is now available for Zsh_
(:issue:`4069`)


Bug fixes
~~~~~~~~~

* :meth:`ItemLoader.load_item() <scrapy.loader.ItemLoader.load_item>` no
longer makes later calls to :meth:`ItemLoader.get_output_value()
<scrapy.loader.ItemLoader.get_output_value>` or
:meth:`ItemLoader.load_item() <scrapy.loader.ItemLoader.load_item>` return
empty data (:issue:`3804`, :issue:`3819`, :issue:`3897`, :issue:`3976`,
:issue:`3998`, :issue:`4036`)

* Fixed :class:`~scrapy.statscollectors.DummyStatsCollector` raising a
:exc:`TypeError` exception (:issue:`4007`, :issue:`4052`)

* :meth:`FilesPipeline.file_path
<scrapy.pipelines.files.FilesPipeline.file_path>` and
:meth:`ImagesPipeline.file_path
<scrapy.pipelines.images.ImagesPipeline.file_path>` no longer choose
file extensions that are not `registered with IANA`_ (:issue:`1287`,
:issue:`3953`, :issue:`3954`)

* When using botocore_ to persist files in S3, all botocore-supported headers
  are now properly mapped (:issue:`3904`, :issue:`3905`)

* FTP passwords in :setting:`FEED_URI` containing percent-escaped characters
are now properly decoded (:issue:`3941`)

* A memory-handling and error-handling issue in
:func:`scrapy.utils.ssl.get_temp_key_info` has been fixed (:issue:`3920`)


Documentation
~~~~~~~~~~~~~

* The documentation now covers how to define and configure a :ref:`custom log
format <custom-log-formats>` (:issue:`3616`, :issue:`3660`)

* API documentation added for :class:`~scrapy.exporters.MarshalItemExporter`
and :class:`~scrapy.exporters.PythonItemExporter` (:issue:`3973`)

* API documentation added for :class:`~scrapy.item.BaseItem` and
:class:`~scrapy.item.ItemMeta` (:issue:`3999`)

* Minor documentation fixes (:issue:`2998`, :issue:`3398`, :issue:`3597`,
:issue:`3894`, :issue:`3934`, :issue:`3978`, :issue:`3993`, :issue:`4022`,
:issue:`4028`, :issue:`4033`, :issue:`4046`, :issue:`4050`, :issue:`4055`,
:issue:`4056`, :issue:`4061`, :issue:`4072`, :issue:`4071`, :issue:`4079`,
:issue:`4081`, :issue:`4089`, :issue:`4093`)


.. _1.8-deprecation-removals:

Deprecation removals
~~~~~~~~~~~~~~~~~~~~

* ``scrapy.xlib`` has been removed (:issue:`4015`)


Deprecations
~~~~~~~~~~~~

* The LevelDB_ storage backend
(``scrapy.extensions.httpcache.LeveldbCacheStorage``) of
:class:`~scrapy.downloadermiddlewares.httpcache.HttpCacheMiddleware` is
deprecated (:issue:`4085`, :issue:`4092`)

* Use of the undocumented ``SCRAPY_PICKLED_SETTINGS_TO_OVERRIDE`` environment
variable is deprecated (:issue:`3910`)

* ``scrapy.item.DictItem`` is deprecated, use :class:`~scrapy.item.Item`
instead (:issue:`3999`)


Other changes
~~~~~~~~~~~~~

* Minimum versions of optional Scrapy requirements that are covered by
continuous integration tests have been updated:

* botocore_ 1.3.23
* Pillow_ 3.4.2

Lower versions of these optional requirements may work, but it is not
guaranteed (:issue:`3892`)

* GitHub templates for bug reports and feature requests (:issue:`3126`,
:issue:`3471`, :issue:`3749`, :issue:`3754`)

* Continuous integration fixes (:issue:`3923`)

* Code cleanup (:issue:`3391`, :issue:`3907`, :issue:`3946`, :issue:`3950`,
:issue:`4023`, :issue:`4031`)


.. _release-1.7.4:

Scrapy 1.7.4 (2019-10-21)
-------------------------

@@ -18,22 +221,31 @@ makes later calls to :meth:`ItemLoader.get_output_value()
<scrapy.loader.ItemLoader.get_output_value>` or :meth:`ItemLoader.load_item()
<scrapy.loader.ItemLoader.load_item>` return empty data.


.. _release-1.7.3:

Scrapy 1.7.3 (2019-08-01)
-------------------------

Enforce lxml 4.3.5 or lower for Python 3.4 (:issue:`3912`, :issue:`3918`).


.. _release-1.7.2:

Scrapy 1.7.2 (2019-07-23)
-------------------------

Fix Python 2 support (:issue:`3889`, :issue:`3893`, :issue:`3896`).


.. _release-1.7.1:

Scrapy 1.7.1 (2019-07-18)
-------------------------

Re-packaging of Scrapy 1.7.0, which was missing some changes in PyPI.


.. _release-1.7.0:

Scrapy 1.7.0 (2019-07-18)
Expand Down Expand Up @@ -568,7 +780,7 @@ Scrapy 1.5.2 (2019-01-22)

See :ref:`telnet console <topics-telnetconsole>` documentation for more info

- * Backport CI build failure under GCE environemnt due to boto import error.
+ * Backport CI build failure under GCE environment due to boto import error.

.. _release-1.5.1:

@@ -2830,23 +3042,35 @@ First release of Scrapy.


.. _AJAX crawleable urls: https://developers.google.com/webmasters/ajax-crawling/docs/getting-started?csw=1
.. _botocore: https://github.com/boto/botocore
.. _chunked transfer encoding: https://en.wikipedia.org/wiki/Chunked_transfer_encoding
.. _ClientForm: http://wwwsearch.sourceforge.net/old/ClientForm/
.. _Creating a pull request: https://help.github.com/en/articles/creating-a-pull-request
.. _cryptography: https://cryptography.io/en/latest/
.. _cssselect: https://github.com/scrapy/cssselect/
.. _docstrings: https://docs.python.org/glossary.html#term-docstring
.. _KeyboardInterrupt: https://docs.python.org/library/exceptions.html#KeyboardInterrupt
.. _LevelDB: https://github.com/google/leveldb
.. _lxml: http://lxml.de/
.. _marshal: https://docs.python.org/2/library/marshal.html
.. _parsel.csstranslator.GenericTranslator: https://parsel.readthedocs.io/en/latest/parsel.html#parsel.csstranslator.GenericTranslator
.. _parsel.csstranslator.HTMLTranslator: https://parsel.readthedocs.io/en/latest/parsel.html#parsel.csstranslator.HTMLTranslator
.. _parsel.csstranslator.XPathExpr: https://parsel.readthedocs.io/en/latest/parsel.html#parsel.csstranslator.XPathExpr
.. _PEP 257: https://www.python.org/dev/peps/pep-0257/
.. _Pillow: https://python-pillow.org/
.. _pyOpenSSL: https://www.pyopenssl.org/en/stable/
.. _queuelib: https://github.com/scrapy/queuelib
.. _registered with IANA: https://www.iana.org/assignments/media-types/media-types.xhtml
.. _resource: https://docs.python.org/2/library/resource.html
.. _robots.txt: http://www.robotstxt.org/
.. _scrapely: https://github.com/scrapy/scrapely
.. _service_identity: https://service-identity.readthedocs.io/en/stable/
.. _six: https://six.readthedocs.io/
.. _tox: https://pypi.python.org/pypi/tox
.. _Twisted: https://twistedmatrix.com/trac/
.. _Twisted - hello, asynchronous programming: http://jessenoller.com/blog/2009/02/11/twisted-hello-asynchronous-programming/
.. _w3lib: https://github.com/scrapy/w3lib
.. _w3lib.encoding: https://github.com/scrapy/w3lib/blob/master/w3lib/encoding.py
.. _What is cacheable: https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.1
.. _zope.interface: https://zopeinterface.readthedocs.io/en/latest/
.. _Zsh: https://www.zsh.org/
5 changes: 3 additions & 2 deletions docs/topics/logging.rst
@@ -198,8 +198,9 @@ to override some of the Scrapy settings regarding logging.
Custom Log Formats
------------------

- A custom log format can be set for different actions by extending :class:`~scrapy.logformatter.LogFormatter` class
- and making :setting:`LOG_FORMATTER` point to your new class.
+ A custom log format can be set for different actions by extending the
+ :class:`~scrapy.logformatter.LogFormatter` class and making
+ :setting:`LOG_FORMATTER` point to your new class.

.. autoclass:: scrapy.logformatter.LogFormatter
:members:
2 changes: 1 addition & 1 deletion scrapy/logformatter.py
@@ -29,7 +29,7 @@ class LogFormatter(object):
* ``args`` should be a tuple or dict with the formatting placeholders for ``msg``.
The final log message is computed as ``msg % args``.
- Users can define their own ``LogFormatter`` class if they want to customise how
+ Users can define their own ``LogFormatter`` class if they want to customize how
each action is logged or if they want to omit it entirely. In order to omit
logging an action the method must return ``None``.
