2.6.0 breaks calling multiple Spider in CrawlerProcess() #5435
Comments
Is it only reproducible if the spider class is the same?
No, it happens even if a different spider class is used.
The following is the last traceback.
#5436 (to be included in Scrapy 2.6.2)
Hi, when will 2.6.2 be released? My personal project is using it. Thanks
I have given an ETA a few times, all of them missed, so I think “soon” is the best I can do without lying again. #5525 might delay the release a bit further if it is confirmed to be a breaking change introduced in 2.6.
I switched to the Git branch in requirements.txt in the meantime. It fixes the issue (as expected). I know that feeling, so "soon" is good enough for me at the moment.
Hi! We are also wondering when this fix will be released in 2.6.2. We upgraded to Scrapy 2.6.1 to fix several vulnerabilities (https://security.snyk.io/vuln/SNYK-PYTHON-SCRAPY-2414471), but the upgrade broke this functionality.
You are probably aware, but just in case: there is a middle ground, installing the 2.6 branch from Git and pinning the latest commit until 2.6.2 is released. For example:
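The original example was not preserved above; one way to do this (the commit hash shown is a placeholder for whatever the tip of the 2.6 branch currently is) is a requirements.txt line using pip's VCS-URL syntax:

```shell
# requirements.txt — pin the 2.6 branch at a specific commit until 2.6.2 ships.
# <commit-hash> is a placeholder; substitute the latest commit on the 2.6 branch.
git+https://github.com/scrapy/scrapy.git@<commit-hash>#egg=Scrapy
```

Pinning a hash rather than the branch name keeps the install reproducible, at the cost of not picking up further fixes automatically.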
Hi @Gallaecio, thanks for the suggestion. While it is true that this is a compromise to consider, the problem with that approach is that we would be unable to track possible future vulnerabilities via Snyk if we pin to a Git commit hash. Are you aware of any ETA for the 2.6.2 release? Or is there currently no plan to release anything?
There is no ETA, but we do plan on releasing it. We have a few things we want to include in 2.6.2 before release, and the maintainers who need to review them are short on time, which is why we have been delaying.
Description
Since 2.6.0, running multiple spiders from CrawlerProcess() is broken, even when following the common practices documentation:
https://docs.scrapy.org/en/latest/topics/practices.html#running-multiple-spiders-in-the-same-process
Steps to Reproduce
Expected behavior: the spiders run successfully. The following is the result from Scrapy 2.5.1.
Actual behavior: the spider fails with twisted.internet.error.ReactorAlreadyInstalledError.
Reproduces how often: always.
Versions
Scrapy : 2.6.1
lxml : 4.8.0.0
libxml2 : 2.9.12
cssselect : 1.1.0
parsel : 1.6.0
w3lib : 1.22.0
Twisted : 22.1.0
Python : 3.9.10 (main, Jan 17 2022, 08:36:28) - [GCC 11.2.1 20210728 (Red Hat 11.2.1-1)]
pyOpenSSL : 22.0.0 (OpenSSL 1.1.1m 14 Dec 2021)
cryptography : 36.0.1
Platform : Linux-3.10.0-1160.59.1.el7.x86_64-x86_64-with-glibc2.17
Additional context
The intention of using the same MySpider class from CrawlerProcess is to call Scrapy programmatically with different initial URLs and some tweaks to the parser depending on the initial URL.
I think this is a very fair usage pattern, and it worked fine before 2.6.0.