Status code must be an integer value between 1xx and 5xx #271
Comments
It looks like some code somewhere is trying to create a PSR-7 response with a status code outside that range. It is probably a status code 0 returned by curl when there are connection problems (DNS, proxy, SSL, IP, etc.).
Ok, any idea how I can fix this problem? If I lower the maximum crawler count it works, but I need all the pages to be crawled. And if I remove the maximum crawler count, then I get the same error.
Guzzle has recently added a check to enforce those numbers. 😞
I reckon it could be related to this (bot blocking): the check below will throw an error on code 999, as it's outside the specified range. It should really throw a dedicated error for blocked bots or code 999: guzzlehttp/psr7/src/Response.php
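To make the range check under discussion concrete, here is a minimal sketch in plain PHP. It does not use Guzzle: `assertValidStatusCode()` and `isValidStatusCode()` are hypothetical helpers that only mirror the kind of validation described above, and 0 and 999 are the codes mentioned in this thread.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Guzzle): rejects status codes outside
 * 100-599, similar in spirit to the check that produces this error.
 */
function assertValidStatusCode(int $code): void
{
    if ($code < 100 || $code >= 600) {
        throw new InvalidArgumentException(
            'Status code must be an integer value between 1xx and 5xx.'
        );
    }
}

/** Returns true when the code passes the check, false otherwise. */
function isValidStatusCode(int $code): bool
{
    try {
        assertValidStatusCode($code);
        return true;
    } catch (InvalidArgumentException $e) {
        return false;
    }
}

// 0 (curl connection failure) and 999 (bot blocking) are both rejected.
foreach ([200, 404, 0, 999] as $code) {
    echo $code, ' => ', isValidStatusCode($code) ? 'valid' : 'rejected', PHP_EOL;
}
```

Pre-validating the code like this before a response object is built is one of the options raised later in the thread; whether it fits depends on where the response is created.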
Same here. The issue is that all the crawling is done in a queue, and this exception is not intercepted by the Guzzle Pool class (it only intercepts RequestException), so it bubbles up and can't be handled through the crawlFailed() method in this package. Unfortunately, I think this can only be solved by Guzzle itself (by catching any exception and passing it to another callback, I think). @freekmurze any thoughts on this?
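The bubbling behaviour described above can be sketched in plain PHP without Guzzle. Everything here is a stand-in: `FakeRequestException` plays the role of RequestException, `runPool()` is a hypothetical simplification of a pool that reports only that exception type, and the `$onRejected` callback stands in for crawlFailed().

```php
<?php
declare(strict_types=1);

// Stand-in for Guzzle's RequestException; hypothetical, for illustration only.
class FakeRequestException extends RuntimeException {}

/**
 * Hypothetical stand-in for a request pool: runs each task and reports
 * only FakeRequestException to the $onRejected callback.
 */
function runPool(array $tasks, callable $onRejected): void
{
    foreach ($tasks as $key => $task) {
        try {
            $task();
        } catch (FakeRequestException $e) {
            $onRejected($key, $e); // analogous to crawlFailed() being called
        }
        // Any other exception type is not caught here and aborts the loop.
    }
}

$failed = [];    // keys reported through the callback
$bubbled = null; // message of the exception that escaped the pool

try {
    runPool(
        [
            'a' => function () { throw new FakeRequestException('connection refused'); },
            'b' => function () {
                throw new InvalidArgumentException(
                    'Status code must be an integer value between 1xx and 5xx.'
                );
            },
            'c' => function () { /* never reached once 'b' throws */ },
        ],
        function ($key, Throwable $e) use (&$failed) { $failed[] = $key; }
    );
} catch (InvalidArgumentException $e) {
    $bubbled = $e->getMessage();
}

var_dump($failed, $bubbled);
```

In this sketch only task 'a' reaches the failure callback; the InvalidArgumentException from 'b' escapes the pool entirely and stops the remaining work, which matches the symptom reported here.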
I'm thinking this should be changed at Guzzle's end indeed.
Brought this up to the guzzle team with a possible solution here: guzzle/guzzle#2534
Just to jump in, as I've been debugging this this evening (my comments here: guzzle/guzzle#2534): this is a core bug in my opinion, caused by the PSR7 update to the HTTP Response object (which validates the status codes).
Essentially these exceptions are thrown inside the Curl Multi header callback which generates the response. They should be caught and treated as an error during header collection - currently they aren't, and instead the exception is causing no end of trouble. (It goes beyond just occasionally stalling execution - in my tests it's actually blocking requests that would otherwise complete, marking them as failed when they are not - presumably because curl handles aren't being regenerated or similar.)
There is no simple (read: not a bodge) solution to this problem as far as I can tell. Curl relies on an appropriate response (-1 on error, not an exception) to handle the execution flow. These exceptions are breaking something within the core, and Curl Multi has always been a little rogue within PHP too - couple that with some nice race conditions on the promise state and you've got a headache!
A roll-back of the PSR7 changes to the Response object plus a try/catch to pick up any other exceptions (see my notes in the other thread) would solve this. Alternatively, avoiding the PSR7 Response altogether (unlikely), or catching the assertion exceptions / pre-validating the status code and handling these gracefully, are all doable, but it depends what Guzzle wants in the core.
Anyway, no immediate solution unfortunately. This is blocking some internal tech though, so I'm on board with finding a fix for this.
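As a rough illustration of the "-1 on error, not an exception" point, here is a sketch of a curl header callback wrapper in plain PHP. It is not Guzzle's code or the eventual patch: `makeSafeHeaderCallback()` and the status-line handler are hypothetical, and the only curl behaviour relied on is that a CURLOPT_HEADERFUNCTION callback returning something other than the number of bytes it was given makes curl abort that transfer.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical wrapper: turns exceptions thrown by $handler into curl's
 * "abort" signal instead of letting them escape the header callback.
 * $handler is assumed to be a callable(string $headerLine): void.
 */
function makeSafeHeaderCallback(callable $handler, ?Throwable &$lastError): callable
{
    return function ($ch, string $headerLine) use ($handler, &$lastError): int {
        try {
            $handler($headerLine);
        } catch (Throwable $e) {
            $lastError = $e; // keep the error so the caller can report it
            return -1;       // length mismatch tells curl to abort this transfer
        }
        return strlen($headerLine); // normal case: all bytes consumed
    };
}

$error = null;
$callback = makeSafeHeaderCallback(function (string $line): void {
    // Illustrative handler: reject out-of-range codes found in a status line.
    if (preg_match('#^HTTP/\S+\s+(\d+)#', $line, $m)
        && ((int) $m[1] < 100 || (int) $m[1] >= 600)) {
        throw new InvalidArgumentException(
            'Status code must be an integer value between 1xx and 5xx.'
        );
    }
}, $error);

// With a real handle: curl_setopt($ch, CURLOPT_HEADERFUNCTION, $callback);
// Here the callback is invoked directly, so no network access is needed.
$ok  = $callback(null, "HTTP/1.1 200 OK\r\n");
$bad = $callback(null, "HTTP/1.1 999 Request denied\r\n");
```

With this shape the invalid status surfaces as an ordinary failed transfer for that one handle, and the stored exception can be attached to whatever error is reported afterwards.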
Hey @williamjulianvicary, thanks for taking the time to answer here too.
For me a |
This is an intermittent problem that you'll hit on the 6.5.2 branch (latest branch for v6). I've submitted a pull request, linked above, that solves this issue - I'm maintaining a fork if you want to alias to that in the meantime!
Thanks @williamjulianvicary, just confirming your patch appears to have fixed this issue for me :)
Just in case someone still looks for detailed solution:
then add in
then run |
Thanks for the patch!
@lilessam: I added the patch to a generated OpenAPI PHP client (that uses guzzle as composer dependency).
{
  // ...
  "require": {
    "php": ">=7.1",
    "ext-curl": "*",
    "ext-json": "*",
    "ext-mbstring": "*",
    "guzzlehttp/guzzle": "^6.2",
    "cweagans/composer-patches": "^1.6"
  },
  "require-dev": {
    "phpunit/phpunit": "^7.4",
    "squizlabs/php_codesniffer": "~2.6",
    "friendsofphp/php-cs-fixer": "~2.12"
  },
  "autoload": {
    "psr-4": { "OpenAPI\\Client\\" : "lib/" }
  },
  "autoload-dev": {
    "psr-4": { "OpenAPI\\Client\\" : "test/" }
  },
  "extra": {
    "patches": {
      "guzzlehttp/guzzle": {
        "Status code must be an integer value between 1xx and 5xx": "https://patch-diff.githubusercontent.com/raw/guzzle/guzzle/pull/2591.patch"
      }
    }
  }
}
Edit: I had to set |
For anyone tracking this, my PR was merged and is going to be released with the Guzzle 7.2 release! :-)
Fantastic, thanks for letting us know @williamjulianvicary
@williamjulianvicary very nice work indeed! Thanks!
@williamjulianvicary thank you!
Hello,
I receive this answer when I'm using the crawler like this: