New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Separate timeout errors by type #850
base: master
Are you sure you want to change the base?
Conversation
Excon used to raise Excon::Error::Timeout for any type of timeout. This commit splits this error into multiple specific ones, with the following inheritance hierarchy: Timeout |_ReadTimeout |_WriteTimeout |_ConnectTimeout |_ConnectReadTimeout |_ConnectWriteTimeout
def self.described_as(human_name) | ||
new("#{human_name.to_s.tr('_', ' ')} timeout reached") | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this to be the most elegant way to give us a little flexibility while maintaining DRY-ness, and keeping exceptions completely vanilla otherwise. (Not overwriting their initialize
, etc).
Also, I was tempted to put .to_s.tr(…)
into the caller, but held off on it for now. Curious what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wasn't totally sure about this part. Other than request timeouts, I think they all just use their type for the human name. And especially if we made request timeouts their own thing (as I suggest in another comment), even that distinction is no longer true. At that point, it seems like it might be simpler to drop the human_name
and described_as
parts and move it all up into the by_type
method as you suggest. What do you think?
def self.by_type(type, human_name = type) | ||
case type | ||
when :connect_read | ||
Excon::Errors::ConnectReadTimeout.described_as(human_name) | ||
when :connect_write | ||
Excon::Errors::ConnectWriteTimeout.described_as(human_name) | ||
when :read | ||
Excon::Errors::ReadTimeout.described_as(human_name) | ||
when :write | ||
Excon::Errors::WriteTimeout.described_as(human_name) | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if by_type
is clear enough. It seems to make sense in caller context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems pretty good. The only further thing that comes to mind would be to tweak it toward something like new_by_type
since it is kind of a specialized constructor that might make it a little more obvious the intention. What do you think?
@@ -364,6 +364,9 @@ def select_with_timeout(socket, type) | |||
if @data.include?(:deadline) | |||
request_timeout = request_time_remaining | |||
|
|||
# If request timeout is already reached, there's no need to proceed. | |||
raise(Excon::Errors::Timeout.by_type(type, 'request')) if request_timeout <= 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, while we are at it, that it might actually make sense to break this out into Excon::Errors::RequestTimeout (another child of Timeout I would think). That would be inline with the other changes I think, and again provide a bit more information to users (who can also just deal with the parent class as they used to if they prefer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only concern is that if we do that, we lose the benefit of being able to tell which specific timeout caused the overall request timeout. It would be powerful if I could set timeout on the entire request, but retry only on connect timeouts.
That's why I opted to raise specific exceptions, while describing them in the message as 'request timeout reached'.
Another alternative is to repeat the whole hierarchy for Request, and raise one of the request exceptions when it's supposed to be a request timeout:
Timeout
|_ReadTimeout
|_WriteTimeout
|_ConnectTimeout
|_ConnectReadTimeout
|_ConnectWriteTimeout
|_RequestTimeout
|_RequestReadTimeout
|_RequestWriteTimeout
|_RequestConnectTimeout
|_RequestConnectReadTimeout
|_RequestConnectWriteTimeout
this seems a little excessive however.
Are you sure you'd want to hide the specific timeout error behind RequestTimeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to mention this detail: there is already an exception called RequestTimeout
(code 408), which would be in conflict if we did this.
Edit: I'll wait for us to make this decision before we proceed, since this affects a few other things. (No rush or pressure, I'm already surprised at how responsive you are.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, right...
For some context, if I recall correctly the request timeout here is harder to pin down to a particular cause in the same way.
ie I expect read timeout is specifically that I started waiting at the beginning of a read and it took longer than I wanted, so I raised. Same thing for write or connect errors.
I believe the intention with the error we are talking about here is actually a timeout for the entirety of the request. So the timer starts as soon as the request process begins and it times out as soon as the limit is surpassed. That being said, it's possible for instance that the timeout actually would occur in between the other activities during excon's own processing. ie if connect took just slightly less than the request limit, the timeout would technically occur in between connect and write. In that case, it wouldn't really be correct to say that the timout was either connect or write (and hence the generic error we see here).
Does that make sense?
All that being said, I'm not sure that "request" is a very good descriptor of these expectations. Maybe it should be something like "OverallTimeout" or "TotalTimeout" or something (I'm definitely open to suggestions). It's also possible that this is more trouble than it's worth and something like what you have done is good enough. What do you think?
I'm glad to happily surprise you in terms of responsiveness, though paradoxically I was feeling bad for how long it took to get back to you...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also possible that this is more trouble than it's worth and something like what you have done is good enough.
Could be, but you're right that we have to think this through.
it's possible for instance that the timeout actually would occur in between the other activities
It's true, we only check monotonic clock once per operation, so we can't know when the deadline was actually crossed. Could've been during the beginning of the current phase, the end of the previous one, or in between.
But if you think about it, for "RequestTimeout" we don't actually care exactly when timeout occurred. We only care which phases of request completed successfully at the moment of raising this error. So maybe we indeed should go with something like TotalTimeout
(better name pending), but add a piece of metadata on it, like:
rescue Excon::Errors::TotalTimeout => e
puts e.completed_request_phases # => [:connect_write, :connect_read, :write]
end
If request timeout is not set by the user, then we fallback to specific timeout errors. I don't know the code deep enough to tell if this is easy to do, but at least it avoids the need for impossible timeout precision. How does this sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me. Naming wise maybe we could combine the old and new ideas, so something like OverallRequestTimeout
or TotalRequestTimeout
? I kind of like overall between those I think. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OverallRequestTimeout
looks like a good option. We could also consider a different angle: RequestDurationTimeout
, RequestSpanTimeout
. What do you think about these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, yeah. I really like RequestDurationTimeout
. I think that is probably my new favorite, great idea!
I like it so far, offered a few suggestions as well. Thanks! Breaking Changes
Tests
|
@geemus Couple of updates.
|
Excon used to raise
Excon::Error::Timeout
for any type of timeout. This PR splits this error into multiple specific ones, with the following inheritance hierarchy:Since all timeout errors inherit from
Excon::Error::Timeout
, we maintain backwards compatibility.A bonus benefit is that even when we raise a generic "request timeout reached" we are still raising the correct type of exception, such that a user can tell the phase of the request in which timeout actually occurred. This makes the request timeout feature more powerful.
Breaking changes
DEFAULT_RETRY_ERRORS
no longer includes read and write timeouts, only connect. It seems to me the safer option, but I wonder how you feel about making this change here.Questions
We could probably use more tests. If you think we should, where should I add them? (test, tests, or spec)?