This repository has been archived by the owner on Aug 16, 2021. It is now read-only.

First-class support to pass multiple Errors upstream #293

Open
ksf opened this issue Jan 8, 2019 · 4 comments


@ksf

ksf commented Jan 8, 2019

As a motivating example, let's say I want to create a socket file and pass its path to a child process so that it can talk to me via Unix sockets, with all functions returning Results:

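let sock_path = create_socket()?;
// Borrow the path so it is still available for cleanup afterwards.
let res = spawn_wait_child(&sock_path);
// If this fails, `?` returns early and any error held in `res` is lost.
delete_socket(&sock_path)?;
return res;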

Now suppose that a couple of umounts happen directly after create_socket, so that first the child executable cannot be found and then delete_socket fails, too. The user will never see the error arising from spawn_wait_child, because the early return on the delete_socket failure discards res. Ideally, what I'd want to write is something like this:

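let sock_path = create_socket()?;
let res = spawn_wait_child(&sock_path);
// Always run cleanup, but keep its result instead of returning early.
let cleanup = delete_socket(&sock_path);
// `as_well_as` is the proposed combinator; it does not exist yet.
return res.as_well_as(cleanup);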

If either res or cleanup fails, we get the respective single error; if both fail, we get something like:

myprog: multiple errors: 
    spawn: fooprog: No such file or directory
    delete: barsock: No such file or directory
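
A minimal sketch of what such a combinator could look like, using only the standard library rather than this crate's types. The names MultiError, AsWellAs and demo are hypothetical, and the sketch assumes both results share one error type E:

```rust
use std::fmt;

/// Hypothetical error type: either one underlying error or several.
#[derive(Debug)]
pub enum MultiError<E> {
    Single(E),
    Multiple(Vec<E>),
}

impl<E: fmt::Display> fmt::Display for MultiError<E> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MultiError::Single(e) => write!(f, "{}", e),
            MultiError::Multiple(errs) => {
                write!(f, "multiple errors:")?;
                for e in errs {
                    write!(f, "\n    {}", e)?;
                }
                Ok(())
            }
        }
    }
}

impl<E: fmt::Debug + fmt::Display> std::error::Error for MultiError<E> {}

/// Hypothetical extension trait that adds `as_well_as` to `Result`.
pub trait AsWellAs<T, E> {
    /// Keeps the success value of `self`, but reports the errors of both
    /// `self` and `other` if any occurred.
    fn as_well_as<U>(self, other: Result<U, E>) -> Result<T, MultiError<E>>;
}

impl<T, E> AsWellAs<T, E> for Result<T, E> {
    fn as_well_as<U>(self, other: Result<U, E>) -> Result<T, MultiError<E>> {
        match (self, other) {
            (Ok(value), Ok(_)) => Ok(value),
            (Err(e), Ok(_)) | (Ok(_), Err(e)) => Err(MultiError::Single(e)),
            (Err(a), Err(b)) => Err(MultiError::Multiple(vec![a, b])),
        }
    }
}

/// Reproduces the "both fail" case with plain `String` errors.
pub fn demo() -> String {
    let res: Result<(), String> =
        Err("spawn: fooprog: No such file or directory".to_string());
    let cleanup: Result<(), String> =
        Err("delete: barsock: No such file or directory".to_string());
    match res.as_well_as(cleanup) {
        Ok(()) => String::new(),
        Err(e) => format!("myprog: {}", e),
    }
}
```

With a flat Vec, nesting one MultiError inside another is not represented; that is where a tree of causes would come in.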

Glancing at the API, it seems to me that one option would be to change cause and find_root_cause to return iterators instead of an Option and a single value, respectively, leading to a tree of causes.

(Actually, that implementation is a better semantic match in situations where there are multiple ways in which an operation could succeed (e.g. multiple known config file paths), each is tried, and all of them fail. I don't think it's necessary to distinguish "one of" vs. "all of" for error reporting purposes, though, and having downcast_ref etc. return iterators as well smells like overkill.)
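
To make the "tree of causes" idea concrete, here is a rough sketch under stated assumptions. CauseTree and its methods are hypothetical stand-ins, not this crate's API: causes yields every direct cause, and find_root_causes collects all leaves of the tree. The config file paths in demo_roots are made up for illustration.

```rust
/// Hypothetical error node: a message plus zero or more direct causes.
pub struct CauseTree {
    pub message: String,
    pub causes: Vec<CauseTree>,
}

impl CauseTree {
    /// An error without any underlying cause.
    pub fn leaf(message: &str) -> Self {
        CauseTree { message: message.to_string(), causes: Vec::new() }
    }

    /// An error that wraps one or more underlying causes.
    pub fn node(message: &str, causes: Vec<CauseTree>) -> Self {
        CauseTree { message: message.to_string(), causes }
    }

    /// Iterator over the direct causes (instead of at most one cause).
    pub fn causes(&self) -> std::slice::Iter<'_, CauseTree> {
        self.causes.iter()
    }

    /// All root causes, i.e. the leaves of the tree, in left-to-right order.
    pub fn find_root_causes(&self) -> Vec<&CauseTree> {
        let mut roots = Vec::new();
        let mut stack = vec![self];
        while let Some(node) = stack.pop() {
            if node.causes.is_empty() {
                roots.push(node);
            } else {
                // Push in reverse so the leftmost cause is popped first.
                stack.extend(node.causes().rev());
            }
        }
        roots
    }
}

/// The "all config file paths failed" case from the text above.
pub fn demo_roots() -> Vec<String> {
    let err = CauseTree::node(
        "no usable config file",
        vec![
            CauseTree::leaf("/etc/myprog.toml: No such file or directory"),
            CauseTree::node(
                "cannot read user config",
                vec![CauseTree::leaf("Permission denied")],
            ),
        ],
    );
    err.find_root_causes()
        .iter()
        .map(|cause| cause.message.clone())
        .collect()
}
```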

@yoshuawuyts

Problem Reformulation

Thanks for posting this! It took me a little bit to realize what your suggestion was. Perhaps others might have a similar experience, so please allow me to attempt to re-articulate the main question your proposal revolves around:

If two errors happen concurrently, how can we avoid losing information when only one error can be bubbled up to the caller? How can we include information about both errors?

In your example we want to handle errors caused by spawn_wait_child, but always call the (fallible) cleanup method for the socket (e.g. std::os::unix::net::UnixStream::shutdown) regardless of success. This means there's a scenario where both might fail, and we'd like to include both failures in the error handling.

Opinion

Personally, I'm sympathetic to this idea. If multiple errors can happen in parallel, it seems reasonable to want to report all failures.

However, there is an argument for allowing only a single error to be bubbled up: if one task errors out and the other task hangs or takes a long time to fail, there is a chance that all error information is lost. This is why it seems to be common practice in concurrent systems to introduce a side channel for logging errors without bubbling them all the way back up.

I'm not sure what the right response here is. I feel that the intersection of concurrency & error handling, and by extension concurrency & fallible destructors, is somewhat underexplored. I'm not sure what the best patterns are, but I think it would be a great idea to talk more about this!

@KevinMGranger

Just to clarify, I don't think the original post's example was a parallel failure, so the concerns about handling parallel failures wouldn't necessarily apply to it.

@yoshuawuyts

@KevinMGranger Ah yes, my bad. You're absolutely right. Parallelism is a related use case that I'm introducing, but separate from the example in the initial post.

@ksf

ksf commented Mar 30, 2020

The behaviour you'd want with parallel error handling can vary wildly, so I wouldn't be worried about providing actual behaviour, just mechanism. With multi-errors you could e.g. report the two errors you got, and a third one saying "channel XYZ timed out, status unknown", or "operation aborted before channel XYZ reported back, error reporting might be incomplete", or whatever is opportune for the application.
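
As a rough illustration of "mechanism, not behaviour", the sketch below gathers reports from several channels and adds a synthetic entry for any channel that stays silent. It uses std::sync::mpsc from the standard library; collect_reports, demo_parallel, the Report alias and the "disk full" message are all hypothetical, and the policy (one timeout per channel, checked in sequence) is only one of many an application might pick.

```rust
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical report type sent by each task.
type Report = Result<(), String>;

/// Waits up to `timeout` on each channel in turn and returns every error,
/// including a synthetic one for channels that never reported back.
pub fn collect_reports(
    channels: Vec<(&str, Receiver<Report>)>,
    timeout: Duration,
) -> Vec<String> {
    let mut errors = Vec::new();
    for (name, rx) in channels {
        match rx.recv_timeout(timeout) {
            Ok(Ok(())) => {}
            Ok(Err(e)) => errors.push(e),
            Err(RecvTimeoutError::Timeout) => {
                errors.push(format!("channel {} timed out, status unknown", name))
            }
            Err(RecvTimeoutError::Disconnected) => {
                errors.push(format!("channel {} closed without reporting", name))
            }
        }
    }
    errors
}

/// Task A fails, task B succeeds, and channel XYZ never reports.
pub fn demo_parallel() -> Vec<String> {
    let (tx_a, rx_a) = mpsc::channel::<Report>();
    let (tx_b, rx_b) = mpsc::channel::<Report>();
    // The sender is kept alive but never used, so the receiver times out.
    let (_tx_xyz, rx_xyz) = mpsc::channel::<Report>();

    let a = thread::spawn(move || tx_a.send(Err("task A: disk full".to_string())));
    let b = thread::spawn(move || tx_b.send(Ok(())));
    a.join().unwrap().unwrap();
    b.join().unwrap().unwrap();

    collect_reports(
        vec![("A", rx_a), ("B", rx_b), ("XYZ", rx_xyz)],
        Duration::from_millis(50),
    )
}
```

The resulting list could then be wrapped in whatever multi-error type the application uses.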

(If an issue is still open, is it technically not necromancy?)
