Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for error source iterators #58520

Open
sfackler opened this issue Feb 16, 2019 · 62 comments
Open

Tracking issue for error source iterators #58520

sfackler opened this issue Feb 16, 2019 · 62 comments
Labels
A-error-handling Area: Error handling B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. PG-error-handling Project group: Error handling (https://github.com/rust-lang/project-error-handling) T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@sfackler
Copy link
Member

This covers the <dyn Error>::iter_chain and <dyn Error>::iter_sources methods added in #58289.

@sfackler sfackler added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. labels Feb 16, 2019
@withoutboats
Copy link
Contributor

My preference before stabilizing would be to make changes to make it more like the original failure design, which I still prefer. There should be one iterator, it should begin from this error type, and its name should be sources. If you want all the sources of error less this layer, you can write .skip(1).

@TimDiekmann
Copy link
Member

I like the idea of a convenient interface to iterate sources. However, the interface feels unergonomic as mentioned in #58289 (comment). Additionally I think this should not be stabilized before #53487

@haraldh
Copy link
Contributor

haraldh commented Feb 18, 2019

My preference before stabilizing would be to make changes to make it more like the original failure design, which I still prefer. There should be one iterator, it should begin from this error type, and its name should be sources. If you want all the sources of error less this layer, you can write .skip(1).

Yeah, my first iteration (pun) had only .iter(), but then @sfackler requested the other iter_*()

@faern
Copy link
Contributor

faern commented Mar 29, 2019

These iterators are really good. I have long relied on the display_chain method in error-chain. As I move away from that lib I find it cumbersome to transform my entire chain of errors into a good string representation.

With these iterators I can get the equivalent of display_chain this way:

let display_chain: String = error
    .iter_sources()
    .fold(format!("Error: {}", error), |mut s, e| {
        s.push_str(&format!("\nCaused by: {}", e));
        s
    });

These iterators does not solve the entire problem. But they make it a bit easier to implement.

@haraldh
Copy link
Contributor

haraldh commented May 31, 2019

So, any changes needed here? I really want this to be stabilized

@withoutboats
Copy link
Contributor

withoutboats commented Jul 11, 2019

I think a few changes are needed before this can be stabilized:

  1. First, we need to actually reach a consensus about whether we have both of these APIs. I've been consistent that I think the changes made to failure regarding this issue since I stopped maintaining it are a mistake: we should have one iterator which begins with self and calls cause until it returns None. If users want to skip the self value, its just as simple as adding .skip(1) to that iterator. No one else has really expressed an opinion about that question.
  2. Even if we do have both, I think the names right now are not well-chosen. It would be better to change them to something like .errors() and .sources() (or keep one and just call it one of these two names). We tend to avoid the name iter except for iterating through elements of a collection, modified by its ownership type; for iterators which are semantic values pulled out of a type, we prefer just using the name of that kind of value (as in lines, keys, chars, etc).
  3. We need to decide what to do about the trait object vs provided method problem, discussed on the PR. This involves the struggle around the fact that Box<Error> doesn't implement Error. I think the best solution available to us now is to just have the method repeated in both places.

@withoutboats withoutboats added the A-error-handling Area: Error handling label Jul 12, 2019
@derekdreery
Copy link
Contributor

derekdreery commented Jul 14, 2019

my 2¢: I think the simplest (for users) implementation is a single iter method, with a note in the documentation that

  1. The iterator length is always >= 1, that is, you can call iter.next().unwrap() once without causing a panic.
  2. If you just want the sources, use e.iter().skip(1).
  3. An example like
    fn print_error(e: impl Error) { // or Box<Error> or whatever other actual type this is implemented for
      let mut chain = e.iter();
      eprintln!("error: {}", chain.next().unwrap());
      for source in chain {
        eprintln!("caused by {}", source);
      }

This leaves the option open of an iter_sources method later if it's decided that it's really needed (that can just be implemented as iter().skip(1)).

@BurntSushi
Copy link
Member

I agree with @withoutboats and @derekdreery. I had to carefully read the docs for each method before I figured out how they differed. Moreover, I suspect it will be quite difficult to pick names for both methods that are easy to remember and distinguish. Having one method and requiring the caller to use skip(1) if they don't want the current error seems much nicer IMO.

@withoutboats
Copy link
Contributor

It does seem that many users found the terminology causes confusing for an iterator that returns self (myself, it seemed intuitive, if the documentation says so, that this error can be considered a member of the chain of causality). So a better name than sources might be preferred here. However, both of the other options I can think of are also imperfect:

  • I think iter violates our naming convention, in that I think iter must return the same iterator returned by <&T as IntoIterator>::into_iter, and I don't expect that impl to exist for error types.
  • Error::errors does not seem great either, just because it can be confusing to say that this error contains many errors.

As I said previously, I don't think names like iter_chain are a good choice either, they're pretty far outside of our general style for iterator names.

@faern
Copy link
Contributor

faern commented Jul 14, 2019

I agree causes/sources would be confusing if the iterator contains the error it's called on. In err.causes() I want the errors that caused err. err did not cause itself, nor is it part of the sources for this error. It would not be consistent with the existing err.source() which returns the first error under err not err itself.

I do fully agree that err is part of the chain of causality however, which is why I think something like chain would work.

@derekdreery
Copy link
Contributor

derekdreery commented Jul 15, 2019

@withoutboats

I think iter violates our naming convention, in that I think iter must return the same iterator returned by <&T as IntoIterator>::into_iter, and I don't expect that impl to exist for error types.

Is this set in stone. I think iter is the best name, because it feels familiar, in that the iter() method should return an iterator that makes most sense for the object. I wouldn't expect FromIterator to be implemented for &Error, but if it were to be implemented then it should look like this (i.e. include the base error).

Could the naming convention be changed to "iter must return the same iterator returned by <&T as IntoIterator>::into_iter, or if this is not implemented, it must return the canonical iterator for the object (implying that such a canonical iterator must exist)".

@derekdreery
Copy link
Contributor

To give some more context on why I would like it to be called iter:

When I'm reading the documentation for some object with methods, when I see an iter method I think to myself "what iterator makes most sense for this object". So for a vector I would assume it returns refs to the elements, or for a linked list (which is what our error chain is essentially), I would expect it to walk the nodes and return a node per iteration. I don't think "how is FromIterator implemented for the reference type for this object". Maybe I should be thinking that :P

@haraldh
Copy link
Contributor

haraldh commented Jul 30, 2019

I think iter violates our naming convention, in that I think iter must return the same iterator returned by <&T as IntoIterator>::into_iter, and I don't expect that impl to exist for error types.

Hmm, this works:

impl<'a> IntoIterator for &'a (dyn Error + 'static) {
    type Item = &'a (dyn Error + 'static);
    type IntoIter = ErrorIter<'a>;

    fn into_iter(self) -> Self::IntoIter {
        ErrorIter {
            current: Some(self),
        }
    }
}
let mut iter = <&(dyn Error + 'static) as IntoIterator>::into_iter(&err);

or

for e in err.borrow() as &(dyn Error) {
    eprintln!("{}", e);
}

and iter_chain() (or iter() how I would call it) is implemented for dyn Error

@withoutboats
Copy link
Contributor

@haraldh We can add that impl, but I don't think we should (and it seems completely circular to add it just to justify naming the method iter)

@haraldh
Copy link
Contributor

haraldh commented Aug 1, 2019

@haraldh We can add that impl, but I don't think we should ...

Agreed

@teiesti
Copy link
Contributor

teiesti commented Sep 4, 2019

I'd like to point to the issues discussing the addition (#48420) and stabilization (#50894) of std::path::Path::ancestors. At first glance this API might seem unrelated but the situation back then was quite similar:

  1. The objective was to add an iterator that recursively calls an existing, well-known method.
  2. There was a discussion if self should be included.
  3. There was a lengthy discussion about the name.

In the end, the rationale was:

  1. Such iterators are helpful. They should better be stabilized sooner than later.
  2. self should be included. It is easy to .skip(1) it. Not including self is harmful because it is harder to add self to the iterator than to remove it. (I think, it is not a good idea to add two APIs just to cover both semantics because that will clutter both, the namespace and the human mind.)
  3. The chosen name should be telling and reflect the fact that self is included. Adding an "s" to the original method name is a good to start unless that leads to unwanted associations which, I think, is often the case. (I think, .iter() is not a good choice, because it is natural to iterate over vector items but it is not natural to iterate over error sources. In theory, there may be other properties of an Error that are a more or equally natural subject to iteration, e.g., descriptions provided in different languages. In particular, a type implementing Error might also want to implement a method called .iter() for whatever reason.)

I'd like to throw two naming ideas into the ring that might be considered for further debate:

  • .ancestors(), because the iterator iterates over the ancestors in a ancestry of errors which are cause to their respective children.
  • .chained() in honor of error-chain and because the iterator iterates over the chain of errors that is somehow included in self.

I'd also like to propose to rename the ErrorIter struct. I think, it is good practice throughout the standard library to name the struct after the method creating it (example).

@ehuss
Copy link
Contributor

ehuss commented Oct 7, 2019

Sorry if this was brought up earlier, I tried to read the RFC/issues, and didn't see it brought up.

What is the intent of how to handle older code that overrides cause but not source? For example, IntoStringError only implements cause. If you iterate over sources (iter_sources), then the cause is lost. In this case it is the Utf8Error which provides useful detail like invalid utf-8 sequence of 1 bytes from index 0 .

Should these errors be changed to implement source instead of cause? Or maybe implement both?

@faern
Copy link
Contributor

faern commented Oct 7, 2019

IMO all new errors should only implement source. All old code that want to stay relevant and usable should be updated to implement source as well. I do not think we should focus the development of new features in libstd to cover deprecated items.

@derekdreery
Copy link
Contributor

IMO all new errors should only implement source. All old code that want to stay relevant and usable should be updated to implement source as well. I do not think we should focus the development of new features in libstd to cover deprecated items.

If we were to do this, could we make an effort to submit PRs to popular crates using cause.

More generally (and slightly OT), is there a mechanism for the community to help crate maintainers with stuff like this?

@haraldh
Copy link
Contributor

haraldh commented Oct 18, 2019

I'd like to point to the issues discussing the addition (#48420) and stabilization (#50894) of std::path::Path::ancestors. At first glance this API might seem unrelated but the situation back then was quite similar:

1. The objective was to add an iterator that recursively calls an existing, well-known method.

2. There was a discussion if `self` should be included.

3. There was a lengthy discussion about the name.

In the end, the rationale was:

1. Such iterators are helpful. They should better be stabilized sooner than later.

2. `self` should be included. It is easy to `.skip(1)` it. Not including `self` is harmful because it is harder to add `self` to the iterator than to remove it. (I think, it is not a good idea to add two APIs just to cover both semantics because that will clutter both, the namespace and the human mind.)

3. The chosen name should be telling and reflect the fact that `self` is included. Adding an "s" to the original method name is a good to start unless that leads to unwanted associations which, I think, is often the case. (I think, `.iter()` is not a good choice, because it is natural to iterate over vector items but it is not natural to iterate over error sources. In theory, there may be other properties of an `Error` that are a more or equally natural subject to iteration, e.g., descriptions provided in different languages. In particular, a type implementing `Error` might also want to implement a method called `.iter()` for whatever reason.)

I'd like to throw two naming ideas into the ring that might be considered for further debate:

* `.ancestors()`, because the iterator iterates over the ancestors in a ancestry of errors which are cause to their respective children.

* `.chained()` in honor of `error-chain` and because the iterator iterates over the chain of errors that is _somehow_ included in `self`.

I'd also like to propose to rename the ErrorIter struct. I think, it is good practice throughout the standard library to name the struct after the method creating it (example).

Opened a PR with this suggestion: #65557

@haraldh
Copy link
Contributor

haraldh commented Oct 21, 2019

So, it seems, that even Path::ancestors() includes itself. So, to avoid confusion and simplify it more, I reduced PR #65557 to only have chained and Chain.

Rationale:

  1. Such iterators are helpful. They should better be stabilized sooner than later.
  2. self should be included. It is easy to .skip(1) it. Not including self is harmful because it is harder to add self to the iterator than to remove it.
  3. The chosen name should be telling and reflect the fact that self is included. .chained() was chosen in honor of error-chain and because the iterator iterates over the chain of errors that is somehow included in self.
  4. The resulting iterator is named Chain because the error::Chain is what we want to have.

@derekdreery
Copy link
Contributor

derekdreery commented Oct 21, 2019

I like the name chained because it indicates that it includes the current error, but connects all the others.

The docs should mention the skip(1) way of avoiding the original error. (This is currently the case in the PR 323f6a4)

@bbqsrc
Copy link

bbqsrc commented Sep 2, 2020

Anything blocking this from stabilization?

@haraldh
Copy link
Contributor

haraldh commented Sep 2, 2020

see #58520 (comment) and following

especially #69161 is annoying

@haraldh
Copy link
Contributor

haraldh commented Sep 14, 2020

Stabilization Report

The current implementation of error source iterators is only implemented on dyn Error as chain(), which returns an iterator beginning with self instead of self.source() (which was discussed here).

This causes an un-ergonomic usage on non &dyn Error types:

    let mut iter = (&b as &(dyn Error)).chain();
    // or
    let mut iter = <dyn Error>::chain(&b);
    // or
    let mut iter = Error::chain(&b);

Solutions to the un-ergonomic usage

Implement chain() on the Error trait

An additional implementation on trait Error could be done via:

pub trait Error: Debug + Display {
    // […]
    fn chain(&self) -> Chain<'_> where Self: Sized + 'static {
        Chain {
            current: Some(self),
        }
    }
}

but that leads to issue #69161

Add an ErrorChain trait

See #58520 (comment) and PR #69163

Remove chain() and add sources() instead

Returning an iterator not starting with self but self.source() can be done in the Error trait directly and so it is implemented for all dyn Error also.

pub trait Error: Debug + Display {
    // […]
    fn sources(&self) -> Chain<'_> {
        Chain {
            current: self.source(),
        }
    }
}

@haraldh
Copy link
Contributor

haraldh commented Sep 14, 2020

@haraldh
Copy link
Contributor

haraldh commented Feb 2, 2021

So, if nobody objects, I will open a PR with the "Remove chain() and add sources() instead" solution, as nothing else provides an ergonomic solution.

@haraldh
Copy link
Contributor

haraldh commented Feb 3, 2021

Here we go: #81705

haraldh added a commit to haraldh/rust-1 that referenced this issue Feb 3, 2021
To produce an error iterator `std::error::Chain` one had to call
`<dyn Error>::chain()`, which was not very ergonomic, because you have to
manually cast to a trait object, if you didn't already have one with
the type erased.

```
    let mut iter = (&my_error as &(dyn Error)).chain();
    // or
    let mut iter = <dyn Error>::chain(&my_error);
    // or
    let mut iter = Error::chain(&my_error);
```

The `chain()` method can't be implemented on the Error trait, because of
rust-lang#69161

`Chain::new()` replaces `<dyn Error>::chain()` as a good alternative
without confusing users, why they can't use `my_error.chain()` directly.

The `Error::sources()` method doesn't have this problem, so implement
it more efficiently, than one could achieve this with `Chain::new().skip(1)`.

Related: rust-lang#58520
@KodrAus
Copy link
Contributor

KodrAus commented Feb 4, 2021

I think @haraldh touched on a reason to consider both a method that iterates over self and its sources and one that just includes its sources: you can call the latter on any error, but can only call the former on an error you can cast into dyn Error + 'static:

// Valid for any `&E: Error + ?Sized`
fn sources(&self) -> Chain<'_> {}

// Valid only for `&E: Error + Sized + 'static`
fn chain(&self) -> Chain<'_> where Self: Sized + 'static {}

So you could write a maximally compatible method to display an error using sources:

fn render(err: impl Error) {
    fn render_inner(err: impl Error) { .. }

    render_inner(&err);

    for source in err.sources() {
        render_inner(source);
    }
}

but you couldn't write that same implementation using chain unless you also constrain the input error to 'static.

You could argue that's a bit of a forced difference though, because in practice I think non-'static errors aren't very useful anyway since they can't participate in the source chain, but currently unless you wrap a Box<dyn Error + 'static> up into a newtype that implements Error the only impl Error you can get from it is non-'static.

@notgull
Copy link

notgull commented Jul 2, 2021

Shouldn't the iterator proper implement FusedIterator?

@dtolnay
Copy link
Member

dtolnay commented Jul 3, 2021

Shouldn't the iterator proper implement FusedIterator?

That's not obvious. There is no documented requirement on https://doc.rust-lang.org/1.53.0/std/error/trait.Error.html#method.source that it behave in a fused way.

@notgull
Copy link

notgull commented Jul 3, 2021

Shouldn't the iterator proper implement FusedIterator?

That's not obvious. There is no documented requirement on https://doc.rust-lang.org/1.53.0/std/error/trait.Error.html#method.source that it behave in a fused way.

Apologies if I'm misinterpreting something, but once the Error object returns a None, that's the end of the error chain, isn't it? You can't get another &(dyn Error + 'static) out of that. I guess you could create an infinite loop, but ending is not required for fused iterators.

@mathstuf
Copy link
Contributor

mathstuf commented Jul 3, 2021

Nothing says the error can't change its mind about what its source is:

impl Error for SillyError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        if rand::int() == 1 { Some(&self.source) } else { None }
    }
}

@notgull
Copy link

notgull commented Jul 3, 2021

Nothing says the error can't change its mind about what its source is:

impl Error for SillyError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        if rand::int() == 1 { Some(&self.source) } else { None }
    }
}

Is it a requirement for fused iterators to not involve randomness? All I'm saying is that std::error::Chain could implement FusedIterator, since, unless there are plans to change the way that it is implemented, it could get a marginal win in the albeit rare case where one passes it into a function that fuses it.

@mathstuf
Copy link
Contributor

mathstuf commented Jul 3, 2021

Is it a requirement for fused iterators to not involve randomness?

I guess my example might still not be enough as std::error::Chain will assume it's over. The rule for FusedIterator is that once None is returned, None will always be returned on subsequent .next() calls. I suppose whether the last Error is delegated to or if there is some implicit FusedIterator assumption in std::error::Chain would answer this question.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 28, 2022
Some papercuts on error::Error

Renames the chain method, since I chain could mean anything and doesn't refer to a chain of sources (cc rust-lang#58520) (and adds a comment explaining why sources is not a provided method on Error). Renames arguments to the request method from `req` to `demand` since the type is `Demand` rather than Request or Requisition.

r? `@yaahc`
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 28, 2022
Some papercuts on error::Error

Renames the chain method, since I chain could mean anything and doesn't refer to a chain of sources (cc rust-lang#58520) (and adds a comment explaining why sources is not a provided method on Error). Renames arguments to the request method from `req` to `demand` since the type is `Demand` rather than Request or Requisition.

r? ``@yaahc``
@nrc
Copy link
Member

nrc commented Aug 30, 2022

In #100955, I renamed chain to sources, it still includes self in the iterator. I added a comment to try and record the issue with why source must be defined in the impl and not the trait. In terms of solutions, this should be fixed by the dyn* work which will permit more methods on trait objects. The other possible solution is to remove self from the iterator (this is a solution because the problem is in converting self to a trait object, which might not work if self is itself a wide pointer which is not a trait object, all the other sources are already trait objects so skipping self means no conversion has to happen). Adding an Unsize bound is not a solution because its not backwards compatible.

@jonhoo
Copy link
Contributor

jonhoo commented Sep 23, 2022

I recently discovered that io::Error doesn't return custom inner errors through Error::source (#101817). I wonder whether we should work around that for iter_chain and iter_sources so that they know how to walk through (and continue after) io::Error::Custom as well?

@zopsicle
Copy link
Contributor

Solutions to the un-ergonomic usage

Another solution is to make sources a function in the module core::error, rather than a method on dyn Error. One would write use std::error; and then error::sources(&error).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-error-handling Area: Error handling B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. PG-error-handling Project group: Error handling (https://github.com/rust-lang/project-error-handling) T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests