Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

It should be possible to skip units with unknown versions #562

Open
philipc opened this issue Jun 15, 2021 · 5 comments
Open

It should be possible to skip units with unknown versions #562

philipc opened this issue Jun 15, 2021 · 5 comments

Comments

@philipc
Copy link
Collaborator

philipc commented Jun 15, 2021

When parsing units in various sections, we return an error if the version is unknown, which terminates iteration. This means we operate poorly with unknown DWARF versions (e.g this occurs for .debug_aranges in #559). I think instead we should allow users to write a loop which only terminates for an error that prevents further iteration.

This is how you currently iterate over units, and I don't see how to modify this to allow skipping some or all errors (which is expected, because that was the point of fallible iterators):

let mut units = dwarf.units();
while let Some(unit) = units.next()? {
    ...
}

Instead, I think we need to allow these:

// Continue iterating after any error
for unit in dwarf.units().filter_map(Result::ok) {
    ...
}

// Continue iterating after unknown version
for unit in dwarf.units() {
    match unit {
        Ok(unit) => { ... },
        Err(Error::UnknownVersion(_)) => {},
        Err(e) => return Err(e),
    }
}

This is using a normal Iterator instead of FallibleIterator. Note that this change is only required for iterators that can continue iterating after an error, so it would not be required for DebuggingInformationEntry::attrs(), which is where we first started using fallible iterators. So this would be a reversal of the decision in #44.

Alternatively, we could silently skip units with unknown versions. I don't think this is ideal, but maybe this is ok if this is the behaviour that most users want.

This would be a large breaking change. Thoughts @fitzgen ?

@fitzgen
Copy link
Member

fitzgen commented Jun 15, 2021

We could basically do this without the breaking changes by not having DebugInfoUnitHeadersIter (etc...) fused such that it calls self.input.empty() when it hits an unknown version error, right? And then we would document that these particular errors are recoverable / don't invalidate the iterator.

@philipc
Copy link
Collaborator Author

philipc commented Jun 15, 2021

I think it should only fuse if it fails to read the unit length (I'll check later that this is what other DWARF consumers do), in which case we would document that you should iterate until Ok(None) is returned.

So if we did that, to correctly iterate you would have to do:

// Continue iterating after any error
let mut units = dwarf.units();
while let Some(unit) = units.next().transpose() {
    match unit {
        Ok(unit) => { ... }
        Err(_) => {}
    }
}

// Continue iterating after unknown version
let mut units = dwarf.units();
while let Some(unit) = units.next().transpose() {
    match unit {
        Ok(unit) => { ... }
        Err(Error::UnknownVersion(_)) => {}
        Err(e) => return Err(e),
    }
}

This avoids the breaking change, but the downside is that all existing users are continuing to do the wrong thing without warning.

Another alternative: adding impl Iterator isn't as large a breaking change as I thought. It may require users to disambiguate between FallibleIterator and Iterator but I think that's all? So maybe it's enough to add impl Iterator and deprecate DebugInfoUnitHeadersIter::next (but we can't deprecate the impl FallibleIterator).

@philipc
Copy link
Collaborator Author

philipc commented Jun 16, 2021

I'll check later that this is what other DWARF consumers do

Other DWARF consumers don't handle this either, so looks like this issue is a low priority. I looked at llvm, gdb, and libbacktrace. Common behaviour when parsing .debug_info is to stop parsing the section at the first error of any kind, but only libbacktrace treats unknown versions as an error. For version > 5, gdb prints an error and then treats it as version 5. llvm simply treats it as version 5.

@khuey
Copy link
Contributor

khuey commented Jul 13, 2021

You can replace your while loop with

loop {
    let unit = match iter.next() {
        Ok(Some(unit)) => unit,
        Ok(None) => break,
        Err(Error::UnknownVersion(_)) => continue,
        Err(e) => return Err(e),
    };
    // do work
}

I don't think there's any need to abandon FallibleIterator.

@philipc
Copy link
Collaborator Author

philipc commented Jul 13, 2021

These loops are all too complicated compared to what existing code uses. The easiest thing to do needs to be the right thing. Currently the easiest thing is to stop on the first error, and that's what other DWARF consumers do too, so it's probably good enough. If we change the fusing, it will be possible to skip errors if someone really wanted to, but it still wouldn't be what most consumers do, and so I'm not planning to change that until someone wants it. Alternatively, if someone wants to argue that skipping errors is definitely the right thing, then we need to consider how we could make that the easiest thing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants