Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add External variant to ParquetError #3285

Closed
Cheappie opened this issue Dec 7, 2022 · 7 comments · Fixed by #3574
Closed

Add External variant to ParquetError #3285

Cheappie opened this issue Dec 7, 2022 · 7 comments · Fixed by #3574
Labels
enhancement Any new improvement worthy of a entry in the changelog parquet Changes to the parquet crate

Comments

@Cheappie
Copy link
Contributor

Cheappie commented Dec 7, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

It would be nice to have External(Box<dyn error::Error + Send + Sync>) variant in ParquetError, because right now I cannot bubble up errors with preserving source through existing ParquetError variants that only accepts either String or usize.

If we would add such variant, then impl std::error::Error for ParquetError { ... } should bubble up cause for External variant.

@Cheappie Cheappie added the enhancement Any new improvement worthy of a entry in the changelog label Dec 7, 2022
@Cheappie
Copy link
Contributor Author

Cheappie commented Dec 7, 2022

First attempt has failed, that modification requires more changes that I thought due to primitive String/usize based errors equality

@tustvold
Copy link
Contributor

tustvold commented Dec 7, 2022

Perhaps we could roll this into #2725

@Cheappie
Copy link
Contributor Author

Cheappie commented Dec 8, 2022

Sure that would make sense

@Cheappie
Copy link
Contributor Author

Cheappie commented Dec 9, 2022

I have seen that anyhow crate is being considered. In past I have been exploring if there is any easy way to recover type information without downcasting in caller and actually it is possible by preserving root type id. But I am not sure if there are any corner case around that solution, so please treat it as fun fact.

use anyhow::anyhow;
use std::any::TypeId;
use std::fmt::{Debug, Display, Formatter};
use std::io::ErrorKind::{NotFound, UnexpectedEof};
use std::marker::PhantomData;

pub struct ParquetError {
    root_type_id: TypeId,
    error: anyhow::Error,
}

impl ParquetError {
    pub fn context<C: Display + Send + Sync + 'static>(self, context: C) -> ParquetError {
        let err_with_ctx = self.error.context(context);
        ParquetError {
            root_type_id: self.root_type_id,
            error: err_with_ctx,
        }
    }
}

impl Display for ParquetError {
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        Display::fmt(&self.error, f)
    }
}

impl Debug for ParquetError {
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        Debug::fmt(&self.error, f)
    }
}

impl std::error::Error for ParquetError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        Some(self.error.as_ref())
    }
}

///
/// Known errors
///
pub enum Error<'a> {
    IO(&'a std::io::Error),
    Other(&'a dyn std::error::Error),
}

impl ParquetError {
    pub fn root_cause(&self) -> Error<'_> {
        let root = self.error.root_cause();

        if self.root_type_id == TypeId::of::<std::io::Error>() {
            Error::IO(root.downcast_ref::<std::io::Error>().expect("infallible"))
        } else {
            Error::Other(root)
        }
    }
}

macro_rules! parquet_err {
    ($err:expr) => {{
        let root_type_id = capture_type_id(&$err);
        let error = anyhow!($err);
        ParquetError {
            root_type_id,
            error,
        }
    }};
}

fn capture_type_id<E: 'static>(_: &E) -> TypeId {
    TypeId::of::<E>()
}

#[test]
fn test() {
    let err = parquet_err!(std::io::Error::new(NotFound, "Not found"));

    // easy way to access known errors, without need to repeat downcast_ref
    // match err.root_cause() {
    //     Error::IO(io) => {
    //         println!("Some io error; {:?}", io);
    //     }
    //     Error::Other(other) => {
    //         println!("Other error; {:?}", other);
    //     }
    // }

    println!(
        "{:?}",
        err.context("When looking for")
            .context("Something important")
            .context("Unable to locate parquet metadata")
    );
}

@tustvold
Copy link
Contributor

tustvold commented Dec 9, 2022

I believe anyhow supports this natively with

https://docs.rs/anyhow/latest/anyhow/struct.Error.html#method.root_cause

Additionally the way it handles downcasting in general is if the target type appears anywhere in the chain, so it is agnostic to any additional context added https://docs.rs/anyhow/latest/anyhow/trait.Context.html#effect-on-downcasting

Its a pretty well thought out crate imo 😀

@Cheappie
Copy link
Contributor Author

Cheappie commented Dec 9, 2022

Sure It has that kind of support. My point was It would be interesting pattern if libraries would provide enumeration of known errors that is hidden behind dynamic error that allows enriching context. Anyhow on It's own doesn't know what type of errors we might produce, so any caller that would receive anyhow::Error has to probe It with several downcasts whether it is io err or malformed url error. In the above example we provide enumeration enum Error<'a> { IO(&'a std::io::Error) }, so downcasts are actually hidden and caller can jump into dealing with error right away.

The macro is a boilerplate, anyhow error needs to be created at the call site that, to capture backtrace correctly (at the call site, not in e.g. ParquetError::new(...)).

It is interesting that downcast might match any node in the chain that match provided type. From my perspective the most important is the root cause, because It tells why the error has happened. Anything in between is just an additional context.

tustvold added a commit to tustvold/arrow-rs that referenced this issue Jan 20, 2023
@tustvold tustvold added the parquet Changes to the parquet crate label Jan 27, 2023
@tustvold
Copy link
Contributor

label_issue.py automatically added labels {'parquet'} from #3286

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Any new improvement worthy of a entry in the changelog parquet Changes to the parquet crate
Projects
None yet
2 participants