Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seal ArrowNativeType and OffsetSizeTrait for safety (#1028) #1819

Merged
merged 2 commits into from Jun 8, 2022

Conversation

tustvold
Copy link
Contributor

@tustvold tustvold commented Jun 8, 2022

Which issue does this PR close?

Closes #1028

Rationale for this change

ArrowNativeType is used to indicate "trivially safely transmutable" within the buffer implementations. If client can create their own implementations they can do potentially unwise things.

What changes are included in this PR?

Makes ArrowNativeType sealed

Are there any user-facing changes?

Technically yes, practically I can't think of any valid use-cases this will break.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jun 8, 2022
@alamb alamb self-requested a review June 8, 2022 10:59
@HaoYang670
Copy link
Contributor

Great! This could also seal the OffsetSizeTrait!

@tustvold tustvold changed the title Seal ArrowNativeType (#1028) Seal ArrowNativeType and OffsetSizeTrait (#1028) Jun 8, 2022
@tustvold
Copy link
Contributor Author

tustvold commented Jun 8, 2022

Good point, this will transitively also seal OffsetSizeTrait

@@ -19,6 +19,10 @@ use super::DataType;
use half::f16;
use serde_json::{Number, Value};

mod private {
pub trait Sealed {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could implement the Seal trait in the private mod:

mod private {
    pub trait Sealed {}
    impl Sealed for i8 {}
    impl Sealed for i16 {}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer implementing it with the rest of the implementations for the type. It makes it clearer why it is implemented, and theoretically we may macroify it later

@alamb alamb added the api-change Changes to the arrow API label Jun 8, 2022
@@ -19,6 +19,10 @@ use super::DataType;
use half::f16;
use serde_json::{Number, Value};

mod private {
pub trait Sealed {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add some docstrings about why it is not allowed to ArrowNativeType

Using the text from the PR description "ArrowNativeType is used to indicate "trivially safely transmutable" within the buffer implementations. If client can create their own implementations they can do potentially unwise things."

would be a good starting point

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in a64575e

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR's description says "seal OffsetSizeTrait but I didn't see any change to that trait in https://github.com/tustvold/arrow-rs/blob/seal-arrow-native-type/arrow/src/array/array_list.rs#L34 -- maybe I am missing something

I verified ArrowNativeType can't be extended locally with this program (pointing at an arrow-rs checkout with this branch) ✅

use arrow::datatypes::{ArrowNativeType, JsonSerializable};
use serde_json::{Value};


#[derive(Default, Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct MyAwesomeStruct {}

impl std::str::FromStr for MyAwesomeStruct {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        todo!()
    }
}

impl JsonSerializable for MyAwesomeStruct {
    fn into_json_value(self) -> Option<Value> {
        todo!()
    }
}

impl ArrowNativeType for MyAwesomeStruct {
    fn from_usize(_: usize) -> Option<Self> {
        None
    }

    fn to_usize(&self) -> Option<usize> {
        None
    }

    fn to_isize(&self) -> Option<isize> {
        None
    }

    fn from_i32(_: i32) -> Option<Self> {
        None
    }

    fn from_i64(_: i64) -> Option<Self> {
        None
    }

    fn from_i128(_: i128) -> Option<Self> {
        None
    }
}

pub fn main() {
    let  s = MyAwesomeStruct{};
}

And got the expected error:

error[E0277]: the trait bound `MyAwesomeStruct: datatypes::native::private::Sealed` is not satisfied
  --> src/main.rs:22:6
   |
22 | impl ArrowNativeType for MyAwesomeStruct {
   |      ^^^^^^^^^^^^^^^ the trait `datatypes::native::private::Sealed` is not implemented for `MyAwesomeStruct`
   |
note: required by a bound in `ArrowNativeType`
  --> /Users/alamb/Software/arrow-rs/arrow/src/datatypes/native.rs:44:7
   |
44 |     + private::Sealed
   |       ^^^^^^^^^^^^^^^ required by this bound in `ArrowNativeType`

For more information about this error, try `rustc --explain E0277`.
error: could not compile `rust_arrow_playground` due to previous error

Compilation exited abnormally with code 101 at Wed Jun  8 09:19:45

And it correctly prevents me from extending it:

...

impl arrow::datatypes::native::private::Sealed  for MyAwesomeStruct {
}
...

Yields

-*- mode: compilation; default-directory: "~/Software/rust_arrow_playground/" -*-
Compilation started at Wed Jun  8 09:20:58

cd /Users/alamb/Software/rust_arrow_playground/ &&  RUST_BACKTRACE=1 CARGO_TARGET_DIR=/Users/alamb/Software/df-target cargo run
   Compiling rust_arrow_playground v0.1.0 (/Users/alamb/Software/rust_arrow_playground)
error[E0603]: module `native` is private
  --> src/main.rs:22:24
   |
22 | impl arrow::datatypes::native::private::Sealed  for MyAwesomeStruct {
   |                        ^^^^^^ private module
   |
note: the module `native` is defined here
  --> /Users/alamb/Software/arrow-rs/arrow/src/datatypes/mod.rs:27:1
   |
27 | mod native;
   | ^^^^^^^^^^^

For more information about this error, try `rustc --explain E0603`.
error: could not compile `rust_arrow_playground` due to previous error

Compilation exited abnormally with code 101 at Wed Jun  8 09:20:58

@tustvold
Copy link
Contributor Author

tustvold commented Jun 8, 2022

This PR's description says "seal OffsetSizeTrait but I didn't see any change to that trait in

  • ArrowNativeType is sealed and so cannot be implemented for types outside of arrow-rs
  • OffsetSizeTrait can only be implemented for ArrowNativeType
  • Trait orphan rules prevent implementing OffsetSizeTrait for foreign types outside of arrow-rs

These three things together seal OffsetSizeTrait

impl ArrowNativeType for f64 {}
impl private::Sealed for f64 {}

/// Allows conversion from supported Arrow types to a byte slice.
pub trait ToByteSlice {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ToByteSlice trait would also be a candiate for sealing as it also used for transmuting and should only be used on ArrowNativeTypes and slices of ArrowNativeType.

Copy link
Contributor Author

@tustvold tustvold Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think ToByteSlice assumes transmutability as far as I can tell, although the blanket impl for ArrowNativeType is (which is fine)? Am I missing something?

pub trait ToByteSlice {
    /// Converts this instance into a byte slice
    fn to_byte_slice(&self) -> &[u8];
}

impl<T: ArrowNativeType> ToByteSlice for [T] {
    #[inline]
    fn to_byte_slice(&self) -> &[u8] {
        let raw_ptr = self.as_ptr() as *const T as *const u8;
        unsafe {
            std::slice::from_raw_parts(raw_ptr, self.len() * std::mem::size_of::<T>())
        }
    }
}

impl<T: ArrowNativeType> ToByteSlice for T {
    #[inline]
    fn to_byte_slice(&self) -> &[u8] {
        let raw_ptr = self as *const T as *const u8;
        unsafe { std::slice::from_raw_parts(raw_ptr, std::mem::size_of::<T>()) }
    }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I did not word that right. The question is rather whether we want to support other (external) types to be used for example in MutableBuffer:

pub fn push<T: ToByteSlice>(&mut self, item: T)

It's probably fine as it would only be unsound if the trait impl does something explicitly unsound.

@tustvold tustvold merged commit ba38ebe into apache:master Jun 8, 2022
@HaoYang670

This comment was marked as outdated.

@alamb alamb changed the title Seal ArrowNativeType and OffsetSizeTrait (#1028) Seal ArrowNativeType and OffsetSizeTrait for safety (#1028) Jun 9, 2022
@alamb alamb added the security label Jun 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-change Changes to the arrow API arrow Changes to the arrow crate security
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider sealing ArrowNativeType
4 participants