Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ffi::cpython::unicodeobject::tests::{ascii,ucs4} unit tests segfault on s390x #1824

Closed
decathorpe opened this issue Aug 23, 2021 · 18 comments · Fixed by #3015
Closed

ffi::cpython::unicodeobject::tests::{ascii,ucs4} unit tests segfault on s390x #1824

decathorpe opened this issue Aug 23, 2021 · 18 comments · Fixed by #3015

Comments

@decathorpe
Copy link
Contributor

Building pyo3 0.14.3 on s390x on Fedora Rawhide, the following two unit tests crash:

  • ffi::cpython::unicodeobject::tests::ascii
  • ffi::cpython::unicodeobject::tests::ucs4

Because they crash the test runner process, there's no output from those tests, except that they fail.

There are similar issues in the cpython crate:
dgrunwald/rust-cpython#265

Looks like PyUnicode_KIND is a bitfield that is different depending on system endianness (s390x is the only big-endian architecture I have access to).

🌍 Environment

  • Your operating system and version: Fedora Rawhide / 36
  • Your python version: 3.10.0-rc1
  • How did you install python (e.g. apt or pyenv)? Did you use a virtualenv?: default system /usr/bin/python3
  • Your Rust version (rustc --version): 1.54.0 stable on s390x-unknown-linux-gnu
  • Your PyO3 version: 0.14.3
  • Have you tried using latest PyO3 main (replace version = "0.x.y" with git = "https://github.com/PyO3/pyo3")?: no

💥 Reproducing

Running cargo test --release on a s390x machine causes this issue.

@decathorpe
Copy link
Contributor Author

I think I might have found the source of this problem:
https://github.com/PyO3/pyo3/blob/main/src/ffi/cpython/unicodeobject.rs#L53

impl PyASCIIObject {
    #[inline]
    pub fn interned(&self) -> c_uint {
        self.state & 3
    }

    #[inline]
    pub fn kind(&self) -> c_uint {
        (self.state >> 2) & 7
    }

    #[inline]
    pub fn compact(&self) -> c_uint {
        (self.state >> 5) & 1
    }

    #[inline]
    pub fn ascii(&self) -> c_uint {
        (self.state >> 6) & 1
    }

    #[inline]
    pub fn ready(&self) -> c_uint {
        (self.state >> 7) & 1
    }
}

Here, self.state is a u32 value. However, byte order of u32 is reversed on big-endian architectures, so the bitshift and bitwise-and operations produce different results on big-endian vs. little-endian architectures, so this impl block needs to be made endianness-aware somehow, otherwise those functions will return wrong results.

@decathorpe
Copy link
Contributor Author

This also seems to have a follow-up effect in other tests:

failures:
    types::string::tests::test_string_data_ucs2
    types::string::tests::test_string_data_ucs2_invalid
    types::string::tests::test_string_data_ucs4
    types::string::tests::test_string_data_ucs4_invalid

Additionally, types::string::tests::test_string_data_ucs1 causes segfaults, so I can't even get cargo test to print output from those failures.

This is the error message: thread 'main' panicked at 'internal error: entered unreachable code', src/types/string.rs:247:18

Which is kind of obvious now, since the constructed enum variants don't match the expected values on big-endian architectures, and hence, the unreachable!() suddently becomes reachable.

@davidhewitt
Copy link
Member

davidhewitt commented Aug 23, 2021

Yikes, thanks for reporting this.

TBH, I had this concern in #1777 (comment) when this FFI code was added. In the end, CI passed on all platforms so I was reasonably convinced this would be ok. I didn't consider the point that all platforms tested were little-endian.

From further reading just now, it appears to be a really hard problem for Rust to get interactions with C bitfields correct on all platforms:

https://users.rust-lang.org/t/c-structs-with-bit-fields-and-ffi/1429
https://stackoverflow.com/a/6044223

The URLO thread in particular suggests that linking in some C code to achieve this is probably the correct way to solve this problem. I agree, however I don't really want to add complexity to PyO3's build scripts by requiring a C compiler for such a small feature.

PyO3 contributors - what do you think we should do here?

My personal feeling is that we probably ought to rollback the PyStringData API and maybe even consider yanking 0.14.3. We could attempt to "fix" this API on s390x, however the stack overflow answer above suggests that bitfield order is not just down to endianness but more generally unspecified. I haven't found the information in the C reference to back that up, however I'm not feeling comfortable to offer this PyStringData API if we can't be sure on which platforms the FFI code is not UB.

cc @indygreg - I'm concerned that PyOxidizer should probably not depend on this functionality as-implemented; it presumably causes nasty crashes on certain platforms...

@indygreg
Copy link
Contributor

Yay for C being under-specified :/

Instead of deleting the feature globally, perhaps we could conditionally compile it for configurations where we know it to work using #[cfg(target_arch = "...")] and/or #[cfg(target_endian = "...")]?

I'm a huge fan of perfect is the enemy of done and it seems that a working implementation on a subset of [very popular] platforms is strictly better than no implementation at all.

I think it is also reasonable to escalate our concern to the CPython folks and ask them to expose proper C functions (not macros) for accessing the content behind bitfields so downstream consumers don't have to reason about under-defined C semantics. Unfortunately, it might be too late to get said APIs into CPython 3.10. Do you have a quick take on this @vstinner? Context here is the bitfield in PyASCIIObject plus lack of non-macro/preprocessor APIs making it difficult to define proper bindings.

@indygreg
Copy link
Contributor

FWIW my reading of the situation is that use of bitfields with #[repr(C)] can't be relied on in any situation because apparently you can't even rely on the field being its declared width since this implementation detail isn't defined by C.

However, it does appear that on x86_64 compilers that we have test coverage for, the PyASCIIObject.state field does appear to be using 32 bits and this configuration is known to work. It's worth noting that our test coverage is using a Rust-built LLVM/Clang binary linked against a (presumed) GCC (Linux), Clang (macOS), and MSVC (Windows) libpython. This seemingly means all the common x86_64 toolchains agree on the width of state being 32 bits. My gut instinct is this is known to be safe and is safe to ship in some scenarios. But we may want to put it behind a feature flag because we can't guarantee its safety.

But, all other architectures and compiler combinations are not yet known. We'd probably get lucky making the bit shifts endian aware. But this isn't guaranteed to work on all target arches or compilers. (Since Rust always uses LLVM this isn't a concern for PyO3 today: but it is a concern for whatever compiler built libpython and it could be a concern for future Rust implementations not using LLVM.)

Since different compiler toolchains can't agree on the memory layout here, it technically isn't even safe to link libraries built with different compilers since their bit field layout could be different! CPython (or any C library for that matter) shouldn't expose any type with a bit field in its public API and should instead expose function-based APIs for field access and manipulation.

How do you feel about making the bit shifts endian aware and moving the FFI definitions and PyStringData APIs behind an off-by-default crate feature flag?

At the end of the day, I don't need PyStringData for PyOxidizer (I can reimplement that myself). But I do need the FFI definitions because only 1 crate can define bindings/linking to libpython, so PyO3 does need to define some of these Unicode symbols for PyOxidizer.

@decathorpe
Copy link
Contributor Author

Thanks for looking into this!

Yay for C being under-specified :/

My thoughts exactly :(

I think it is also reasonable to escalate our concern to the CPython folks and ask them to expose proper C functions (not macros) for accessing the content behind bitfields so downstream consumers don't have to reason about under-defined C semantics.

This sounds like a good idea, and probably the only "safe" way to handle this (since the compiler that's generating the accessor functions is then always the same as the one that's generating the bitfield). I've also pinged the Red Hat / Fedora CPython maintainers about this. They tell me it's way too late to get new API into Python 3.10 at this point, so it would happen for 3.11 at the earliest.

In the meantine, limiting this code to #[cfg(target_endian = "...")] is probably a good workaround.

@davidhewitt
Copy link
Member

davidhewitt commented Aug 26, 2021

Right, having gone away to think about this for a couple days, I've decided that I'm ok to go with the #[cfg(target_endian = "...")] approach.

I'm still a little uneasy about it, because C bitfields being not well-defined (even if they're in practice endian-specific) means this might still cause problems on other platforms. However you folks as users seem to be ok with it, and ultimately PyO3 is built for its users. 😄

If we start getting further trouble reported from this API I might revisit this later on. Also we should consider submitting a PR upstream to CPython adding function-based access for this so there's no need to depend on bitfields in the Python public interface.

@indygreg
Copy link
Contributor

I filed https://bugs.python.org/issue45025 so the problems with bit fields in CPython's API are tracked with CPython.

@davidhewitt
Copy link
Member

The 0.14.4 release is now live. How do you folks feel if I yank the 0.14.3 release?

@decathorpe
Copy link
Contributor Author

Looks like in pyo3 0.14.4 there's some missing pieces to making the API unavailable on big-endian architectures:

error[E0432]: unresolved import `self::string::PyStringData`
  --> src/types/mod.rs:27:9
   |
27 | pub use self::string::PyStringData;
   |         ^^^^^^^^^^^^^^------------
   |         |             |
   |         |             help: a similar name exists in the module: `PyString`
   |         no `PyStringData` in `types::string`
error[E0425]: cannot find function, tuple struct or tuple variant `PyUnicode_IS_READY` in this scope
   --> src/ffi/cpython/unicodeobject.rs:218:19
    |
218 |     debug_assert!(PyUnicode_IS_READY(op) != 0);
    |                   ^^^^^^^^^^^^^^^^^^ not found in this scope
warning: unused import: `crate::exceptions::PyUnicodeDecodeError`
 --> src/types/string.rs:4:5
  |
4 | use crate::exceptions::PyUnicodeDecodeError;
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default
warning: unused import: `std::ffi::CStr`
  --> src/types/string.rs:12:5
   |
12 | use std::ffi::CStr;
   |     ^^^^^^^^^^^^^^
error: aborting due to 2 previous errors; 2 warnings emitted

At the very least, the following two things also need to be scoped to not(target_endian = "big"):

  • src/types/mod.rs: import statement on line 27, pub use self::string::PyStringData;
  • src/ffi/cpython/unicodeobject.rs: function on line 217, pub unsafe fn PyUnicode_GET_LENGTH (looks like this one was missed, because the function PyUnicode_IS_READY has the cfg attribute)

@indygreg
Copy link
Contributor

The 0.14.4 release is now live. How do you folks feel if I yank the 0.14.3 release?

I'm strongly against yanking because it breaks cargo install unless you use cargo install --locked. IMO this is user hostile and yanking should only be reserved for cases where that hostility to dependent applications is justified given the severity of the issues you are trying to prevent by blocking installs. In my mind, that bar is extremely high. Like malware detected in the application high.

@davidhewitt
Copy link
Member

Looks like in pyo3 0.14.4 there's some missing pieces

🤦 yikes, I'll fix and release 0.14.5 soon - #1850

I'm strongly against yanking

Me too; my opinion is similarly it's just for security / major soundness holes. As it's a new niche API which had issues, and they probably would lead to loud crashes in most cases, it's likely ok not to yank here.

@vstinner
Copy link

I think it is also reasonable to escalate our concern to the CPython folks and ask them to expose proper C functions (not macros) for accessing the content behind bitfields so downstream consumers don't have to reason about under-defined C semantics.

Which functions should be exposed? Some macros mostly only exist to be used by other macros.

For example, PyUnicode_IS_COMPACT_ASCII() is mostly used by PyUnicode_WSTR_LENGTH() macro (which is now deprecated).

@indygreg
Copy link
Contributor

The only ones we care about at the moment are the ones needed to peek at the underlying byte slices and interpret its storage.

So PyUnicode_GET_LENGTH(), PyUnicode_DATA(), PyUnicode_KIND().

@vstinner
Copy link

PyUnicode_GET_LENGTH(): you can use PyUnicode_GetLength().

@davidhewitt
Copy link
Member

Release 0.14.5 is live - hopefully Fedora is now able to package again on s390x @decathorpe... 🤞

@decathorpe
Copy link
Contributor Author

Everything went well, update is submitted: https://bodhi.fedoraproject.org/updates/FEDORA-2021-290f13120c

Thanks for your help!

@mejrs mejrs mentioned this issue Jun 20, 2022
3 tasks
@cuviper
Copy link

cuviper commented Jan 27, 2023

FWIW, you could also write your own C-FFI shims, compiled in build.rs via cc, so you don't have to wait for CPython.

bors bot added a commit that referenced this issue Mar 28, 2023
3015: Implement wrapper for `PyASCIIObject.state` bitfield accesses r=davidhewitt a=decathorpe

This is a first draft of my attempt to fix #1824 "properly" by writing a C wrapper for the `PyASCIIObject.state` bitfield accesses, as proposed here: #1824 (comment)

---

The original argument for making these functions `unsafe` is still valid, though - bitfield memory layout is not guaranteed to be stable across different C compilers, as it is "implementation defined" in the C standard. However, short of having CPython upstream provide non-inlined public functions to access this bitfield, this is the next best thing, as far as I can tell.

I've removed the `#[cfg(target_endian = "little")]` attributes from all things that are un-blocked by fixing this issue on big-endian systems, except for three tests, which look like expected failures considering that they do not take bit/byte order into account (for example, when writing to the bitfield).

- `ffi::tests::ascii_object_bitfield`
- `types::string::tests::test_string_data_ucs2_invalid`
- `types::string::tests::test_string_data_ucs4_invalid`

All other tests now pass on both little-endian and big-endian systems.

---

I am aware that some parts of this PR are probably not in a state that's acceptable for merging as-is, which is why I'm filing this as a draft. Feedback about how to better integrate this change with pyo3-ffi would be great. :)

In particular, I'm unsure whether the `#include` statements in the C files are actually correct across different systems. I have only tested this on Fedora Linux so far.

I'm also open to changing the names of the C functions that are implemented in the wrapper. For now I chose the most obvious names that shouldn't cause collisions with other symbols.

Co-authored-by: Fabio Valentini <decathorpe@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants