Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong ffi_type for unsigned numbers causes bugs on some platforms #1435

Open
Tajmoti opened this issue May 7, 2022 · 3 comments
Open

Wrong ffi_type for unsigned numbers causes bugs on some platforms #1435

Tajmoti opened this issue May 7, 2022 · 3 comments

Comments

@Tajmoti
Copy link

Tajmoti commented May 7, 2022

Provide complete information about the problem

  1. Version of JNA and related jars
    5.11.0@aar
  2. Version and vendor of the java virtual machine
    ART @ Android 11
  3. Operating system
    Android 11
  4. System architecture (CPU type, bitness of the JVM)
    arm64-v8a, x86_64, possibly more
    Output of clang --version:
    Compiler Android (7019983 based on r365631c3) clang version 9.0.9 (https://android.googlesource.com/toolchain/llvm-project a2a1e703c0edb03ba29944e529ccbf457742737b) (based on LLVM 9.0.9svn).
  5. Complete description of the problem
    Function argument of type unsigned short passed by JNA to native code does not behave correctly when the code is compiled with optimizations enabled. The is caused by the fact that LLVM is counting on the argument being stored in straight format (as it should be) instead of two's complement.
  6. Steps to reproduce
    Minimal reproducible example available here.
    The hello function in the following code should return true, but returns false.
const unsigned short CONSTANT = 0x8810u;

bool hello(unsigned short argument) {
    __android_log_print(ANDROID_LOG_WARN, TAG, "argument value as %%#010x=%#010x %%d=%d, as %%u=%u, as %%hu=%hu", argument, argument, argument, argument);
    __android_log_print(ANDROID_LOG_WARN, TAG, "CONSTANT value as %%#010x=%#010x %%d=%d, as %%u=%u, as %%hu=%hu", CONSTANT, CONSTANT, CONSTANT, CONSTANT);
    switch (argument) {
        case CONSTANT:
            return true;
        default:
            return false;
    }
}
    private void logValueFromJna() {
        boolean on = lib.hello(new UINT16(0x8810));
        Log.w("JNA-BUG", "hello returned correct value? " + on);
    }

    public static class UINT16 extends IntegerType {

        @SuppressWarnings("unused")
        public UINT16() {
            this(0);
        }

        public UINT16(int value) {
            super(2, value, true);
        }
    }

    interface TestLibrary extends Library {
        boolean hello(UINT16 argument);
    }

Log output with optimizations disabled:

W/JNA-BUG-NATIVE: argument value as %#010x=0x00008810 %d=34832, as %u=34832, as %hu=34832
W/JNA-BUG-NATIVE: CONSTANT value as %#010x=0x00008810 %d=34832, as %u=34832, as %hu=34832
W/JNA-BUG: hello returned correct value? true

Log output with optimizations enabled:

W/JNA-BUG-NATIVE: argument value as %#010x=0xffff8810 %d=-30704, as %u=4294936592, as %hu=34832
W/JNA-BUG-NATIVE: CONSTANT value as %#010x=0x00008810 %d=34832, as %u=34832, as %hu=34832
W/JNA-BUG: hello returned correct value? false
@Tajmoti
Copy link
Author

Tajmoti commented May 7, 2022

This problem can be fixed by correctly using ffi_type_uintX instead of ffi_type_sintX when calling dispatch for unsigned numbers. This is an example of the problematic line.

As an experiment, I have re-compiled JNA with that line changed to arg_types[i] = &ffi_type_uint16; and it did fix the problem (of course, it broke signed numbers).

To properly fix this, the information about the signedness would somehow need to be passed to the dispatch function.

Just to be clear, I am aware that this behavior is documented here under NOTES. However, I believe it should be changed because as a user of the library, I would expect it to behave correctly out of the box in a situation like this. I think it should not depend on the target architecture and JNA's assumptions about storing of unsigned numbers, and a possible mismatch of the two.

The experiment with the custom build of JNA is available in the custom-jna-fix branch of the minimum reproducible example repository.
Log output with with optimizations enabled and custom build of JNA that uses ffi_type_sint16 for unsigned shorts:

W/JNA-BUG-NATIVE: argument value as %#010x=0x00008810 %d=34832, as %u=34832, as %hu=34832
W/JNA-BUG-NATIVE: CONSTANT value as %#010x=0x00008810 %d=34832, as %u=34832, as %hu=34832
W/JNA-BUG: hello returned correct value? true

@Tajmoti Tajmoti changed the title Incorrect assumptions about passing of unsigned numbers to functions - causes bugs Wrong ffi_type for unsigned numbers causes bugs on some platforms May 7, 2022
@matthiasblaesing
Copy link
Member

Am I missing something obvious here? To my understanding the native interface is determined from the headerfiles and the platform ABI. The caller pushes the arguments for the function onto the stack/places them into the right registers and jumps to the target function. The callee then reads them from these places.

When you say it works with different optimiziation levels, then to my understanding the ABI of the called function changes and this does not sound like sane behavior for a compiler. I would then ask how runtime linking would work, as that will then also break.

@Tajmoti
Copy link
Author

Tajmoti commented May 11, 2022

It's not entirely obvious, but not too difficult either.

The problem is in the "caller pushes the arguments for the function onto the stack/places them into the right registers" part. This pushing of arguments is implemented using libffi. Libffi itself handles that correctly, provided the argument types passed to libffi are correct. Which is not the case for unsigned integers!

This is the line I am talking about. Notice that it always uses ffi_type_sint16 for shorts, even if the value came from an unsigned IntegerType of size 2.

arg_types[i] = &ffi_type_sint16;

I am assuming that in most cases, this naive approach works fine, but it is in fact relying on undefined behavior. And in my case, that breaks it. The compiler optimizations are actually correct here, which is proven by the fact that changing the line to ffi_type_uint16 fixed the problem for unsigned IntegerType of size 2,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants