Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bindgen support #85

Closed
madsmtm opened this issue Dec 3, 2021 · 15 comments
Closed

bindgen support #85

madsmtm opened this issue Dec 3, 2021 · 15 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@madsmtm
Copy link
Owner

madsmtm commented Dec 3, 2021

See rust-lang/rust-bindgen#109.

I would like to create a (temporary?) fork of bindgen that generates bindings targetting objc2. In particular, Encode and RefEncode impls, but the Message/MessageReceiver stuff has also changed.

In the end, what I really want is the ability to generate safe apis, but that's probably not gonna happen, it would at least require us to enrich every method with mutability information (NSData -length is immutable, while NSMutableData -setLength: is not), and we're never going to be able to generate a safe API for e.g. NSData -initWithBytesNoCopy:length:deallocator:.

@madsmtm madsmtm added enhancement New feature or request help wanted Extra attention is needed labels Dec 3, 2021
@madsmtm
Copy link
Owner Author

madsmtm commented Dec 3, 2021

Maybe bindgen is really only useful for -sys crates. Though that shouldn't be undermined, it would be a huge help instead of the current situation where you have to verify every single argument by hand!

@madsmtm
Copy link
Owner Author

madsmtm commented Dec 3, 2021

Ideally the design of the bindings should make it easily possible to conform to our preferred design (that we haven't chosen yet, see #58).

Idea: In the end we have something like this:

pub struct NSBundle(ffi::NSBundle);

unsafe impl objc::Message for NSBundle {}

impl std::ops::Deref for NSBundle {
    type Target = NSObject;
    fn deref(&self) -> &Self::Target {
        NSObject::from_raw(*self.0) // Uses ffi::NSBundle's Deref impl
    }
}

impl NSBundle {
    pub fn new_with_path(path: &Path) -> Id<Self, Owned> {
        let nsstring = NSString::from(path);
        // NSString into ffi::NSString
        let obj = unsafe { ffi::NSBundle::alloc().initWithPath_(nsstring.into()) } as *mut NSBundle;
        Id::new(NonNull::new(obj).unwrap())
    }
    // new_with_url

    pub fn main() -> Option<&'static Self> { // Or Id or autorelease stuff
        let obj = unsafe { ffi::NSBundle::mainBundle() } as *mut NSBundle;
        unsafe { NonNull::new(obj).as_ref() }
    }

    pub fn path(&self) -> &NSString {
        unsafe { NSString::from_raw(self.0.bundlePath()) }
    }
}

Which uses the automatically generated bindings. The difference is as follows:

 #[repr(transparent)]
-pub struct NSBundle(NSObject);
+pub struct NSBundle(ffi::NSBundle);
 
 unsafe impl objc::Message for NSBundle {}
 
 impl std::ops::Deref for NSBundle {
     type Target = NSObject;
     fn deref(&self) -> &Self::Target {
-        self.0
+        NSObject::from_raw(*self.0) // Uses ffi::NSBundle's Deref impl
     }
 }
 
 impl NSBundle {
     pub fn new_with_path(path: &Path) -> Id<Self, Owned> {
         let nsstring = NSString::from(path);
-        let obj: *mut NSBundle = unsafe { msg_send![class!(NSBundle), alloc] };
-        let obj: *mut NSBundle = unsafe { msg_send![obj, initWithPath: &nsstring] };
+        // NSString into ffi::NSString
+        let obj = unsafe { ffi::NSBundle::alloc().initWithPath_(nsstring.into()) } as *mut NSBundle;
         Id::new(NonNull::new(obj).unwrap())
     }
     // new_with_url
 
     pub fn main() -> Option<&'static Self> { // Or Id or autorelease stuff
-        let obj: *mut NSBundle = unsafe { msg_send![class!(NSBundle), mainBundle] };
+        let obj = unsafe { ffi::NSBundle::mainBundle() } as *mut NSBundle;
         unsafe { NonNull::new(obj).as_ref() }
     }
 
     pub fn path(&self) -> &NSString {
-        unsafe { msg_send![self, bundlePath] }
+        unsafe { NSString::from_raw(self.0.bundlePath()) }
     }
 }

The benefit here is that message sends would be properly typed, and hence can't go horribly wrong like they usually can; however, it also adds an extra layer of indirection, extra compile time and just extra stuff you have to know about. So I'm not sure it's worth it like this :/

@madsmtm
Copy link
Owner Author

madsmtm commented Dec 3, 2021

I think a place where it could be really useful would be in helping with upholding the memory management rules; e.g. returning proper types for alloc methods (so you can't accidentally call something before properly initializing it), ensuring init, copy, mutableCopy and new methods return Id/Retained-like types, and helping with autorelease stuff on all other methods.

@madsmtm
Copy link
Owner Author

madsmtm commented Dec 3, 2021

For example, the code above is probably wrong, because bundlePath returns an autoreleased NSString (header says @property(readonly, copy) NSString *bundlePath).

So if the bindings could have generated this (or however we do the autoreleasing) instead:

pub unsafe fn bundlePath(&self, &'pool AutoreleasePool) -> &'pool NSString { ... }

It would have done me a favor!

@madsmtm madsmtm changed the title bindgen support bindgen support? Dec 3, 2021
@madsmtm
Copy link
Owner Author

madsmtm commented Dec 3, 2021

But again, an alternative is just to use macros for most of this stuff, like objrs does.

@simlay
Copy link
Collaborator

simlay commented Dec 7, 2021

Hey hey, just wanted to drop in and share some thoughts. Full disclosure... I've not really written much objective-c. Most of my knowledge of objective-c comes from adding the not-perfect-but-better-than-it-was objective-c support for bindgen so I could be wrong about some things.

I would like to create a (temporary?) fork of bindgen that generates bindings targetting objc2

I never got around to finishing/merging rust-lang/rust-bindgen#1784 but this adds the hierarchy inheritance to the categories from the objective-c. The biggest issue with this PR is that it has to do a linear search on all the objects to find the ty which is a costly look up. I found it rather necessary for doing a lot of calls.

Maybe bindgen is really only useful for -sys crates. Though that shouldn't be undermined, it would be a huge help instead of the current situation where you have to verify every single argument by hand!

Yeah, this is how I felt about it. I spent a bunch of time doing my best to have the objective-c bindgen'd stuff be as ergonomic as I could without adding anything unsound or causing UB. I really like the #[transparent] structs that are generated with a pointer to the underlying objective-c but allows for the rust to semi-typecheck stuff.

I think a place where it could be really useful would be in helping with upholding the memory management rules; e.g. returning proper types for alloc methods (so you can't accidentally call something before properly initializing it), ensuring init, copy, mutableCopy and new methods return Id/Retained-like types, and helping with autorelease stuff on all other methods.

I tried to add impl Drop and impl Clone for objective-c structs via the "manual retain-release" counter calls. For me, this caused memory leaks as the counter could increase when an object was passed over to the objective-c side. If I recall correctly, when I passed in a UIButton to a UIView the button's counter would increase by one more than expected and this would cause a memory leak. I admit that I didn't have/implement more fine grained access things like mutableCopy and such which may have been part of this. I also admit to not actually knowing how to deal with objective-c memory management well.

Random thoughts:

  • I also spent a couple days trying to figure out how to use the Nullability but couldn't figure out how to get it out of clang/llvm.
  • With uikit-sys, when building it, it runs bindgen on just about every framework that uikit depends on which is a lot and this causes the compile time to be ridiculous (~3 minutes for my macbook pro). Using a blocklist or something might be helpful. I just never got around to it.
  • I feel like I recall seeing stuff about bindgen getting comments near a given call but I'm not sure how to get it out of the LLVM tree. It would be very helpful for the docs if it had the documentation around a given objective-c call. This would be super helpful for the user of a given -sys crate as they wouldn't have to have two sets of documentation open (the rust and the objective-c).
  • I really like the From and TryFrom implementations for various structs to allow for upcasting/downcasting. Now that I re-think it, I'd actually have some of the generated instantiation calls like initWithFrame_(self, frame: CGRect) actually do something like initWithFrame_(self, frame: S) where S: Into<CGRect>
  • Not to say that bindgen should be used for uikit but the fact that the uikit-sys crate has ~50 traits for UITextView definitely is overwhelming. I'm not sure what to do about this type of problem though.

@madsmtm
Copy link
Owner Author

madsmtm commented Dec 10, 2021

Thanks for your input @simlay! I've also been looking at your bindgen implementations, they're definitely a nice starting point! I've also read your blog posts, actually that's a really good way to document your progress!

I tried to add impl Drop and impl Clone for objective-c structs via the "manual retain-release" counter calls

I think I've improved the situation in this crate, e.g. making it so that you don't accidentally pass Drop types to msg_send! calls.

Rough idea for how I would generate the bindings:

pub struct UIView(Object); // Instead of `id`; this means you would use `*mut UIView` where `UIView` was previously expected.

// Would be defined in `objc`
pub struct Alloc<T>(NonNull<T>);
pub struct Retained<T>(NonNull<T>);

impl IUIView {
    // Returns a type that can only be used after calling `init` on it.
    pub unsafe fn alloc() -> Alloc<Self> { ... }

    // Returns a reference-counting wrapper.
    // Need to figure this out some more, see #87
    pub unsafe fn init(s: Alloc<Self>) -> Retained<Self> { ... }

    // Alternative:
    // - self: *mut Self
    // - view: *mut UIView
    // - view: Retained<UIView> (if `addSubview:` took responsibility of releasing the object; in this case, it doesn't)
    pub unsafe fn addSubview_(&self, view: &UIView) { ... }
}

how to use the Nullability

Yeah, I can't figure that out either, I think they're bundled into the CXCursor_UnexposedAttr, changes to libclang would probably need to happen if we were to extract these.

runs bindgen on just about every framework that uikit depends

I would also really like to improve this! Especially needed if we were to have e.g. foundation-sys, it would be nice to have uikit-sys depend on the types from foundation-sys instead of defining them itself.

compile time to be ridiculous (~3 minutes for my macbook pro)

I would probably pre-generate the bindings using the latest SDKs, and try using the __API_AVAILABLE / __OS_AVAILABLE / NS_AVAILABLE / ... macros to add availability hints that would make the functions unavailable when MACOSX_DEPLOYMENT_TARGET / IPHONEOS_DEPLOYMENT_TARGET / ... is set too low.

Another idea is to use cargo features, where enabling e.g. the nsstring feature would only include <Foundation/NSString.h>.

comments near a given call

Yup, that would be reeeeally nice!

I really like the From and TryFrom implementations for various structs to allow for upcasting/downcasting. Now that I re-think it, I'd actually have some of the generated instantiation calls like initWithFrame_(self, frame: CGRect) actually do something like initWithFrame_(self, frame: S) where S: Into<CGRect>

I'm not totally sure TryFrom is safe, see discussion in #58, something something lifetimes, but will definitely consider it. Another argument against it is that it's extra code for something I suspect you use relatively rarely (and you'd have to implement in your safe wrapper as well), but I could very well be wrong about that!
Agree about the where S: Into<CGRect> though!

@madsmtm
Copy link
Owner Author

madsmtm commented Dec 10, 2021

I think I found a way forward on the nullability, see rust-lang/rust-bindgen#1791 (comment)!

@madsmtm
Copy link
Owner Author

madsmtm commented Sep 4, 2022

Time has passed, things change, new ideas pop up:

I recently migrated winit to objc2, and in the process ended up having to define a lot of AppKit classes there - while the verify_message feature helps here, it is of course quite non-ideal. I'm similarly in the process of converting the metal crate, and that is also just a huge blob of bindings; would be lovely to have these auto-generated!

The main reason I've avoided bindgen in the past is just that I haven't really had the time to get into it, but also that I wasn't really happy with the design in #93 (a lot of unnecessary unsafe, and unclean to write a wrapper over). But today I had a proper look at how Swift does it, and I think we can snag a few tricks from them:

  1. They use a .apinotes (clang impl, clang headers) file to tell them extra information like function/class names and nullability. We could do something similar as well, but to tell us whether an API is unsafe or requires &mut [1]:

    ---
    Name: Foundation
    Classes:
    - Name: NSString
      Methods:
      - Selector: 'new'
        Unsafe: N
      - Selector: 'stringByAppendingString:'
        Unsafe: N # Assuming that (non-)nullability is properly set
      - Selector: 'stringByAppendingPathComponent:'
        Unsafe: N
      - Selector: 'lengthOfBytesUsingEncoding:'
        Unsafe: N # Assuming `NSStringEncoding` can be made safe
      - Selector: 'length'
        Unsafe: N
      - Selector: 'UTF8String'
        Unsafe: N # Safe to call, but the returned pointer may not be safe to use
      - ...
    - Name: NSMutableString
       Methods:
       - Selector: 'new'
         Unsafe: N
       - Selector: 'initWithString:'
         Unsafe: N
       - Selector: 'appendString:'
         Unsafe: N
         Mutating: Y
       - Selector: 'setString:'
         Unsafe: N
         Mutating: Y
    # ...
    [1] I know you considered this somewhat already in https://github.com/Objective-C support rust-lang/rust-bindgen#109#issuecomment-675251983, but it hadn't really clicked for me until now!
  2. The add an "overlay" to Foundation (see e.g. the Foundation overlay). We could do something similar, so instead of redefining the types and making safe wrappers over it, we'd just add additional functionality to them in the same crate.

    // overlay/string.rs
    use crate::bindings::{NSString, NSMutableString};
    
    // SAFETY: `NSString` is immutable, and all (safe) mutating
    // methods in `NSMutableString` are changed to take `&mut`.
    unsafe impl Send for NSString {}
    unsafe impl Sync for NSString {}
    unsafe impl Send for NSMutableString {}
    unsafe impl Sync for NSMutableString {}
    
    // Define a few nice-for-Rust aliases / helpers
    impl NSString {
        // `initWithBytes:length:encoding:` is not safe, so we define a helper here
        pub fn from_str(string: &str) -> Id<Self, Shared> {
            unsafe {
                Self::initWithBytes_length_encoding(
                    Self::alloc(),
                    string.as_ptr().cast(),
                    string.len(),
                    NSUTF8StringEncoding,
                )
            }
        }
    
        /// The number of UTF-8 code units in the string.
        pub fn len(&self) -> usize {
            // Use bindgen method
            self.lengthOfBytesUsingEncoding(NSUTF8StringEncoding)
        }
    
        pub fn is_empty(&self) -> bool {
            self.len() == 0
        }
    
        pub fn len_utf16(&self) -> usize {
            self.length()
        }
    
        // `UTF8String` is not easily used correctly, so we define a helper here to convert
        // to a string, and give it the right lifetime.
        pub fn to_str<'r, 's: 'r, 'p: 'r>(&'s self, _pool: &'p AutoreleasePool) -> &'r str {
            let bytes = self.UTF8String();
            let bytes: *const u8 = bytes.cast();
            let len = self.len();
            let bytes: &'r [u8] = unsafe { slice::from_raw_parts(bytes, len) };
            str::from_utf8(bytes).unwrap()
        }
    }
    
    impl fmt::Display for NSString {
        ...
    }
    
    impl PartialEq/PartialOrd/fmt::Write/...

This could also be used to e.g. prevent creating NSWindow/NSView on other threads than the main thread: We could make all the accessor methods safe and the creation method unsafe, and then manually define a creation method which also takes MainThreadMarker.
This could also be used to make declare_class! safe(er), if we can get bindgen to extract knowledge about the NSWindowDelegate class as well (and do a runtime check to ensure they're doing things right).

Looking back, I'm glad I've up until now gone the route of "let the user do things themselves", since that has led to the invaluable msg_send_id! macro and Ivar<T> helpers in declare_class!, but forcing the user define AppKit themselves is untenable in the long term!


So yeah, just a long speech to motivate myself to prioritize this damn bindgen work!

@madsmtm madsmtm changed the title bindgen support? bindgen support Sep 4, 2022
@madsmtm
Copy link
Owner Author

madsmtm commented Sep 4, 2022

Also of note: the windows(-sys) crate won over winapi because 1: It was official and 2: It generates the bindings automatically, instead of users manually writing them. win32metadata might be worth getting some ideas from.

@madsmtm
Copy link
Owner Author

madsmtm commented Sep 4, 2022

Note: Ideally I'd like to parse directives like #if TARGET_OS_OSX and convert it into relevant #[cfg(target_os = "macos")], such that we can simply emit and publish the bindings once instead of the user having to invoke clang to get the correct definitions. That's an entirely different beast from bindgen though, will try to look around at what's done elsewhere, for example c2rust.

Note that I'd be fine with creating an internal tool, and then generating and publishing crates for bindings to all (or at least most) of Apple's official Objective-C frameworks using that (stated differently; if it's just changes to bindgen, they don't have to be upstreamed to be useful!)

@madsmtm
Copy link
Owner Author

madsmtm commented Sep 4, 2022

After an unreasonable amount of time, I managed to track down where the "Developer Documentation" is stored: /Applications/Xcode.app/Contents/SharedFrameworks/DNTDocumentationSupport.framework/Versions/A/Resources/ (on Xcode 11 / Mojave at least). If we could use this to enrich the output, that would be awesome!

Other links, in case I need them at some point:
https://developer.apple.com/rss/com.apple.adc.documentation.AppleOSX10_8.atom
https://developer.apple.com/library/downloads/docset-index.dvtdownloadableindex
https://help.apple.com/xcode/mac/11.0/en.lproj/search.cshelpindex

Apple's old documentation on "docsets":
https://developer.apple.com/library/archive/documentation/DeveloperTools/Conceptual/Documentation_Sets/010-Overview_of_Documentation_Sets/docset_overview.html

And the newer "DocC":
https://developer.apple.com/documentation/docc

@simlay
Copy link
Collaborator

simlay commented Sep 7, 2022

Looking back, I'm glad I've up until now gone the route of "let the user do things themselves", since that has led to the invaluable msg_send_id! macro and Ivar<T> helpers in declare_class!, but forcing the user define AppKit themselves is untenable in the long term!

This is pretty much why I got sucked into adding so much Objective-C support to bindgen. The part that I also found untenable was a reasonable way to make the bindings safe.

Anyway, I like the idea having some generated safe bindings but I'm a bit cautious on how a given call is "safe" or not. I think you're right that using some sort of apinotes-like way to describe various safe/unsafe/mutable bindings seems reasonable.

Take for example the AppKit.apinotes. It's 10k lines and unclear what's auto-generated and what's not. I'm guessing a lot of the SwiftPrivate fields have some default and then someone came along and dealt with the exceptions by hand.

In this case, I'd suggest defaulting things to Unsafe: Y and then going through and ensuring things are actually safe. I'm not sure what Mutating would default to but right now all the bindgen bindings just have built in interior mutability by default so that's not ideal either.

One thing I've noticed in the last 2 years or so is that the changes to bindgen have somewhat slowed. It doesn't seem like there's huge refactors happening or significant features missing. So, I think it'd be somewhat reasonable to actually maintain a fork that does some interesting stuff using the better objc2 features or that could take some additional metadata/apinotes to describe additional attributes on various calls.

@madsmtm
Copy link
Owner Author

madsmtm commented Sep 8, 2022

It's 10k lines and unclear what's auto-generated and what's not. I'm guessing a lot of the SwiftPrivate fields have some default and then someone came along and dealt with the exceptions by hand.

I think a lot of it is autogenerated, yeah; we would have to do the unsafe verification by hand though, no way around that!

In this case, I'd suggest defaulting things to Unsafe: Y and then going through and ensuring things are actually safe.

Definitely unsafe by default. A nice thing about unsafe is that we can start off with the whole thing being unsafe, and then incrementally in minor versions slowly make various functions safe (at least I'm fairly certain, have asked this on the users forum to make sure).

I'm not sure what Mutating would default to but right now all the bindgen bindings just have built in interior mutability by default so that's not ideal either.

Probably default to not mutating, but I'm not sure here either. EDIT: I've opened #265 to ponder this further.

I think it'd be somewhat reasonable to actually maintain a fork that does some interesting stuff using the better objc2 features or that could take some additional metadata/apinotes to describe additional attributes on various calls.

See #264.

@madsmtm
Copy link
Owner Author

madsmtm commented Dec 23, 2022

#264 is done, so I'll close this.

I may return to bindgen one day, but likely not for a fairly long while.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants