Implement wrappers for function-like macros #139

Open
multimeric opened this issue Feb 26, 2023 · 9 comments

@multimeric
Member

I'm interested in accessing some reference counting functionality in R, but I noticed that e.g. MAYBE_SHARED and NO_REFERENCES have no bindings generated. This is because bindgen can't (yet) generate bindings for function-like macros: rust-lang/rust-bindgen#753.

Therefore I'm thinking it might be nice to slowly add some manual implementations of these macros. The simple ones like #define CHAR(x) R_CHAR(x) are of lower priority to me since they don't represent missing functionality, but stuff like # define MAYBE_SHARED(x) (NAMED(x) > 1) seems useful to me, as are more complex ones like KNOWN_SORTED.

The only question is whether we want these to live in libR-sys as opposed to extendr, since they're not generated and would be the first manual functions in the crate. Opinions are welcome.
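
For illustration, a hand-written wrapper for the two macros named above might look roughly like the sketch below. It is self-contained so it can run without R: `MockSexp` and `named` are hypothetical stand-ins for the real `SEXP` type and the `NAMED` accessor, not part of libR-sys.

```rust
// Hypothetical stand-in for an R object; the real SEXP is an opaque pointer
// that is only reachable through the generated bindings.
#[derive(Debug, Default)]
pub struct MockSexp {
    pub named: i32,
}

// Stand-in for the NAMED() accessor (assumed here to be a plain function).
pub fn named(x: &MockSexp) -> i32 {
    x.named
}

// Hand-written counterpart of `# define MAYBE_SHARED(x) (NAMED(x) > 1)`.
#[inline]
pub fn maybe_shared(x: &MockSexp) -> bool {
    named(x) > 1
}

// Hand-written counterpart of `# define NO_REFERENCES(x) (NAMED(x) == 0)`.
#[inline]
pub fn no_references(x: &MockSexp) -> bool {
    named(x) == 0
}
```

A real version would take a `SEXP`, call the generated accessor, and be `unsafe`, since it dereferences a pointer owned by R.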

@CGMossa
Member

CGMossa commented Feb 26, 2023 via email

@yutannihilation
Contributor

A quick note: in theory, function-like macros can be detected with clang directly (I mean, outside of bindgen):

https://docs.rs/clang/latest/clang/struct.Entity.html#method.is_function_like_macro

However, symbols like MAYBE_SHARED don't seem to show up here, so I have no idea how to automate generating these wrappers.

libR-sys/build.rs

Lines 444 to 453 in 183b791

for e in e {
    match e.get_kind() {
        EnumDecl | FunctionDecl | StructDecl | TypedefDecl | VarDecl => {
            if let Some(n) = e.get_name() {
                allowlist.insert(n);
            }
        }
        ek => panic!("Unknown kind: {:?}", ek),
    }
}

The version of R_CHAR() we can use is not a function-like macro, btw.

@multimeric
Member Author

multimeric commented Mar 12, 2023

Yeah, so bindgen has hooks for when it encounters a function-like macro, but it can't automatically convert it to Rust, so we're back at the manual implementation option (see my link above). You would think it would be possible to parse the macro as a separate preprocessor language and then convert it to Rust, but maybe not.

@yutannihilation
Contributor

yutannihilation commented Mar 12, 2023

I mean, I'm slightly against hand-crafted implementations, as they sound difficult to maintain reliably.

In case you haven't noticed, we already use the clang crate outside of bindgen. Because bindgen cannot tell R's symbols apart from those of the C standard library, we parse the header file ourselves and construct the allowlist that bindgen uses.

@CGMossa
Member

CGMossa commented Apr 16, 2023

I've spoken to someone and they recommend manually implementing these via macro_rules or even const fn.

I think we can do that.
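
To make the two options concrete, here is a minimal sketch of KNOWN_SORTED written both ways. The constant values are assumed to mirror the sortedness enum in Rinternals.h; in libR-sys they would come from the generated bindings rather than being redefined. The lowercase name `known_sorted` is only for illustration.

```rust
// Assumed to mirror the anonymous sortedness enum in Rinternals.h.
pub const SORTED_DECR_NA_1ST: i32 = -2;
pub const SORTED_DECR: i32 = -1;
pub const SORTED_INCR: i32 = 1;
pub const SORTED_INCR_NA_1ST: i32 = 2;

// `const fn` flavour: type-checked and usable in const contexts.
#[inline]
pub const fn known_sorted(sorted: i32) -> bool {
    sorted == SORTED_DECR
        || sorted == SORTED_INCR
        || sorted == SORTED_DECR_NA_1ST
        || sorted == SORTED_INCR_NA_1ST
}

// `macro_rules!` flavour: keeps the C spelling at the call site.
macro_rules! KNOWN_SORTED {
    ($sorted:expr) => {{
        // Bind once so the argument expression is evaluated a single time;
        // the textual C macro may evaluate it up to four times.
        let s: i32 = $sorted;
        s == SORTED_DECR
            || s == SORTED_INCR
            || s == SORTED_DECR_NA_1ST
            || s == SORTED_INCR_NA_1ST
    }};
}
```

The `const fn` version gives better error messages and shows up in rustdoc like any other function; the macro version is mostly useful if matching the C call syntax matters.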

To ensure that we don't go out of date, we can use clang to extract macros in the bindgen process and commit that to git. Changes in r-devel will appear and we can adjust accordingly.

Here's my attempt at extracting macros. There are things missing but this shows how feasible this approach can be:
r_macros.txt

  • I need to account for macro expansions
  • I need to ensure that I get skipped stuff as well
  • Remove duplicates; it's not straightforward.
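
As a rough idea of what such an extraction step has to do, the sketch below lists function-like macro names from header text using plain string scanning. This is a deliberately simplified substitute for the clang-based approach described above: it ignores macro expansion, line continuations, and conditional compilation, and only deduplicates by name. The function name `function_like_macros` is hypothetical.

```rust
/// Returns the names of function-like macros defined in `header`,
/// in order of first appearance and without duplicates.
/// A macro is function-like when `(` follows its name with no whitespace.
pub fn function_like_macros(header: &str) -> Vec<String> {
    let mut names: Vec<String> = Vec::new();
    for line in header.lines() {
        // Accept both `#define` and `# define`.
        let Some(rest) = line.trim_start().strip_prefix('#') else { continue };
        let Some(rest) = rest.trim_start().strip_prefix("define") else { continue };
        // `define` must be followed by whitespace before the macro name.
        if !rest.starts_with(char::is_whitespace) {
            continue;
        }
        let rest = rest.trim_start();
        // The macro name is the leading run of identifier characters.
        let end = rest
            .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
            .unwrap_or(rest.len());
        let (name, tail) = rest.split_at(end);
        let is_new = !names.iter().any(|n| n.as_str() == name);
        if !name.is_empty() && tail.starts_with('(') && is_new {
            names.push(name.to_string());
        }
    }
    names
}
```

Committing the resulting list to git would make changes between R versions show up as a diff, which is the point of the workflow described above.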

@CGMossa
Member

CGMossa commented Apr 16, 2023

The version of the macro-processing I have right now produces these files:

This one captures all macros and also the preprocessing spots that they reside in (no matter what):
macros_and_skipped_ranges.txt

This one is supposed to only provide the ones with function-like syntax.
macros_and_skipped_ranges.txt

@CGMossa
Member

CGMossa commented Apr 17, 2023

I took a stab at it. This needs revision, and of course more tricks to make it compatible with the different targets we have, but it at least adds a little clarity to the situation.


#[allow(dead_code)]
#[inline]
unsafe fn ISNA(x: f64) -> i32 {
    R_IsNA(x)
}

trait RExt<Arg = Self> {
    unsafe fn is_nan(x: Arg) -> bool;
}
impl RExt<Self> for f64 {
    #[inline]
    unsafe fn is_nan(x: Self) -> bool {
        // _isnan(x) != 0
        __isnan(x) != 0
    }
}
impl RExt<Self> for f32 {
    #[inline]
    unsafe fn is_nan(x: Self) -> bool {
        // _isnanf(x) != 0
        __isnanf(x) != 0
    }
}
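// NOTE: placeholder only. `__isnanl` takes a C `long double`, which has no
// stable Rust equivalent; `u128` is not ABI-compatible with it.
impl RExt<Self> for u128 {
    #[inline]
    unsafe fn is_nan(x: Self) -> bool {
        __isnanl(x) != 0
    }
}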

#[allow(dead_code)]
#[inline]
pub unsafe fn R_FINITE(x: f64) -> i32 {
    R_finite(x)
}

#[inline]
pub unsafe fn R_Calloc<T>(n: usize) -> *mut T {
    let size = std::mem::size_of::<T>();
    unsafe { R_chk_calloc(n, size) as *mut T }
}

#[inline]
pub unsafe fn R_Realloc<T>(ptr: *mut T, n: usize) -> *mut T {
    let size = std::mem::size_of::<T>() * n;
    unsafe { R_chk_realloc(ptr as *mut c_void, size) as *mut T }
}

#[inline]
pub unsafe fn R_Free<T>(ptr: *mut T) {
    R_chk_free(ptr as *mut c_void);
}

#[inline]
pub unsafe fn Memcpy<T>(dst: *mut T, src: *const T, n: usize) {
    std::ptr::copy_nonoverlapping(src, dst, n);
}

#[inline]
pub unsafe fn Memzero<T>(dst: *mut T, n: usize) {
    std::ptr::write_bytes(dst, 0, n);
}

#[inline]
pub unsafe fn CallocCharBuf(n: usize) -> *mut c_char {
    // C `char` maps to `c_char`; Rust's `char` is a 4-byte Unicode scalar.
    R_Calloc::<c_char>(n + 1)
}

// omitting the fortran macros

#[inline]
pub unsafe fn CHAR(x: SEXP) -> *const c_char {
    unsafe { R_CHAR(x) }
}

#[inline]
pub unsafe fn IS_SIMPLE_SCALAR(x: SEXP, type_: c_int) -> bool {
    (IS_SCALAR(x, type_) != 0) && (ATTRIB(x) == R_NilValue)
}

// region: SWITCH_TO_REFCNT

// #[cfg(feature = "switch_to_refcnt")]
// pub mod switch_to_refcnt {
// use super::*;
// use super::NAMEDMAX as NAMEDMAX_;
// pub const NAMEDMAX: c_int = NAMEDMAX_ as _;

//TODO: ensure that INCREMENT_NAMED and DECREMENT_NAMED doesn't work

#[inline]
pub unsafe fn INCREMENT_NAMED(x: SEXP) {
    if NAMED(x) != NAMEDMAX as c_int {
        SET_NAMED(x, NAMED(x) + 1);
    }
}

#[inline]
pub unsafe fn DECREMENT_NAMED(x: SEXP) {
    let n = NAMED(x);
    if n > 0 && n <= NAMEDMAX as c_int {
        SET_NAMED(x, n - 1);
    }
}

// /* Macros for some common idioms. */
#[inline]
pub unsafe fn MAYBE_SHARED(x: SEXP) -> bool {
    // # define MAYBE_SHARED(x) (NAMED(x) > 1)
    REFCNT(x) > 1
}

#[inline]
pub unsafe fn NO_REFERENCES(x: SEXP) -> bool {
    // # define NO_REFERENCES(x) (NAMED(x) == 0)
    REFCNT(x) == 0
}

#[inline]
pub unsafe fn MAYBE_REFERENCED(x: SEXP) -> bool {
    !NO_REFERENCES(x)
}

#[inline]
pub unsafe fn NOT_SHARED(x: SEXP) -> bool {
    !MAYBE_SHARED(x)
}

// endregion

#[inline]
pub unsafe fn cons(a: SEXP, b: SEXP) -> SEXP {
    Rf_cons(a, b)
}

#[inline]
pub unsafe fn lcons(a: SEXP, b: SEXP) -> SEXP {
    Rf_lcons(a, b)
}

#[inline]
pub unsafe fn PROTECT(s: SEXP) -> SEXP {
    Rf_protect(s)
}

#[inline]
pub unsafe fn UNPROTECT(n: c_int) {
    Rf_unprotect(n)
}

#[inline]
pub unsafe fn UNPROTECT_PTR(s: SEXP) {
    Rf_unprotect_ptr(s)
}

#[inline]
pub unsafe fn REPROTECT(x: SEXP, i: PROTECT_INDEX) {
    R_Reprotect(x, i)
}

#[inline]
pub unsafe fn KNOWN_SORTED(sorted: c_int) -> bool {
    use _bindgen_ty_1::*;
    (sorted == SORTED_DECR)
        || (sorted == SORTED_INCR)
        || (sorted == SORTED_DECR_NA_1ST)
        || (sorted == SORTED_INCR_NA_1ST)
}

#[inline]
pub unsafe fn KNOWN_NA_1ST(sorted: c_int) -> bool {
    use _bindgen_ty_1::*;
    (sorted == SORTED_INCR_NA_1ST) || (sorted == SORTED_DECR_NA_1ST)
}
#[inline]
pub unsafe fn KNOWN_INCR(sorted: c_int) -> bool {
    use _bindgen_ty_1::*;
    (sorted == SORTED_INCR) || (sorted == SORTED_INCR_NA_1ST)
}
#[inline]
pub unsafe fn KNOWN_DECR(sorted: c_int) -> bool {
    use _bindgen_ty_1::*;

    (sorted == SORTED_DECR) || (sorted == SORTED_DECR_NA_1ST)
}

// include\Rinternals.h
#[inline]
pub unsafe fn error_return(msg: *const c_char) -> SEXP {
    Rf_error(msg);
    return R_NilValue;
}

// include\Rinternals.h
#[inline]
pub unsafe fn errorcall_return(cl: SEXP, msg: *const c_char) -> SEXP {
    Rf_errorcall(cl, msg);
    return R_NilValue;
}

// include\Rinternals.h
#[inline]
pub unsafe fn BCODE_CONSTS(x: SEXP) -> SEXP {
    // re-enable in Defn.h after removing here
    CDR(x)
}
// include\Rinternals.h
#[inline]
pub unsafe fn PREXPR(e: SEXP) -> SEXP {
    R_PromiseExpr(e)
}

// include\Rinternals.h
#[inline]
pub unsafe fn BODY_EXPR(e: SEXP) -> SEXP {
    R_ClosureExpr(e)
}

@CGMossa
Member

CGMossa commented Apr 17, 2023

I don't feel like there is much to do with the clang ~> Macro stuff.
Together with the fact that ref-counting doesn't work unless R is compiled with that feature, I don't see a particularly good reason to dive further into this.
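
One way the build-dependent behaviour could be expressed on the Rust side is a cargo feature that selects which counter the wrapper reads. The sketch below assumes a feature named `switch_to_refcnt`, as in the commented-out draft above; `MockSexp` is again a hypothetical stand-in so the example runs without R.

```rust
// Hypothetical stand-in for an R object carrying both counters.
pub struct MockSexp {
    pub named: i32,
    pub refcnt: i32,
}

// Feature enabled: mirror R builds that define SWITCH_TO_REFCNT.
#[cfg(feature = "switch_to_refcnt")]
#[inline]
pub fn maybe_shared(x: &MockSexp) -> bool {
    x.refcnt > 1
}

// Feature disabled: fall back to the NAMED-based definition.
#[cfg(not(feature = "switch_to_refcnt"))]
#[inline]
pub fn maybe_shared(x: &MockSexp) -> bool {
    x.named > 1
}
```

The catch is that a cargo feature is chosen by the crate user, not detected from the R installation, so it can still disagree with how R itself was compiled.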

@CGMossa
Member

CGMossa commented May 10, 2024

There is progress out there in the ecosystem, see https://github.com/reitermarkus/cmacro-rs

To that end, the author of cmacro-rs has a PR for bindgen.

For now, I actually think we can write these macros manually, if we want to take that path. And frankly, I think we could. I'll think about a possible way forward for that soon enough.
