Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mkmf_config support for pkg-config files with multiple library directories #138

Open
mudge opened this issue Mar 17, 2024 · 11 comments
Open

Comments

@mudge
Copy link
Contributor

mudge commented Mar 17, 2024

As mentioned in mudge/re2#138 (comment), the re2.pc pkg-config file generated by RE2 contains libraries from two directories: RE2 itself and its sole dependency, Abseil (note I’ve populated PKG_CONFIG_PATH prior to this so that it prefers the Abseil in ports over my system install):

$ pkg-config --libs --static re2.pc
-L/Users/mudge/Projects/re2/ports/arm64-apple-darwin22/libre2/2024-03-01/lib -L/Users/mudge/Projects/re2/ports/arm64-apple-darwin22/abseil/20240116.1/lib -pthread -lre2 -labsl_flags_internal -labsl_flags_marshalling -labsl_flags_reflection -labsl_flags_private_handle_accessor -labsl_flags_commandlineflag -labsl_flags_commandlineflag_internal -labsl_flags_config -labsl_flags_program_name -labsl_cord -labsl_cordz_info -labsl_cord_internal -labsl_cordz_functions -labsl_cordz_handle -labsl_crc_cord_state -labsl_crc32c -labsl_crc_internal -labsl_crc_cpu_detect -labsl_raw_hash_set -labsl_hash -labsl_city -labsl_bad_variant_access -labsl_low_level_hash -labsl_hashtablez_sampler -labsl_exponential_biased -labsl_bad_optional_access -labsl_str_format_internal -labsl_synchronization -labsl_graphcycles_internal -labsl_kernel_timeout_internal -labsl_stacktrace -labsl_symbolize -labsl_debugging_internal -labsl_demangle_internal -labsl_malloc_internal -labsl_time -labsl_civil_time -labsl_strings -labsl_strings_internal -labsl_string_view -labsl_base -labsl_spinlock_wait -labsl_int128 -labsl_throw_delegate -labsl_raw_logging_internal -labsl_log_severity -labsl_time_zone

At the moment, this doesn’t feel compatible with MiniPortile2’s new mkmf_config(static:) functionality as it assumes the .pc file only specifies libraries within its own directory (see

) and that there is a single, entrypoint pc file for a library (Abseil ships with over 180 individual ones for each library).

What I actually want here is to turn all of the -l flags into static, absolute paths by looking for the matching file in each of the library directories specified by the two -L flags, e.g.

-lre2 -labsl_flags_internal

Becomes:

/Users/mudge/Projects/re2/ports/arm64-apple-darwin22/libre2/2024-03-01/lib/libre2.a /Users/mudge/Projects/re2/ports/arm64-apple-darwin22/abseil/20240116.1/lib/libabsl_flags_internal.a

Would it be possible to extend the API to support this use case (potentially also including setting PKG_CONFIG_PATH)?

@mudge
Copy link
Contributor Author

mudge commented Mar 17, 2024

Looking at

#
# keep track of the libraries we're statically linking against, and fix up ldflags and
# libflags to make sure we link statically against the recipe's libaries.
#
# this avoids the unintentionally dynamically linking against system libraries, and makes sure
# that if multiple pkg-config files reference each other that we are able to intercept flags
# from dependent packages that reference the static archive.
#
, perhaps my particular issue is because Abseil comes with over 180 individual .pc files so there isn't a single file to reference and pre-populate $MINI_PORTILE_STATIC_LIBS with before loading RE2's re2.pc (viz. if I were able to use mkmf_config(static:) with Abseil first, perhaps using it with RE2 would be able to generate the correct static flags already).

That said, perhaps changing mkmf_config(static:) to look up libraries in every directory in $LIBPATH when trying to generate static links would cover what I need here?

@flavorjones
Copy link
Owner

Thank you for opening this! Hoping to looking into it in the next week.

@mudge
Copy link
Contributor Author

mudge commented Mar 19, 2024

FWIW we explicitly restrict our search for static replacements of libraries to the two lib directories of our recipes in re2 but I wonder if it would be as safe (and much more general) to use all of $LIBPATH especially if we’re prepending every new -L flag from the .pc file to it.

@mudge
Copy link
Contributor Author

mudge commented Mar 22, 2024

In case it is of any use, I had some time today to try and spike a mix of how your mkmf_config works and how we're currently handling it in re2 and came up with the following:

ENV["PKG_CONFIG_ALLOW_SYSTEM_CFLAGS"] = "t"
pkg_config_paths = [
  "#{abseil_recipe.path}/lib/pkgconfig",
  "#{re2_recipe.path}/lib/pkgconfig"
]

# Set up pkg-config to prefer our vendored RE2 and Abseil
pkg_config_paths.prepend(ENV['PKG_CONFIG_PATH']) if ENV['PKG_CONFIG_PATH']
ENV['PKG_CONFIG_PATH'] = pkg_config_paths.join(File::PATH_SEPARATOR)

pc_file = File.join(re2_recipe.path, 'lib', 'pkgconfig', 're2.pc')

# Collect static -L directories to resolve static libraries
library_dirs = xpopen(['pkg-config', '--libs-only-L', '--static', pc_file], err: %i[child out], &:read).strip
static_library_dirs = library_dirs.shellsplit.map { |flag| flag.sub(/\A-L/, "") }
puts "Static library dirs are #{static_library_dirs.inspect}"

# Convert -l flags to static links within the collected static directories where possible
libflags = xpopen(['pkg-config', '--libs-only-l', '--static', pc_file], err: %i[child out], &:read)
  .shellsplit
  .map do |flag|
    next flag unless flag.start_with?("-l")

    static_lib = "lib#{flag[2..]}.#{$LIBEXT}"
    static_lib_dir = static_library_dirs.find { |dir| File.exist?(File.join(dir, static_lib)) }
    next flag unless static_lib_dir

    File.join(static_lib_dir, static_lib)
  end

# Prepend all flags (including the static replacements) to $libs
$libs = [libflags, $libs].join(" ").strip

# Prepend the original -L flags to LDFLAGS otherwise the final linking step fails with "library 're2' not found"
$LDFLAGS = [library_dirs, $LDFLAGS].join(" ").strip

# Prepend INCFLAGS
incflags = xpopen(['pkg-config', '--cflags-only-I', pc_file], err: %i[child out], &:read).strip
$INCFLAGS = [incflags, $INCFLAGS].join(" ").strip

# Append CFLAGS and CXXFLAGS
cflags = xpopen(['pkg-config', '--cflags-only-other', pc_file], err: %i[child out], &:read).strip
$CFLAGS = [$CFLAGS, cflags].join(" ").strip
$CXXFLAGS = [$CXXFLAGS, cflags].join(" ").strip

Inspecting the resulting re2.bundle with otool on macOS gives the following:

$ otool -L lib/re2.bundle
lib/re2.bundle:
	/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1700.255.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)

I'm not sure what the significance of having to set LDFLAGS as well as libs is however, perhaps you know better?

Update: it might be worth adding that we’re currently relying on dir_config to specify the location of RE2 (which seems to populate $LIBPATH with the lib directory and $CPPFLAGS with -I flags) but I’m hoping that’ll no longer be necessary if we can set things up right.

Here's another version of the last few lines that only sets $libs, $LIBPATH and $INCFLAGS and compiles successfully:

# Set $libs with all the flags including our static replacements
$libs = [libflags, $libs].join(" ").strip

# Tell Ruby where to find the libraries
$LIBPATH = static_library_dirs | $LIBPATH

# Add INCFLAGS
incflags = xpopen(['pkg-config', '--cflags-only-I', pc_file], err: %i[child out], &:read).strip
$INCFLAGS = [incflags, $INCFLAGS].join(" ").strip

@mudge
Copy link
Contributor Author

mudge commented Mar 23, 2024

I’ve now distilled this into mudge/re2@ac14e8f

@mudge
Copy link
Contributor Author

mudge commented Mar 24, 2024

It’s worth noting that preferring $libs over $LDFLAGS seems to cause a problem if you have the library in Ruby’s RbConfig::CONFIG['exec_prefix'] (a bug @stanhu found when working on re2) so I’ve since switched to setting $LDFLAGS directly which seems to resolve the problem. It’d be good to dig into why this happens though.

@flavorjones
Copy link
Owner

@mudge thanks for all your work here. i've been heads-down during my OSS time trying to work on a few urgent things and haven't been able to get back to this yet. this topic is at the top of my backlog, though, and I hope to get to it in the next week or two.

@mudge
Copy link
Contributor Author

mudge commented Mar 24, 2024

You’re welcome, there’s no rush from my side. I had some time this weekend to spend on it hence the over-communication both here and on my (now draft) PR in re2.

The main thing I’d like to understand more is the significance between $LDFLAGS and $libs and $LIBPATH for the library search paths and individual libraries as well as $INCFLAGS and $CPPFLAGS for include search paths (eyeballing the difference in them between my main branch and the new PR hasn’t been especially enlightening) and why it fixes the issue of libraries in exec_prefix being linked ahead of the static ones.

It has been tricky because so many MakeMakefile methods indirectly mutate one or more of the globals and that isn’t always easy to spot (e.g. have_library calling dir_config).

@mudge
Copy link
Contributor Author

mudge commented Mar 25, 2024

Quick update: I think I got to the bottom of why I saw a regression in re2 and it was to do with using have_library("re2") as it adds -lre2 to LIBS in the resulting Makefile (even though ldflags already has all the static libraries) which makes it possible for the gem to pick up any libre2.as elsewhere in the search path.

@mudge
Copy link
Contributor Author

mudge commented Apr 3, 2024

Having churned on this a bit, I ended up releasing re2 2.10.0 with the refactored pkg-config logic.

In summary:

  • The previous implementation was mostly taken from Nokogiri’s extconf.rb and used dir_config to populate $CPPFLAGS with the vendored dependencies’ include paths (and $LIBPATH with their lib paths) before parsing the result of pkg-config --libs --static path/to/lib/pkgconfig/re2.pc, prepending $LIBPATH with any vendored dependencies’ lib paths specified in -L flags and replacing all -l flags that could be found in a hard-coded list of lib paths with absolute (and therefore static) paths before adding them to $LDFLAGS.
  • I updated the code to much more closely resemble MiniPortile2’s mkmf_config and Ruby’s own MakeMakefile#pkg_config and instead prefer pkg-config’s more specific --libs-only-L, --libs-only-l, --libs-only-other, --cflags-only-I, --cflags-only-other to extract the lib paths without adding them to $LIBPATH, replace -l flags found in those directories with absolute, static paths, adding them to $libs (not $LDFLAGS), add any other linker flags (e.g. -pthread) to $LDFLAGS, add include paths to $INCFLAGS and everything else to $CFLAGS and $CXXFLAGS respectively.
  • The key thing here is that we never call dir_config either directly or indirectly through have_library, etc. when linking statically to avoid accidentally adding a dynamic -lre2 flag anywhere in the resulting Makefile.

This way, I hope that we include the bare minimum necessary and use the “highest” possible level of abstraction (e.g. the more intention-revealing $INCFLAGS vs the more generic $CPPFLAGS) for both the compilation ($INCFLAGS, $CFLAGS, $CXXFLAGS) and linking ($LDFLAGS, $libs) without adding unnecessary directories to the $LIBPATH and minimising the risk of linking a library dynamically rather than statically.

I’m not sure our strategy of searching for static libraries in every -L flag returned by pkg-config is generalisable for your API but hopefully some of this is useful if only to confirm your existing design decisions.

@mudge
Copy link
Contributor Author

mudge commented Apr 4, 2024

If nothing else, it’d be quite nice to delete my copy of minimal_pkg_config and rely on yours if it was made part of MiniPortile2’s public API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants