Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dep: add ruby 3.2 support #2732

Merged
merged 10 commits into from
Dec 28, 2022
Merged

dep: add ruby 3.2 support #2732

merged 10 commits into from
Dec 28, 2022

Conversation

flavorjones
Copy link
Member

@flavorjones flavorjones commented Dec 18, 2022

What problem is this PR intended to solve?

  • bump rake-compiler to 1.2.1
  • drop ruby 2.6 and jruby 9.3 support
  • add ruby 3.2 support and test coverage

Have you included adequate test coverage?

Yes.

Does this change affect the behavior of either the C or the Java implementations?

Yes, the CRuby precompiled extension for Darwin Ruby 3.2 will no longer export libxml2 or lbxslt symbols. See the decision record in this PR for details.

@flavorjones
Copy link
Member Author

Getting really close. I think only rake-compiler/rake-compiler-dock#87 is blocking us from being completely green.

@flavorjones flavorjones force-pushed the flavorjones-ruby-3.2-support branch 2 times, most recently from 0894f7c to 7530879 Compare December 20, 2022 16:54
@flavorjones flavorjones added this to the v1.14.0 milestone Dec 20, 2022
@flavorjones flavorjones force-pushed the flavorjones-ruby-3.2-support branch 7 times, most recently from c1f36c3 to 7fdc570 Compare December 24, 2022 03:30
@stevecheckoway
Copy link
Contributor

It's been a long time since I've thought about -flat_namespace, but my recollection from Apple's developer mailing lists is that you really don't want to use a flat namespace. Why is -bundle_loader not the correct solution to this?

Maybe a better question is what is different when compiling ruby with --enable-shared as compared to --disable-shared?

@flavorjones
Copy link
Member Author

flavorjones commented Dec 24, 2022

OK, worth noting that we're jumping through hoops quite a bit with this version to make Darwin work. Here's the story, for posterity:

  • ruby/ruby@50d81bf changes how dynamic symbols are resolved on Darwin in Ruby 3.2
  • ruby/ruby@e7bffe0 and ruby/ruby@5c21cc39 make that work when cross-compiling
  • ruby/ruby@c5eefb7 uses -flat_namespace to allow precompiled extensions to resolve symbols from Ruby regardless of whether they're built with --enable-shared or --disable-shared
  • then a commit in this PR uses -load_hidden when the bundle is linked to make sure libxml/libxslt function calls within the extension don't get resolved to system libraries

I'm concerned that we're now "hiding" the libxml2 and libxslt functions which may break extensions like nokogiri-xmlsec that integrate closely with libxml2 (and like nokogumbo used to do).

@stevecheckoway
Copy link
Contributor

To answer my own question, --enable-shared causes ruby to be built such that the whole thing is a dynamic library.

$ nm ruby
                 U ___stack_chk_fail
                 U ___stack_chk_guard
0000000100000000 T __mh_execute_header
0000000100003ec4 t _main
0000000100003ebc T _ruby_abi_version
                 U _ruby_init
                 U _ruby_init_stack
                 U _ruby_options
                 U _ruby_run_node
                 U _ruby_sysinit
                 U _setlocale

-bundle_loader causes symbols to be looked up in the main executable. But in this case, the symbols are all in a library.

@flavorjones
Copy link
Member Author

@stevecheckoway there's additional context at rake-compiler/rake-compiler-dock#87 if you haven't seen it

@flavorjones
Copy link
Member Author

@stevecheckoway I've revised this PR to remove the -flat_namespace flag to see what happens. This seems to work when I manually test it on arm64-darwin, so let's see what the test suite tells us.

@flavorjones flavorjones force-pushed the flavorjones-ruby-3.2-support branch 3 times, most recently from db55183 to 841f5a5 Compare December 24, 2022 15:51
@stevecheckoway
Copy link
Contributor

@flavorjones I’m starting to wonder if other extensions relying on symbols exported from Nokogiri ever worked correctly. E.g., was Nokogumbo calling the correct libxml2 functions if Nokogiri was compiled with a vendored libxml2 but the system libxml2 was also loaded?

@flavorjones
Copy link
Member Author

OK, removing -flat_namespace is not sufficient to fix this. We're back to the problem of not being able to resolve _rb_cObject

See https://github.com/sparklemotion/nokogiri/actions/runs/3771991414/jobs/6412764421 for the failure, notably this output:

/Users/runner/hostedtoolcache/Ruby/3.2.0-rc1/x64/lib/ruby/gems/3.2.0+3/gems/nokogiri-1.15.test.2022.1224.1603-x86_64-darwin/lib/nokogiri/extension.rb:7:in 'require_relative': dlopen(/Users/runner/hostedtoolcache/Ruby/3.2.0-rc1/x64/lib/ruby/gems/3.2.0+3/gems/nokogiri-1.15.test.2022.1224.1603-x86_64-darwin/lib/nokogiri/3.2/nokogiri.bundle, 0x0009): Symbol not found: (_rb_cObject) (LoadError)`

See https://github.com/sparklemotion/nokogiri/actions/runs/3771991414/jobs/6412682822 for the build, notably this output:

MIKE: LDFLAGS are -L. -pipe  -fstack-protector-strong
MIKE: DLDFLAGS are -pipe  -Wl,-multiply_defined,suppress -Wl,-undefined,dynamic_lookup
MIKE: EXTDLDFLAGS are -bundle_loader '$(BUILTRUBY)' -Wl,-flat_namespace
Removed ["-Wl,-flat_namespace"] from EXTDLDFLAGS
MIKE: EXTDLDFLAGS are now -bundle_loader '$(BUILTRUBY)'

Note that the linker flags used in the makefile at build time are a concatenation of DLDFLAGS and EXTDLDFLAGS, so -Wl,-undefined,dynamic_lookup is retained in the linker flags. Here's what the makefile snippet looks like:

ldflags  = -L. -pipe  -fstack-protector-strong
dldflags = -pipe  -Wl,-multiply_defined,suppress -Wl,-undefined,dynamic_lookup -bundle_loader '$(BUILTRUBY)'
ARCH_FLAG = 
DLDFLAGS = $(ldflags) $(dldflags) $(ARCH_FLAG)

Should I also try to remove -bundle_loader?

@flavorjones
Copy link
Member Author

@stevecheckoway Removing both -bundle_loader and -flat_namespace when we build the bundle seems to work. WDYT, should I ship a release candidate with those options?

@kateinoigakukun @larskanis can you think of a reason why we shouldn't use this approach, at least short-term until we have a more permanent fix?

@stevecheckoway
Copy link
Contributor

I did experiment a bit with two bundles, one of which wants to call a function defined in the other but the same symbol is also defined in dylib linked by the main executable. This is the Nokogumbo situation. It was definitely calling the wrong function.

By switching from Mach-o bundles (MH_BUNDLE) to Mach-o dylibs (MH_DYLIB), I was able to get a dependent extension to link to the main extension. Due to two-level namespaces, this worked to call the right function. I created a branch to demonstrate this. I haven't investigated if ruby requires extensions to be bundles on macOS but if so, this approach isn't going to work and I'm not sure how to allow dependent gems to call Nokogiri's vendored libxml2 functions.

Actually, that's not true, a dependent extension should be able to dlopen() the Nokogiri bundle and use dlsym to call the relevant functions.

@stevecheckoway Removing both -bundle_loader and -flat_namespace when we build the bundle seems to work. WDYT, should I ship a release candidate with those options?

I think that makes sense. It doesn't solve the dependent gems problem, but if we had never solved that, this isn't worse.

@flavorjones
Copy link
Member Author

OK, well I posted a PR at stevecheckoway/bundle_test#1 to demonstrate that we can use -load_hidden if we don't care about breaking the (very few) gems that integrate closely with Nokogiri's C API. So that's an option, too.

At this point, it seems like I'm not going to get to an RC today, anyway, so maybe we can wait for other folks to chime in.

@flavorjones
Copy link
Member Author

I think I have a slight preference for the -load_hidden approach at 1714e77. I think it's simpler to do, easier to reason about, and it's not actively fighting against (removing or clobbering) the ldflags that Ruby 3.2 wants to use.

The -load_hidden approach will prevent tight integration with the Nokogiri, libxml2, libxslt, and libgumbo C APIs, and potentially break downstream gems that rely on this. But right now nokogumbo is EOLed, and the only other gem I know of is a few variations on nokogiri-xmlsec, none of which support Ruby 3.2 right now, anyway. Plus, @stevecheckoway pointed out above that this integration may never have worked right, so maybe it's fine?

(It looks like nokogiri-xmlsec already embraced that brokenness by setting -Wl,-undefined,dynamic_lookup, anyway.)

If we only break that compatibility on Darwin for Ruby 3.2+, it would also give us a chance to do a "scream test" to get to know who's relying on the C APIs, to understand their needs, and perhaps design a better solution with them in mind.

@hsbt
Copy link

hsbt commented Dec 24, 2022

FYI: ruby/ruby@c5eefb7 was reverted for Ruby 3.2 because It broke rbs gem. It's not working with same symbol namespace in used gems.

So, rbs uses _skip and _peek or other common symbols. We concern this case with other gems.

@flavorjones
Copy link
Member Author

@hsbt Thank you for telling us. I agree it's better to drop -flat_namespace.

@claudiug

This comment was marked as off-topic.

@flavorjones

This comment was marked as off-topic.

@claudiug

This comment was marked as off-topic.

@flavorjones
Copy link
Member Author

The two options I know of to resolve symbol resolution on Darwin on Ruby 3.2 are described here: rake-compiler/rake-compiler-dock#92 (comment)

note we're using snapshots from rake-compiler-dock, that will all be
removed once ruby 3.2.0 ships and we can cut a rake-compiler-dock
release.
> fatal: detected dubious ownership

Not sure why this started, but it's annoying
this version will be changing in next rake-compiler-dock
@flavorjones
Copy link
Member Author

I've opted for setting -flat_namespace and using -load_hidden on libxml2 and libxslt. It looks good in CI (modulo the Windows 3.2 tests which are still waiting for setup-ruby support), and in manual testing with both --enable-shared and --disable-shared.

I think I'm going to merge this and then cut a release candidate so people can kick the tires.

@flavorjones
Copy link
Member Author

I've written up a decision record ("ADR") for the symbol resolution workaround being adopted, it's in this PR:

https://github.com/sparklemotion/nokogiri/blob/51fb513ca43cfed37f27faec30237808e3136061/adr/2022-12-darwin-symbol-resolution.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants