Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linux binary gems #1571

Merged
merged 14 commits into from Feb 2, 2020
Merged

Conversation

larskanis
Copy link
Member

@larskanis larskanis commented Dec 30, 2016

This adds cross build task for Linux native gems. This obviously reduces install time dramatically on Linux.

The resulting nokogiri.so files are static linked to libz, libxml2 and libxslt. Linking to libiconv is not necessary, since it is part of libc on Linux. The resulting files are disassembled per objdump tool and it is verified, that all shared library dependencies are as expected and that the required library versions are met. Here is the check for glibc. These checks should make sure, that we don't publish non-working binaries.

Current GLIBC requirement of the nokogiri.so is version 2.17. This means that distros before 2013 will be incompatible. This is due to the use of clock_gettime() in libxslt. We could avoid this function to lower GLIBC requirement to 2.14, if this is wished.

@larskanis
Copy link
Member Author

Rebased to master branch.

@larskanis
Copy link
Member Author

Once again rebased to master. Just in case someone would like to try out the resulting gems - they can be downloaded here!

Or alternatively all the binary gem versions can be built locally per rake gem:native.

The current GLIBC requirement of the nokogiri.so is version 2.17.
This means that distros before 2013 will be incompatible.

No need for libiconv on linux - it's part of glibc.
@larskanis
Copy link
Member Author

Updated this PR to nokogiri-1.8.4. Downloads are here: https://github.com/larskanis/nokogiri/releases/tag/v1.8.4

@flavorjones flavorjones added this to the v1.11.0 milestone Jan 6, 2019
@flavorjones
Copy link
Member

@larskanis Sorry I've not paid attention to this, but I'd like to try to get this into v1.11. Will attempt to resolve conflicts now ...

@flavorjones
Copy link
Member

$ time gem install --local gems/nokogiri-1.10.0-x86_64-linux.gem --no-document
Successfully installed nokogiri-1.10.0-x86_64-linux
1 gem installed

real	0m0.761s
user	0m0.651s
sys	0m0.112s

amazement

@flavorjones
Copy link
Member

@larskanis When I add these lines to .cross_rubies:

diff --git a/.cross_rubies b/.cross_rubies
index c8eff09..2db209c 100644
--- a/.cross_rubies
+++ b/.cross_rubies
@@ -1,5 +1,7 @@
 2.6.0:i686-w64-mingw32
 2.6.0:x86_64-w64-mingw32
+2.6.0:i686-linux-gnu
+2.6.0:x86_64-linux-gnu
 2.5.0:i686-w64-mingw32
 2.5.0:x86_64-w64-mingw32
 2.5.0:i686-linux-gnu

I get the following error:

unexpected so imports ["libm.so.6", "libc.so.6"] in tmp/x86_64-linux/nokogiri/2.6.0/nokogiri.so

and so I think the right thing to do here is:

a/Rakefile b/Rakefile
index 04d005f..2d2e790 100644
--- a/Rakefile
+++ b/Rakefile
@@ -104,7 +104,10 @@ CrossRuby = Struct.new(:version, :host) {
       when /linux/
       [
         'libm.so.6',
-        'libpthread.so.0',
+        *(case
+          when ver < '2.6.0'
+            'libpthread.so.0'
+          end),
         'libc.so.6',
       ]
     end

but I'd like your thoughts because I'm not sure I understand why libpthread isn't needed when compiling against Ruby 2.6.0.

@flavorjones
Copy link
Member

Please note I've pushed a branch 1571-linux-binary-gems that I hope we can collaborate on (I don't think I can push to your fork).

@flavorjones
Copy link
Member

flavorjones commented Jan 13, 2019

Some other tasks I want to do:

  • zlib is being compiled in because extconf.rb assumes that "cross build" means "for windows". we should omit zlib for linux native builds.
  • VersionInfo and nokogiri -v should also carry around the information that the gem was precompiled.

@flavorjones
Copy link
Member

@larskanis Oh, looks like you're intentionally compiling in zlib, can you help me understand why that's necessary?

@larskanis
Copy link
Member Author

larskanis commented Jan 13, 2019

When introducing binary gems on Linux, we should be aware that it isn't without risk, since it is a new and additional install option, which is used per default - not as an opt-in. We should therefore properly test the pre-releases and invite users to do the same. There are two issues I know about:

  1. Bundler isn't clever at selection of unsupported ruby versions. See Add two specs for choosing platform specific gems rubygems/bundler#6247 . This requires users to explicit opt-out of binary gems, when they want to run nokogiri on ruby-trunk.

  2. There is a known issue on Alpine Linux: Cross building doesn't work for alpine3.7 rake-compiler/rake-compiler-dock#20 . I did a manual test on the alpine docker image and it reproducible fails at Nokogiri::XML::Reader#namespaces somewhere within the tests: https://gist.github.com/larskanis/94920901e6466c9383360ee1346f9f34 . I didn't analyze the reason for the segfault so far, but I think we should before the release.

Oh, looks like you're intentionally compiling in zlib, can you help me understand why that's necessary?

Libz isn't installed on every system, especially not within minimal environments like docker images. Including libz makes sure that the user doesn't has to fulfill additional install dependencies and we don't have to deal with "library not found" issues.

unexpected so imports ["libm.so.6", "libc.so.6"] in tmp/x86_64-linux/nokogiri/2.6.0/nokogiri.so

I'm pretty sure this is due to the library restructuring required for the MJIT introduction in ruby-2.6. The intention of the test is mostly to notify any additional libraries which would be required at runtime.

@flavorjones
Copy link
Member

@larskanis Thanks for all the words of wisdom!

I'm hoping to mitigate some of the risk by making the binary linux builds only available in a prerelease, and setting some expectations around it being "beta" software.

The bundler issue is challenging. As is the alpine issue. I'll look into the alpine bug.

@tenderlove
Copy link
Member

  1. Bundler isn't clever at selection of unsupported ruby versions.

Blech. I've been hit by this many times, but I still support this PR.

What if we distributed a fork of libxml2 that could be installed via apt / brew / rpm etc? That might help with the compile times. Apparently some Linux distros are applying the same patch we are (though I don't have any direct links). Distributing a fork might help out people besides Nokogiri users.

@flavorjones
Copy link
Member

What if we distributed a fork of libxml2

Interesting idea, can we start a new issue to have a conversation about the pros and cons?

@larskanis
Copy link
Member Author

I just rebased the 1571-linux-binary-gems branch to latest master. No other changes so far.

To updates to rake-compiler-dock-1.1.0, which does stripping of debug symbols per default.
It is not necessary to conditionally set cross_* options.
They are just ignored for native builds.
It is also wrong, since mingw availablility doesn't tell anything about cross builds for other platforms.
This also enables parallel builds for all 4 targets.
Run "rake gem:native -j1" to disable parallel builds.
@larskanis
Copy link
Member Author

Issue 1 in the above comment will probably finally be fixed by the next bundler release. See rubygems/bundler#7522 . This way bundler will select the platform=ruby gem automatically, if fat binary gem doesn't fit to the ruby release.

I think we'll probably also be able to fix / work around issue 2, so that I hope we'll be able to ship binary releases some day.

They were not triggered due to the namespaced definition.
The generic "gem" task triggers the platform ruby gem in addition to each target.
Since the targets are started simultanously, this builds 4 times the same file.
That sometimes leads to build failures.
It can be avoided by generating only the specific target gem.
The previous task was also triggered in "rake compile" on the native platform.
Since the library dependencies are different then, compared to the cross build, it failed in this case.
Binding to a task that is only invoked in cross builds solves this issue.

It is the better choice in any case, since the new task is triggered at every cross build, even on re-runs.
@codeclimate
Copy link

codeclimate bot commented Jan 5, 2020

Code Climate has analyzed commit 17305ba and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 100.0% (80% is the threshold).

This pull request will bring the total coverage in the repository to 46.3%.

View more on Code Climate.

@flavorjones
Copy link
Member

Reminder to self: in #1961 @larskanis and I agreed to merge this into master to avoid having to maintain a branch any longer. We'll have a separate discussion about how to then start shipping the linux binaries.

@flavorjones
Copy link
Member

Just resolved conflicts, letting CI run once more before merging.

@flavorjones
Copy link
Member

Oh - I also want to cherry-pick this commit onto this branch: da47f21

@flavorjones
Copy link
Member

Merging now. I'll figure out how to add back the VersionInfo metadata after that.

Also, I've opeend a new issue, #1983, to drive conversation about how to release precompiled libraries for linux.

@flavorjones flavorjones merged commit 2aab78a into sparklemotion:master Feb 2, 2020
@larskanis
Copy link
Member Author

@flavorjones Thanks for merging! Please also have a look at #1963, since this PR (and binary gems in general) don't work on master branch, unless #1963 is merged.

flavorjones added a commit that referenced this pull request Feb 2, 2020
because Lars has overhauled how we build fat binary gems in #1571
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants