Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nokogiri builds incorrectly (exports libxml2's symbols) #1959

Closed
x-yuri opened this issue Dec 20, 2019 · 21 comments
Closed

nokogiri builds incorrectly (exports libxml2's symbols) #1959

x-yuri opened this issue Dec 20, 2019 · 21 comments

Comments

@x-yuri
Copy link

x-yuri commented Dec 20, 2019

Describe the bug

The rumor has it that nokogiri builds incorrectly. From what I can gather it exports (not hides) libxml2's symbols. As a result, if libxml2.so is loaded before nokogiri.so, nokogiri is bound to use the system library, not the one embedded in the binary. And as a resullt it has to check whether that's the case.

The solution is supposedly to add -Wl,--exclude-libs,ALL or -fvisibility=hidden.

To Reproduce

docker-compose.yml:

version: '3'

services:
  ruby:
    build: .
    volumes:
      - ./:/app

Dockerfile:

FROM ruby:2-alpine3.10
RUN apk add build-base imagemagick6-dev
WORKDIR /app

Gemfile:

source 'https://rubygems.org'
gem 'nokogiri', '1.10,7'
gem 'rmagick', '4.0.0'

1.rb:

require 'rmagick'
require 'nokogiri'
$ docker-compose run ruby sh

/app # cat /etc/issue
Welcome to Alpine Linux 3.10
Kernel \r on an \m (\l)

/app # apk info -vv | grep libxml
libxml2-2.9.9-r3 - XML parsing library, version 2

/app # bundle
Fetching gem metadata from https://rubygems.org/.............
Resolving dependencies...
Using bundler 2.1.2
Fetching mini_portile2 2.4.0
Installing mini_portile2 2.4.0
Fetching nokogiri 1.10.7
Installing nokogiri 1.10.7 with native extensions
Fetching rmagick 4.0.0
Installing rmagick 4.0.0 with native extensions
Bundle complete! 2 Gemfile dependencies, 4 gems now installed.
Bundled gems are installed into `/usr/local/bundle`

/app # bundle exec ruby 1.rb
WARNING: Nokogiri was built against LibXML version 2.9.10, but has dynamically loaded 2.9.9

Alternatively,

1.sh:

#!/bin/sh
set -eux
apk add build-base imagemagick6-dev
gem install -v 4.0.0 rmagick
gem install -v 1.10.7 nokogiri
echo '
    require "rmagick"
    require "nokogiri"
' | ruby
$ docker run --rm -it -v $PWD/1.sh:/1.sh ruby:2.6-alpine3.10 ./1.sh
...
+ echo '
    require "rmagick"
    require "nokogiri"
'
+ ruby
WARNING: Nokogiri was built against LibXML version 2.9.10, but has dynamically loaded 2.9.9

Expected behavior

It produces no warnings.

Environment

$ nokogiri -v
# Nokogiri (1.10.7)
    ---
    warnings: []
    nokogiri: 1.10.7
    ruby:
      version: 2.6.5
      platform: x86_64-linux-musl
      description: ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-linux-musl]
      engine: ruby
    libxml:
      binding: extension
      source: packaged
      libxml2_path: "/usr/local/bundle/gems/nokogiri-1.10.7/ports/x86_64-pc-linux-musl/libxml2/2.9.10"
      libxslt_path: "/usr/local/bundle/gems/nokogiri-1.10.7/ports/x86_64-pc-linux-musl/libxslt/1.1.34"
      libxml2_patches:
      - 0001-Revert-Do-not-URI-escape-in-server-side-includes.patch
      - 0002-Remove-script-macro-support.patch
      - 0003-Update-entities-to-remove-handling-of-ssi.patch
      - 0004-libxml2.la-is-in-top_builddir.patch
      libxslt_patches: []
      compiled: 2.9.10
      loaded: 2.9.10
@flavorjones
Copy link
Member

Hi @x-yuri, thanks for reporting this. I'm not sure there's anything we can do without fundamentally changing how we package and ship Nokogiri, this will require some research, and I'm afraid I don't have much time these days to invest in it.

Do you have any ideas on how to approach this in a way that won't break how Nokogiri compiles and installs?

@flavorjones
Copy link
Member

Ah, I see you've made a suggestion -- I'll try that out.

@flavorjones
Copy link
Member

Hi @x-yuri, I'm unable to reproduce what you're seeing, either on my system or using the Dockerfile and docker-compose setup you specify above (though I'll note I had to modify your docker-compose file to execute correctly).

I've created a gist reflecting my attempt to reproduce this: https://gist.github.com/flavorjones/724995b110dcb6123689b5e18ea7bf38

Can you please help me understand what I'm doing differently from you, and how I can reproduce what you're seeing?

@x-yuri
Copy link
Author

x-yuri commented Jan 6, 2020

@flavorjones The reason you weren't able to reproduce the issue is supposedly owing to Alpine Linux 3.11 release. I've updated the original post. Please try again.

The trick to reproduce the issue is to have the system libxml2 of a different version than the one that comes with nokogiri (nokogiri-1.10.7 comes with libxml2-2.9.10). Then requiring rmagick (which is dynamically linked to libxml2) loads the system version. As a result nokogiri is bound to use the loaded system library, not the one it was built against. So, it complains.

Resolving this issue will supposedly eliminate the ability of other gems to force nokogiri to use the system libxml2 (and the need to check the libxml2 version for that matter).

P.S. I'm not a C programmer, but the guy on Stack Overflow says the way it works now is incorrect/wrong.

@flavorjones
Copy link
Member

@stevecheckoway Can you confirm for me that hiding these symbols as suggested will break nokogumbo? My understanding is that nokogumbo calls the exported libxml2 symbols that are compiled into nokogiri.so.

@stevecheckoway
Copy link
Contributor

Nokogumbo is supposed to support three configurations

  1. Nokogiri links against the system libxml2 but there are no headers;
  2. Nokogiri links against the system libxml2 and there are headers;
  3. Nokogiri packages libxml2 itself.

1 is the slow case (and I'm actually not 100% sure I didn't accidentally break it when reworking the build logic in the past). 3 seems to be the normal case. If Nokogiri packages libxml2, then Nokogumbo expects to link against those symbols.

I'm open to changing the way this all happens, but I'd hate to force everything to the slow case by not using libxml2 directly in Nokogumbo.

To explain, Nokogumbo uses the gumbo HTML library to parse the HTML document and then constructs an XML tree. If libxml2 (and its headers) are available at build time, then Nokogumbo constructs the tree directly from the library and only then wraps it in a Ruby object from Nokogiri. If the library (or its headers) are not available, then Nokogumbo is forced to construct the tree by constructing Ruby objects for each node and building the tree, all using Nokogiri's Ruby API.

@flavorjones
Copy link
Member

@stevecheckoway Got it, thanks for the clear answer.

@x-yuri So just to be clear: Nokogiri maintainers must now make a choice between two groups of users - either I make things slightly less awkward for edge cases like the one you mention, or I make things much less performant for users of nokogumbo.

My preference, absent a really compelling reaons, is going to be to optimize for users of nokogumbo, as that's the most common way that HTML5 documents are parsed; and because there's a workaround available for the edge case you mention (which is to reverse the order in which the libraries are required).

I'm open to conversation.

@flavorjones
Copy link
Member

@x-yuri Also, I'm confused - is this problem fixed in alpine 3.11? In which case ... doesn't that mean it's an issue with alpine? As I might have mentioned, I'm unable to reproduce this on Ubuntu systems ... I'm a little confused as this seems like a very edge-y edge case.

@x-yuri
Copy link
Author

x-yuri commented Jan 30, 2020

@flavorjones

I'm confused - is this problem fixed in alpine 3.11?

The issue has to do with neither Docker, nor Alpine Linux. Docker makes it easy to reproduce, Alpine Linux is small in size (less to download, faster to reproduce):

$ docker image ls
...
ruby                       2.6-alpine3.10      284974df8f82        6 days ago          50.9MB
ruby                       2.6-buster          a161c3e3dda8        3 weeks ago         840MB

Anyways, let's take Debian:

1.sh:

#!/usr/bin/env bash
set -eux
dpkg -l | grep libxml2
curl -sS https://raw.githubusercontent.com/sparklemotion/nokogiri/v1.10.7/dependencies.yml | head -n 2
gem install -v 4.0.0 rmagick
gem install -v 1.10.7 nokogiri
echo '
    require "rmagick"
    require "nokogiri"
' | ruby
$ docker run --rm -it -v $PWD/1.sh:/1.sh ruby:2.6-buster ./1.sh
+ dpkg -l
+ grep libxml2
ii  libxml2:amd64                      2.9.4+dfsg1-7+b3            amd64        GNOME XML library
ii  libxml2-dev:amd64                  2.9.4+dfsg1-7+b3            amd64        Development files for
 the GNOME XML library
+ curl -sS https://raw.githubusercontent.com/sparklemotion/nokogiri/v1.10.7/dependencies.yml
+ head -n 2
libxml2:
  version: "2.9.10"
+ gem install -v 4.0.0 rmagick
Fetching rmagick-4.0.0.gem
Building native extensions. This could take a while...
Successfully installed rmagick-4.0.0
1 gem installed
+ gem install -v 1.10.7 nokogiri
Fetching nokogiri-1.10.7.gem
Fetching mini_portile2-2.4.0.gem
Successfully installed mini_portile2-2.4.0
Building native extensions. This could take a while...
Successfully installed nokogiri-1.10.7
2 gems installed
+ echo '
    require "rmagick"
    require "nokogiri"
+ ruby
'
WARNING: Nokogiri was built against LibXML version 2.9.10, but has dynamically loaded 2.9.4

Or Ubuntu.

The trick to reproduce the issue is to have system libxml2 different from the one bundled with nokogiri.

The reason I fell back to Alpine Linux 3.10 is because when I was filing the issue Alpine Linux 3.10 was the latest release. It comes with libxml2-2.9.9. nokogiri-1.10.7 comes with libxml2-2.9.10. This way it reproduces. But when you tried to reproduce the issue Alpine Linux 3.11 had already been released. Which comes with libxml2-2.9.10. This way it doesn't reproduce. So I explicitly required Alpine Linux 3.10 to put it under conditions it was running on my machine. You can very well reproduce it with Alpine Linux 3.11 (libxml2-2.9.10) and nokogiri-1.10.4 (libxml2-2.9.9).

either I make things slightly less awkward for edge cases like the one you mention

So as you probably see that is not an edge case. It can be reproduced anywhere nokogiri can run. Because it is a feature of sorts. For one reason or another nokogiri exports libxml2's symbols. Which makes it vulnerable to "libxml2 spoofing."

It might be an edge case in terms of how often people run into it. That I don't know. But as you can see there are packages (and pretty popular at that) that seem to have nothing to do with xml, but nevertheless depend on libxml2.

the most common way that HTML5 documents are parsed

It's the first time I heard of it, but you may be right.

@stevecheckoway Case 2 is not slow, right? But from what I gather using system libxml2 is not recommended. Then if nokogumbo needs libxml2, why doesn't it use its own copy?

@flavorjones
Copy link
Member

@x-yuri Thanks for explaining a bit more, now I understand.

However, again, there's a simple workaround which is to reverse the order of requires; which means this may not be a compelling enough problem (IMHO) to break nokogumbo.

That said: I've pushed a branch and PR that un-exports the libxml2/libxslt symbols to #1979. I'm not going to merge that unless the nokogumbo project says they won't be affected. If nokogumbo is affected, then we need to have a conversation about this change.

@x-yuri
Copy link
Author

x-yuri commented Feb 2, 2020

However, again, there's a simple workaround...

If you know what causes the issue.

Nokogiri was built against libxml version #{compiled_libxml_version}, but has dynamically loaded #{loaded_libxml_version}

I month or so ago I'd ask, "Why would nokogiri do that (dynamically load some other version)? Should I do something about it? What do I do?" A better message would be:

libxml2-#{loaded_libxml_version} (system version?) was loaded, but Nokogiri was built against libxml2-#{compiled_libxml_version}. Nokogiri is forced to use the loaded version. That might lead to... Make sure to require files that depend on a shared (system?) libxml2 after Nokogiri.

Ideally nokogiri should find the culprit, $LOADED_FEATURES.grep(/\.so$/).select { |p| system('sh', '-c', 'ldd "$1" | grep libxml2', '-', p, [:out, :err] => "/dev/null") }?

But I suggest to wait for @stevecheckoway's answer to, "Does nokogumbo have to use nokogiri's libxml2? Why if so?" For now I have 2 options in mind: 1) make nokogumbo come with its own version of libxml2, 2) create a gem that provides libxml2 to other gems (somehow prefixing the names?).

@stevecheckoway
Copy link
Contributor

stevecheckoway commented Feb 4, 2020

@x-yuri, as I explained above, it's possible to use Nokogumbo without directly using libxml2 by relying on Nokogiri to construct the DOM nodes. E.g., <br> would have Gumbo produce a GumboElement object (not quite correct, but close enough) and then construct a Nokogiri::XML::Element using the Ruby API which in turn constructs an xmlNode.

If you do this, you end up constructing a Ruby object for every node.

To avoid that, Nokogumbo needs to link (at run time) against the same libxml2 that Nokogiri links against.

Both Nokogiri and Nokogumbo need to link at run time against a compatible version of libxml2 that they were compiled against. I'm not sure what binary compatibility guarantees libxml2 makes with respect to version numbers.

@x-yuri
Copy link
Author

x-yuri commented Feb 6, 2020

Nokogumbo needs to link (at run time) against the same libxml2 that Nokogiri links against

Then probably it sort of makes sense for nokogiri to export libxml2's symbols. But isn't there an easy way to on one hand not let other gems force nokogiri into using system libxml2, on the other one let nokogumbo use nokogiri's libxml2? @stevecheckoway Does nokogumbo need all the libxml2 symbols, or a subset would suffice (considering possible needs in the future)?

Is there a way to delegate something to nokogiri so that nokogumbo didn't need access to nokogiri's libxml2? Maybe some sort of batch construction/building?

@x-yuri
Copy link
Author

x-yuri commented Feb 6, 2020

...Alternatively, one can probably change symbols of libxml2 with objcopy before linking it to nokogiri.

@stevecheckoway
Copy link
Contributor

@x-yuri can you sketch out what you have in mind? It's not sufficient to change the symbols in libxml2. Nokogiri and Nokogumbo still need to have prototype information and, in one place for Nokogumbo, the actual structure definition for an xmlNode.

@x-yuri
Copy link
Author

x-yuri commented Feb 7, 2020

I have 2 options in mind:

  1. As far as I understand, when the exported symbols match, the first loaded library wins. For it to not win, we've got to rename the symbols. The symbols that were renamed are no longer subject to the arbitration. objcopy can help with renaming libxml2's symbols. Then yes, we've got to change the header files correspondingly. That can probably be done with sed, or some sort of preprocessor directives, like #define xmlReaderForFile _ng_xmlReaderForFile (in case we prefix with _ng_). And we probably don't need to rename all the symbols, only those that are used.

    Doesn't sound like a good solution, but neither does nokogumbo using nokogiri's libxml2. To be honest, both sound like a hack. So, a supposedly better solution is...

  2. Delegate the job to nokogiri avoiding this:

    If you do this, you end up constructing a Ruby object for every node.

    For that we've got to add some sort of builder to nokogiri. Or so I understand.

@flavorjones
Copy link
Member

@x-yuri I'm thankful that you're involved and are offering potential solutions. But I want to point out that I don't think we agree that this is a problem that needs solving; or rather, it feels lower priority than many other things that need our attention, including better testing, precompiled binaries for more platforms, and all the bugs and features that are blocking v1.11.0.

In practice, libxml2 hasn't made backwards-incompatible changes in the API for many years. Nokogiri tends to work just fine when compiled against one version and loading another version, which is why we emit a warning and don't exit with an error.

Can you help me understand why you think this edge case around library loading should be a priority for us to solve?

@x-yuri
Copy link
Author

x-yuri commented Feb 8, 2020

@flavorjones Not a priority. You can start with improving the message:

Nokogiri was built against libxml version #{compiled_libxml_version}, but has dynamically loaded #{loaded_libxml_version}

Nokogiri doesn't dynamically load anything. Let me take another stab:

libxml2-#{loaded_libxml_version} has been loaded before Nokogiri was required. That forces Nokogiri to use the former, not libxml2-#{compiled_libxml_version} it was built against. That will most likely cause no issues, but to be on the safe side, make sure you require Nokogiri before any gems that depend on libxml2.

Or "...that depend on libxml2, like RMagick." Also, I've just confirmed that the following code seems to find the culprit:

$LOADED_FEATURES.grep(/\.so$/).select { |p| system('sh', '-c', 'ldd "$1" | grep libxml2', '-', p, [:out, :err] => "/dev/null") }

You might want to make Nokogiri suggest the culprit. Or add a link to this issue in the code or in the message for users to be able to find this piece of code, and/or learn more about the issue.

Nokogiri tends to work just fine when compiled against one version and loading another version

I find it a bit contradicting. In the README you make it feel like using the bundled version is really important:

In Nokogiri 1.6.0 and later libxml2 and libxslt are bundled with the gem, but if you want to use the system versions:

  • First, check out the long list of fixes and changes between releases before deciding to use any version older than is bundled with Nokogiri.

@blshkv
Copy link

blshkv commented Aug 17, 2020

Guys, please stop using bundled version. That's the root of the problem.
Push "the long list of fixes" into libxml2 directly. That's the correct way of doing it.

@flavorjones
Copy link
Member

@blshkv Hi! You might be aware that you can choose to use your system libraries if you like, see https://nokogiri.org/tutorials/installing_nokogiri.html#install-with-system-libraries for more information.

If you'd like to have a conversation about vendoring libxml2, please open a new issue, your comment here is off-topic and not helpful.

@flavorjones flavorjones modified the milestone: v1.11.0 Nov 21, 2020
larskanis added a commit to larskanis/ffi that referenced this issue Dec 18, 2020
…g gems

Fiddle can crash when used together with ffi and it's builtin libffi.
This happens because fiddle is linked to system libffi, but ffi is linked to builtin libffi.
Depending on which of these gems is loaded first, they link to the wrong runtime library.

This issue is similar to the issue in nokogiri: sparklemotion/nokogiri#1959

Fixes ffi#835
@flavorjones
Copy link
Member

OK, I've determined that hiding these symbols will break nokogumbo at runtime.

Testing with Nokogiri v1.11.1 and this change applied:

commit 86ac42f (HEAD -> 1959-no-export-libxml2-symbols)
Author: Mike Dalessio <mike.dalessio@gmail.com>
Date:   2020-02-01 16:20:15 -0500

    ensure libxml2/libxslt symbols aren't exported
    
    see #1959 for details

diff --git a/ext/nokogiri/extconf.rb b/ext/nokogiri/extconf.rb
index 0b8daae..ea5d06c 100644
--- a/ext/nokogiri/extconf.rb
+++ b/ext/nokogiri/extconf.rb
@@ -576,6 +576,7 @@ def do_clean
 append_cflags("-Wmissing-noreturn") # good to have no matter what Ruby was compiled with
 append_cflags("-Wno-error=unused-command-line-argument-hard-error-in-future") if darwin?
 # append_cflags(["-Wcast-qual", "-Wwrite-strings"]) # these tend to be noisy, but on occasion useful during development
+append_ldflags("-Wl,--exclude-libs,ALL") # Ensure libxml2/libxslt symbols aren't exported, #1959
 
 # Add SDK-specific include path for macOS and brew versions before v2.2.12 (2020-04-08) [#1851, #1801]
 macos_mojave_sdk_include_path = "/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/libxml2"

Nokogumbo will install (compile, link, etc.) but cannot find the relevant symbols at runtime:

$ ruby -rnokogumbo -r open-uri -e "Nokogiri::HTML5.parse(URI.open('https://google.com/'))"
MIKE: nokogiri init: xmlNewDocNode 0x7f48a31e26d0
ruby: symbol lookup error: /home/flavorjones/.rvm/gems/ruby-2.7.2/gems/nokogumbo-2.0.4/lib/nokogumbo/nokogumbo.so: undefined symbol: htmlNewDocNoDtD

As such, we can't hide these symbols today without a plan for Nokogumbo.

Thankfully, such a plan has emerged in #2064 which will result in Nokogumbo being merged into Nokogiri, hopefully in v1.12.x. At that point I'd be happy to revisit this issue.

Thanks for your patience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants