merge nokogumbo history #2217

Merged
merged 484 commits into main from 2204-merge-nokogumbo on Apr 8, 2021

Conversation

flavorjones
Member

What problem is this PR intended to solve?

This is one step of many to merge Nokogumbo into Nokogiri (see #2204, "Epic: merge Nokogumbo into Nokogiri").

  • Commit history for Nokogumbo is preserved in the Nokogiri repository
  • Nokogumbo contributors are added to the Nokogiri gemspec, README, and copyright declarations
  • All Nokogumbo files should state that they were originally licensed under Apache 2.0 (an interpretation of Apache 2.0 clause 4.c) and that they have been changed (clause 4.b)

craigbarnes and others added 30 commits August 22, 2018 16:01
To ensure sufficient bit width for the maximum value.
Instead of pre-processing the string to lower case in gumbo_tagn_enum()
before the hash lookup.
- Export luaopen_gumbo_parse on WIN32
- Ignore `__STDC_VERSION__` on WIN32
- Add lib/visualc/include/strings.h for compiling with MSVC
It's not used in the codebase at all and was never in the public API.

This also allows removing the "string_piece.h" header, since all other
functions from "string_piece.c" are declared in "gumbo.h".
It's not used anywhere in this codebase and the indirection just
hinders the compiler optimizer for no good reason.
The one define it contained can live in macros.h.
Instead of having 4 copies of this exact same static data, just use a
single, global declaration.
The PRINTF macro (defined in lib/macros.h) expands to a GCC/Clang
function attribute that allows the compiler to show warnings for
mismatched printf specifiers.
rubys and others added 25 commits November 27, 2020 21:43
fix: support mageia when system libraries are installed
also:
- Appveyor:
  - move to Visual Studio 2019 for Ruby 2.7 and 3.0
  - remove Ruby 2.4 and earlier
- Travis: drop entirely
ci: add Ruby 3 to GitHub Actions and Appveyor
which are present starting in Nokogiri v1.11.0.rc4.

See #2145 for more information.

With this change, here's what compilation looks like when Nokogiri is
built with libxml2:

> /home/flavorjones/.rvm/rubies/ruby-2.7.2/bin/ruby -I. ../../../../ext/nokogumbo/extconf.rb
> checking for whether -I/home/flavorjones/.rvm/gems/ruby-2.7.2/gems/nokogiri-1.11.0.rc3/ext/nokogiri is accepted as CFLAGS... yes
> checking for whether -I/home/flavorjones/.rvm/gems/ruby-2.7.2/gems/nokogiri-1.11.0.rc3/ext/nokogiri/include is accepted as CFLAGS... yes
> checking for whether -I/home/flavorjones/.rvm/gems/ruby-2.7.2/gems/nokogiri-1.11.0.rc3/ext/nokogiri/include/libxml2 is accepted as CFLAGS... yes
> checking for libxml/tree.h... yes
> checking for nokogiri.h... yes
> creating Makefile

and here's what compilation looks like when Nokogiri is _not_ built with
libxml2:

> checking for whether -I/home/flavorjones/.rvm/gems/ruby-2.7.2/gems/nokogiri-1.11.0.rc3/ext/nokogiri is accepted as CFLAGS... yes
> checking for libxml/tree.h... no
> checking for nokogiri.h... no
> checking for xmlNewDoc() in -lxml2... yes
> checking for nokogiri.h in /home/flavorjones/.rvm/gems/ruby-2.7.2/gems/nokogiri-1.11.0.rc3/ext/nokogiri... yes
> creating Makefile

In a future update, once we've pinned the Nokogiri dependency to "~>
1.11", we should be able to remove the stanza that looks at
`libxml2_path`.
This is necessary on Windows where unresolved symbols aren't
allowed. We also limit compatibility with Nokogiri's precompiled libraries
to Nokogiri >= 1.11.2 on Windows for this reason.

Related to:
- #2145
- #2167
- #2202
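
The Windows version limit described above could be expressed roughly as follows. This is a minimal sketch, not the actual extconf.rb code: the helper name `precompiled_nokogiri_usable?` and its `windows:` keyword are hypothetical, and treating every non-Windows platform as unrestricted is a simplifying assumption.

```ruby
require "rubygems"

# Hypothetical helper, for illustration only. It models just the Windows rule
# from the commit message: precompiled Nokogiri libraries are considered
# usable on Windows only for Nokogiri >= 1.11.2.
WINDOWS_MINIMUM = Gem::Requirement.new(">= 1.11.2")

def precompiled_nokogiri_usable?(nokogiri_version, windows:)
  # Assumption: no extra version limit is modeled for other platforms.
  return true unless windows

  WINDOWS_MINIMUM.satisfied_by?(Gem::Version.new(nokogiri_version))
end
```

Release candidates such as `1.11.0.rc4` sort below `1.11.2` under `Gem::Version` ordering, so on Windows they would be rejected by this check.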
…precompiled-native-nokogiri-gems

update extconf.rb to use Nokogiri's CPPFLAGS
Closes #170

A future version of Nokogiri will provide Nokogumbo's API (see
#2204). This change
will allow Nokogumbo to detect whether Nokogiri provides the HTML5 API
and whether to use Nokogiri's implementation or Nokogumbo's
implementation.

Some contractual assumptions I'm making about Nokogiri:

- Nokogiri will faithfully reproduce the `::Nokogiri::HTML5` singleton
  method, module, and namespace (including classes
  `Nokogiri::HTML5::Node`, `Nokogiri::HTML5::Document`, and
  `Nokogiri::HTML5::DocumentFragment`)

- Nokogiri will not provide a `::Nokogumbo` module/namespace, but will
  provide a similar `::Nokogiri::Gumbo` module which will provide the
  same public API as `::Nokogumbo`.

This change checks for the existence of `Nokogiri::HTML5`,
`Nokogiri::Gumbo`, and an expected singleton method on each. We could
do a more- or less-thorough check here.

This change also provides an "escape hatch" using an environment
variable `NOKOGUMBO_IGNORE_NOKOGIRI_HTML5` which can be set to force
Nokogumbo to use its own implementation. This escape hatch might be
unnecessary, but this change is invasive enough to make me want to be
cautious.

Nokogumbo will emit a single warning message at `require`-time when it
uses Nokogiri's implementation. This message points users to
#2205 which will
explain what's going on and help people migrate their
applications (but is an empty placeholder right now).
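
The detection described above might look something like the sketch below. It is an illustration under stated assumptions, not Nokogumbo's actual code: the `HTML5Detection` module, the `root:` and `env:` parameters, and the choice of `:parse` as the "expected singleton method" are all hypothetical. It also treats the mere presence of `NOKOGUMBO_IGNORE_NOKOGIRI_HTML5` as enabling the escape hatch, which the commit message does not specify.

```ruby
# Sketch only: names other than Nokogiri::HTML5, Nokogiri::Gumbo, and the
# environment variable are assumptions made for illustration.
module HTML5Detection
  ESCAPE_HATCH = "NOKOGUMBO_IGNORE_NOKOGIRI_HTML5"

  # Returns true when `root` exposes Nokogiri::HTML5 and Nokogiri::Gumbo,
  # each responding to `expected_method`, and the escape hatch is not set.
  def self.use_nokogiri?(root: Object, env: ENV, expected_method: :parse)
    # Assumption: any value for the variable forces the local implementation.
    return false if env.key?(ESCAPE_HATCH)
    return false unless root.const_defined?(:Nokogiri, false)

    nokogiri = root.const_get(:Nokogiri, false)
    [:HTML5, :Gumbo].all? do |name|
      nokogiri.const_defined?(name, false) &&
        nokogiri.const_get(name, false).respond_to?(expected_method)
    end
  end
end
```

Passing `root:` and `env:` explicitly keeps the check testable with stand-in modules and a plain Hash. A caller could emit the one-time `require`-time warning with `warn` when the method returns true.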
…y-defined

feat: Nokogumbo detects Nokogiri's HTML5 API
in which case, then we verify that we can resolve libxml symbols.

Related to e0db2f7 which checked symbol resolution on both Linux
and Windows; but it fails (and is unnecessary) on Linux, leading to
seeing this at installation:

> checking for xmlNewDoc() in libxml/tree.h... no
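
As a rough illustration of the rule in this fix (header check only, unless Nokogiri provides LDFLAGS), the choice could be sketched as below. The function `libxml2_check_plan` and its return shape are hypothetical; the sketch returns the name of the mkmf check to run rather than calling mkmf, so it can be exercised without a compiler.

```ruby
# Hypothetical planner, for illustration only. It does not call mkmf; it only
# reports which mkmf check (have_header or have_func) the rule would select.
def libxml2_check_plan(nokogiri_ldflags)
  if nokogiri_ldflags.to_s.strip.empty?
    # No linker flags from Nokogiri: only confirm the header is present.
    { check: :have_header, args: ["libxml/tree.h"] }
  else
    # Linker flags available: also confirm a libxml2 symbol resolves.
    { check: :have_func, args: ["xmlNewDoc", "libxml/tree.h"] }
  end
end
```

In an extconf.rb that has already done `require "mkmf"`, the plan could be applied with `send(plan[:check], *plan[:args])`.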
…tection

fix: only check for the header unless Nokogiri provides LDFLAGS
and remove the Travis badge
- nokogiri.gemspec
- README.md
- LICENSE.md

@codeclimate codeclimate bot left a comment

The PR diff size of 50675 lines exceeds the maximum allowed for the inline comments feature.

@codeclimate

codeclimate bot commented Apr 7, 2021

Code Climate has analyzed commit c1a3d67 and detected 18 issues on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 18

The test coverage on the diff in this pull request is 100.0% (80% is the threshold).

This pull request will bring the total coverage in the repository to 94.0% (0.0% change).

View more on Code Climate.

@flavorjones flavorjones merged commit 8d96a4a into main Apr 8, 2021
@flavorjones flavorjones deleted the 2204-merge-nokogumbo branch April 8, 2021 20:30