HTML5 documents should not require namespaces in CSS selector queries #2403

Merged 13 commits on Jan 4, 2022

Commits on Jan 3, 2022

  1. 685403a
  2. CSS::Node#to_xpath arguments are now required

    This makes it explicit that visitor must be injected.
    
    Although this is technically a breaking change, CSS::Node is
    essentially an internal API and I would be extremely surprised if it
    was being used directly by anyone.
    
    This commit also ensures that CSS::Node, CSS::Parser, and
    CSS::Tokenizer are omitted from API documentation (to reflect their
    status as an internal-only and unstable API).
    flavorjones committed Jan 3, 2022
    b24efcc
  3. prefactor: allow injection of an XPathVisitor to CSS.xpath_for

    Note that the method signature of internal-only method
    CSS::Parser.xpath_for has changed to prefer required parameters rather
    than an options hash with assumed defaults.
    
    Also note that I've introduced some XPath constants for XPath query
    prefixes, which we should start using.
    flavorjones committed Jan 3, 2022
    4b6bfa9
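
To illustrate the trade-off described in commit 3, here is a minimal sketch of a translation entry point that takes its prefix and visitor as required arguments instead of reading them from an options hash. Everything in it is hypothetical: `SketchXPath`, its two prefix constants, and `JOINING_VISITOR` are stand-ins invented for this example, and the use of required keyword arguments is an assumption rather than the signature adopted in the PR.

```ruby
# Hypothetical sketch only: none of these names are Nokogiri's API.
module SketchXPath
  # Illustrative prefix constants, so callers do not repeat string literals.
  GLOBAL_PREFIX = "//"
  SUBTREE_PREFIX = ".//"

  # Both keywords are required. Omitting either one raises ArgumentError
  # instead of silently falling back to an assumed default.
  def self.xpath_for(selector, prefix:, visitor:)
    visitor.call(prefix, selector)
  end
end

# A stand-in "visitor" that simply joins the prefix and the selector.
JOINING_VISITOR = ->(prefix, selector) { "#{prefix}#{selector}" }
```

With required parameters, a caller that forgets to inject a visitor fails immediately at the call site, which is the behavior the commit message argues for.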
  4. fix: CSS cache includes XPathVisitor configuration

    which prevents incorrect cached results from being returned when
    different visitor configurations are used.
    
    We also deprecate XPathVisitorAlwaysUseBuiltins and
    XPathVisitorOptimallyUseBuiltins in favor of a simple XPathVisitor
    with constructor arguments.
    flavorjones committed Jan 3, 2022
    b3ffffb
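
The caching bug fixed in commit 4 can be sketched as follows. This is not Nokogiri's implementation: `SketchVisitor`, `SketchCache`, and the `:never`/`:always` settings are hypothetical names chosen for illustration, and `translate` is a placeholder for real CSS-to-XPath translation. The point is only that the cache key combines the selector with the visitor's configuration.

```ruby
# Hypothetical sketch only: these classes are not part of Nokogiri.
class SketchVisitor
  attr_reader :config

  def initialize(builtins:)
    @config = { builtins: builtins }.freeze
  end

  # Placeholder translation whose output depends on the configuration.
  def translate(selector)
    if config[:builtins] == :always
      "//*[nokogiri-builtin:local-name-is('#{selector}')]"
    else
      "//#{selector}"
    end
  end
end

class SketchCache
  def initialize
    @store = {}
  end

  # Keying on [selector, visitor.config] keeps results produced under
  # different configurations apart. Keying on the selector alone would
  # return whichever translation happened to be cached first.
  def fetch(selector, visitor)
    @store[[selector, visitor.config]] ||= visitor.translate(selector)
  end
end
```

Two visitors with equal configuration share a cache entry, because Ruby hashes compare by content when used as part of a key.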
  5. prefactor: CSS parser distinguishes attr names from element names

    which will allow us to more easily special-case element names in an
    upcoming commit.
    flavorjones committed Jan 3, 2022
    e005149
  6. introduce enum CSS::XPathVisitor::BuiltinsConfig

    and refactor the XPathVisitor#css_class to not rely on aliasing during
    object initialization, which feels funny to me
    flavorjones committed Jan 3, 2022
    f935e5a
  7. test: CSS integration tests run on XML, HTML4, and HTML5 documents

    we're about to introduce variations in how CSS selectors are
    translated into XPath and I want to ensure we have adequate test coverage.
    flavorjones committed Jan 3, 2022
    32f9ab0
  8. fix: CSS searches in HTML5 documents should not require namespaces

    In HTML5, foreign elements have namespaces; but those namespaces
    should not be considered for the purposes of CSS searches.
    
    Unfortunately, this implementation's use of local-name() is ~10x
    slower than the normal inline element name matching. Subsequent
    commits explore faster alternatives.
    flavorjones committed Jan 3, 2022
    85ac956
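
The difference between the two element-matching strategies in commit 8 can be sketched with a small helper. `element_xpath` and its `ignore_namespace:` flag are hypothetical and exist only for this example. The helper also assumes the element name contains no quote characters, since it does no escaping.

```ruby
# Hypothetical helper, not Nokogiri's implementation.
# Builds a global XPath query for a single element name.
def element_xpath(name, ignore_namespace:)
  if ignore_namespace
    # Matches on the local name only, so an element in a foreign
    # namespace is found without declaring that namespace.
    "//*[local-name()='#{name}']"
  else
    # Inline name test: faster, but does not match namespaced elements
    # unless the query supplies a namespace prefix.
    "//#{name}"
  end
end
```

For example, `element_xpath("svg", ignore_namespace: true)` yields `//*[local-name()='svg']`, the slower form whose cost the commit message describes.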
  9. doc: document XPathVisitor

    Also tweak :nodoc: directives to avoid accidentally excluding
    important modules (like Nokogiri::CSS), and to avoid including code
    that's not usually relevant to the user API (like Nokogiri::XML::PP,
    Nokogiri::CSS::Tokenizer, and
    Nokogiri::HTML4::Document::EncodingReader).
    flavorjones committed Jan 3, 2022
    d030381
  10. doc: document XML::Notation

    and other small changes to doc strings
    flavorjones committed Jan 3, 2022
    42068de
  11. 2096ef9
  12. 0fd4de4
  13. feat: support wildcard namespaces in xpath queries

    This is almost as fast as a standard child-axis search, and much
    faster than the builtin or using local-name():
    
      //span
           18.923k (± 8.9%) i/s -     93.906k in   5.010792s
      //*[local-name()='span']
            1.849k (± 2.8%) i/s -      9.261k in   5.011560s
      //*[nokogiri-builtin:local-name-is('span')]
            3.191k (± 2.4%) i/s -     16.150k in   5.064798s
      //*:span
           18.016k (± 4.6%) i/s -     89.900k in   5.003444s
    
      Comparison:
      //span:
          18922.5 i/s
      //*:span:
          18016.5 i/s - same-ish: difference falls within error
      //*[nokogiri-builtin:local-name-is('span')]:
           3190.6 i/s - 5.93x  (± 0.00) slower
      //*[local-name()='span']:
           1849.4 i/s - 10.23x  (± 0.00) slower
    flavorjones committed Jan 3, 2022
    d1a710e
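
The slowdown factors in the benchmark comparison for commit 13 can be reproduced from the reported iterations-per-second figures. The sketch below copies those four rates from the commit message and divides the `//span` baseline by each. The constant names `RATES`, `BASELINE`, and `SLOWDOWNS` are chosen for this example only.

```ruby
# Iterations per second, copied from the benchmark comparison in commit 13.
RATES = {
  "//span" => 18922.5,
  "//*:span" => 18016.5,
  "//*[nokogiri-builtin:local-name-is('span')]" => 3190.6,
  "//*[local-name()='span']" => 1849.4,
}.freeze

# The plain child-axis search is the reference point.
BASELINE = RATES.fetch("//span")

# Slowdown factor = baseline rate / query rate, rounded to two decimals.
SLOWDOWNS = RATES.transform_values { |rate| (BASELINE / rate).round(2) }

SLOWDOWNS.each do |query, factor|
  puts format("%-46s %5.2fx", query, factor)
end
```

The computed factors for the builtin and `local-name()` queries match the 5.93x and 10.23x figures in the commit message, while the wildcard-namespace query `//*:span` comes out at roughly 1.05x, which the benchmark reports as within error of the baseline.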