line numbers not working as expected #1493

Closed
flavorjones opened this issue Jun 20, 2016 · 3 comments

@flavorjones
Member

(reported on nokogiri-talk by @justinthec)

I haven't had time to look into this, but wanted to capture it so that it wouldn't get forgotten.


Hello,

Preface/libxml2

I've come across a bug that popped up once before in libxml2 back in 2008 where XML_TEXT_NODEs would always return 0 as their line number.

Link: https://mail.gnome.org/archives/xml/2008-July/msg00017.html

It was however patched that day and should have been fixed.

Commit: https://git.gnome.org/browse/libxml2/commit/?id=45efd0878aa56efa4291eec2ecdcff66d2f52fc3

The three lines added by that commit still exist in the current version of libxml2, now guarded by a conditional check on the (ctxt->linenumbers) flag.

Current Day libxml2: https://git.gnome.org/browse/libxml2/tree/SAX2.c?id=4472c3a5a5b516aaf59b89be602fbce52756c3e9#n1905

Nokogiri

This bug is currently present in Nokogiri 1.6.8.

I've been looking through the source in an attempt to find a fix and I'd appreciate some guidance:

Possibly relevant files:

static VALUE line(VALUE self)

static VALUE line(VALUE self)

Any ideas?

Thanks in advance,

Justin

@justinthec

Thanks Mike, if you have any ideas but don't have time to try them out I'd be happy to take it on.

The farthest I've gotten so far is the call to xmlGetLineNo(), but I can't find the source for that function anywhere; it most likely lives in whichever version of libxml you guys are compiling against.

@Xliff

Xliff commented Sep 25, 2017

I have been working on a project that uses libxml2, so in my searches I found this issue. Since this is still marked as open, I hope this is still relevant.

It looks like xmlGetLineNo just returns the result of xmlGetLineNoInternal, which you can find here:
https://github.com/GNOME/libxml2/blob/master/tree.c#L4593

flavorjones added a commit that referenced this issue Aug 14, 2021
- set BIG_LINES parse option by default which will allow Node#line to return large integers
- allow Node#line= to set large line numbers on text nodes

Fixes #1764, #1493, #1617, #1505, #1003, #533
flavorjones added a commit that referenced this issue Aug 14, 2021
feat(cruby): support line numbers larger than a short

---

**What problem is this PR intended to solve?**

As noted in #1493, #1617, #1505, #1003, and #533, libxml2 has not historically supported line numbers larger than will fit in a `short int`. Starting in libxml2 v2.9.0, setting the parse option `BIG_LINES` allows line numbers to be tracked in longer documents.

Specifically this PR makes the following changes:

- set `BIG_LINES` parse option by default which will allow `Node#line` to return large integers
- allow `Node#line=` to set large line numbers on text nodes
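
To make the effect of that limit concrete, here is a toy model in plain Ruby. It is only a sketch: `SHORT_LINE_MAX` and `reported_line` are hypothetical names, not part of Nokogiri or libxml2, and it assumes the line field is an unsigned 16-bit value that saturates at 65,535 rather than wrapping.

```ruby
# Toy model only. These names are hypothetical and are NOT part of
# Nokogiri or libxml2; they just illustrate the 16-bit limit.
SHORT_LINE_MAX = 65_535 # assumed: largest value an unsigned 16-bit field holds

# Returns the line number a node would report for a given true line.
# Without big lines, the value is assumed to saturate at the 16-bit maximum.
def reported_line(true_line, big_lines:)
  return true_line if big_lines

  [true_line, SHORT_LINE_MAX].min
end

[10, 65_535, 70_000].each do |line|
  short = reported_line(line, big_lines: false)
  big = reported_line(line, big_lines: true)
  puts "true line #{line}: without big lines=#{short}, with big lines=#{big}"
end
```

Under this model, every node past line 65,535 reports the same line number unless big lines are enabled, which is the symptom the linked issues describe for long documents.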

Fixes #1764 

**Have you included adequate test coverage?**

Yes!

**Does this change affect the behavior of either the C or the Java implementations?**

JRuby's Xerces-based implementation did not suffer from this particular shortcoming, although its line number functionality is questionable in other ways (see #2177 / b32c875).
@flavorjones
Member Author

This will be fixed in v1.13.
