Possible bad parsing of encoded angle brackets #325

Closed
thisgeek opened this issue Jan 7, 2016 · 6 comments

thisgeek commented Jan 7, 2016

This text in Angular's $sce documentation (version 1.4.7):

Note: When enabled (the default), IE<11 in quirks mode is not supported. In this mode, IE<11 allow one to execute arbitrary javascript by the use of the expression() syntax. Refer http://blogs.msdn.com/b/ie/archive/2008/10/16/ending-expressions.aspx to learn more about them.

Does not match this text in the devdocs $sce documentation:

Note: When enabled (the default), IE to learn more about them.

My guess is that the web scraper misinterpreted the meaning of the encoded open angle bracket in "IE<11" and what looks like a bogus tag in the source HTML. Here's what angular's HTML looks like:

<p>Note:  When enabled (the default), IE&lt;11 in quirks mode is not supported.  In this mode, IE&lt;11 allow
one to execute arbitrary javascript by the use of the expression() syntax.  Refer
<http: blogs.msdn.com="" b="" ie="" archive="" 2008="" 10="" 16="" ending-expressions.aspx=""> to learn more about them.
You can ensure your document is in standards mode and not quirks mode by adding <code><span class="dec">&lt;!doctype html&gt;</span></code>
to the top of your HTML document.</http:></p>

The "bogus" tag I mentioned is <http: blogs.msdn.com="" b="" ie="" archive="" 2008="" 10="" 16="" ending-expressions.aspx="">. It looks like a mangled link. (See the source.)

Here is what the HTML from the currently deployed dev docs looks like:

<p>Note: When enabled (the default), IE to learn more about them. You can ensure your document is in standards mode and not quirks mode by adding <code>&lt;!doctype html&gt;</code> to the top of your HTML document.</p>

I bet the scraper probably should not interpret &lt; as the start of an HTML tag. Maybe it's an upstream bug.

Edit: fixed incorrect documentation links.
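To illustrate the expectation in the comment above, here is a minimal sketch using Python's standard-library `html.parser` (not Nokogiri or libxml2, which DevDocs actually uses). The `TextCollector` class and `visible_text` helper are hypothetical names introduced for this example. It shows that a parser given properly escaped input treats `&lt;` as a character reference and keeps the surrounding text, rather than opening a tag.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the text content of a fragment, ignoring the tags themselves."""

    def __init__(self):
        # convert_charrefs=True decodes references such as &lt; into characters
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(fragment):
    """Return the text a reader would see for the given HTML fragment."""
    parser = TextCollector()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.chunks)


escaped = "<p>IE&lt;11 in quirks mode is not supported.</p>"
print(visible_text(escaped))  # IE<11 in quirks mode is not supported.
```

The escaped form survives parsing intact, which suggests the truncated DevDocs output comes from markup that was never escaped in the first place rather than from `&lt;` being misread.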

Author

thisgeek commented Jan 8, 2016

I looked over some of the open Nokogiri issues. Nothing jumped out at me, but I am new to the library. For posterity, it looked to me like sparklemotion/nokogiri/issues/1406 and sparklemotion/nokogiri/issues/1294 might be related.

You, dear reader, would find a place in my heart for helping make a test case that shows whether Nokogiri has a bug that would explain the behavior reported above. I will hazard a guess that the version used to produce the currently deployed devdocs.io is 1.6.7.rc3 according to the latest Gemfile.

Member

Thibaut commented Jan 9, 2016

> Here's what angular's HTML looks like:

Actually, Angular's HTML is at fault; the < characters are not escaped:
https://docs.angularjs.org/partials/api/ng/service/$sce.html

Browsers handle/repair this correctly but Nokogiri/libxml2 doesn't.

Would you mind sending a PR to Angular to escape the two IE<11 please?

@thisgeek
Author

@Thibaut You're correct. I must've gotten my samples mixed up.

I'll open a fix in angular. Thank you.

Member

Thibaut commented Jan 18, 2016

Closing as this isn't an issue with DevDocs.

Thibaut closed this as completed Jan 18, 2016
@thisgeek
Author

For posterity:

I traced the source of the issue to a probable bug in the marked library, on which angular's documentation build task depends to parse markdown. In my view, the fault is not due to a failure to escape the angle brackets manually in the source documentation. The markdown dingus, for example, when given the same input escapes the angle brackets as I expect. I'll look into writing a fix to marked.
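As a rough sketch of the behavior expected from a markdown renderer here, the snippet below escapes a `<` that cannot begin markup while leaving real tags alone. This is not the marked patch and not marked's API. The `escape_bare_lt` helper and its rule (a `<` starts markup only when followed by a letter, `/`, `!`, or `?`) are assumptions made for illustration, and the rule would still leave something like `<http://...>` untouched.

```python
import re

# Assumed rule for this sketch: "<" begins markup only when followed by
# an ASCII letter, "/", "!" or "?". Any other "<" is treated as plain text.
_BARE_LT = re.compile(r"<(?![A-Za-z/!?])")


def escape_bare_lt(text):
    """Replace each bare '<' with '&lt;' and leave tag-like '<' alone."""
    return _BARE_LT.sub("&lt;", text)


sample = "IE<11 in quirks mode; add <code>doctype</code>"
print(escape_bare_lt(sample))
# IE&lt;11 in quirks mode; add <code>doctype</code>
```

With output like this, the generated HTML would contain `IE&lt;11`, and a strict parser downstream would have nothing to misread.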

@thisgeek
Author

Patched markedjs/marked#814
