Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HtmlParser incorrectly reports end tags (@angular/compiler) #36118

Closed
gregjacobs opened this issue Mar 18, 2020 · 3 comments
Closed

HtmlParser incorrectly reports end tags (@angular/compiler) #36118

gregjacobs opened this issue Mar 18, 2020 · 3 comments

Comments

@gregjacobs
Copy link

🐞 bug report

Affected Package

The issue is caused by package: @angular/compiler

Is this a regression?

Not sure

Description

The Angular HTML parser incorrectly reports end tags for elements.

I'm not sure what this could affect internally (other than incorrect reporting of html source code locations for users), but seems like something that should be fixed to make sure downstream functionality works correctly.

It's also affecting the codemod tooling that I'm building on top of Angular's HTML parser (which is how I discovered the issue).

🔬 Minimal Reproduction

import { HtmlParser } from '@angular/compiler';

const parser = new HtmlParser();
const ast = parser.parse(`<div><input></div>`, 'tmp.html');

console.log(ast);

StackBlitz: https://stackblitz.com/edit/angular-vfni6r?file=src%2Fapp%2Fapp.component.ts

The AST shows the <input> element having an end tag of </div> (via its endSourceSpan), and the <div> element has an endSourceSpan of null (signifying that it has no end tag)

🌍 Your Environment

Angular Version: StackBlitz / Angular v9.0.6

@pkozlowski-opensource pkozlowski-opensource added the area: compiler Issues related to `ngc`, Angular's template compiler label Mar 18, 2020
@ngbot ngbot bot added this to the needsTriage milestone Mar 18, 2020
@petebacondarwin
Copy link
Member

Tracking at FW-2004

@ngbot ngbot bot modified the milestones: needsTriage, Backlog Mar 18, 2020
JoostK added a commit to JoostK/angular that referenced this issue Jul 17, 2020
…elements

HTML is very lenient when it comes to closing elements, so Angular's parser has
rules that specify which elements are implicitly closed when closing a tag.
The parser keeps track of the nesting of tag names using a stack and parsing
a closing tag will pop as many elements off the stack as possible, provided
that the elements can be implicitly closed.

For example, consider the following templates:

- `<div><br></div>`, the `<br>` is implicitly closed when parsing `</div>`,
  because `<br>` is a void element.
- `<div><p></div>`, the `<p>` is implicitly closed when parsing `</div>`,
  as `<p>` is allowed to be closed by the closing of its parent element.
- `<ul><li>A <li>B</ul>`, the first `<li>` is implicitly closed when parsing
  the second `<li>`, whereas the second `<li>` would be implicitly closed when
  parsing the `</ul>`.

In all the cases above the parsed structure would be correct, however the source
span of the closing `</div>` would incorrectly be assigned to the element that
is implicitly closed. The problem was that closing an element would associate
the source span with the element at the top of the stack, however this may not
be the element that is actually being closed if some elements would be
implicitly closed.

This commit fixes the issue by assigning the end source span with the element
on the stack that is actually being closed. Any implicitly closed elements that
are popped off the stack will not be assigned an end source span, as the
implicit closing implies that no ending element is present.

Note that there is a difference between self-closed elements such as `<input/>`
and implicitly closed elements such as `<input>`. The former does have an end
source span (identical to its start source span) whereas the latter does not.

Fixes angular#36118
Resolves FW-2004
AndrewKushnir pushed a commit that referenced this issue Jul 20, 2020
…elements (#38126)

HTML is very lenient when it comes to closing elements, so Angular's parser has
rules that specify which elements are implicitly closed when closing a tag.
The parser keeps track of the nesting of tag names using a stack and parsing
a closing tag will pop as many elements off the stack as possible, provided
that the elements can be implicitly closed.

For example, consider the following templates:

- `<div><br></div>`, the `<br>` is implicitly closed when parsing `</div>`,
  because `<br>` is a void element.
- `<div><p></div>`, the `<p>` is implicitly closed when parsing `</div>`,
  as `<p>` is allowed to be closed by the closing of its parent element.
- `<ul><li>A <li>B</ul>`, the first `<li>` is implicitly closed when parsing
  the second `<li>`, whereas the second `<li>` would be implicitly closed when
  parsing the `</ul>`.

In all the cases above the parsed structure would be correct, however the source
span of the closing `</div>` would incorrectly be assigned to the element that
is implicitly closed. The problem was that closing an element would associate
the source span with the element at the top of the stack, however this may not
be the element that is actually being closed if some elements would be
implicitly closed.

This commit fixes the issue by assigning the end source span with the element
on the stack that is actually being closed. Any implicitly closed elements that
are popped off the stack will not be assigned an end source span, as the
implicit closing implies that no ending element is present.

Note that there is a difference between self-closed elements such as `<input/>`
and implicitly closed elements such as `<input>`. The former does have an end
source span (identical to its start source span) whereas the latter does not.

Fixes #36118
Resolves FW-2004

PR Close #38126
@gregjacobs
Copy link
Author

Thanks @AndrewKushnir, good fix!

@angular-automatic-lock-bot
Copy link

This issue has been automatically locked due to inactivity.
Please file a new issue if you are encountering a similar or related problem.

Read more about our automatic conversation locking policy.

This action has been performed automatically by a bot.

@angular-automatic-lock-bot angular-automatic-lock-bot bot locked and limited conversation to collaborators Aug 21, 2020
profanis pushed a commit to profanis/angular that referenced this issue Sep 5, 2020
…elements (angular#38126)

HTML is very lenient when it comes to closing elements, so Angular's parser has
rules that specify which elements are implicitly closed when closing a tag.
The parser keeps track of the nesting of tag names using a stack and parsing
a closing tag will pop as many elements off the stack as possible, provided
that the elements can be implicitly closed.

For example, consider the following templates:

- `<div><br></div>`, the `<br>` is implicitly closed when parsing `</div>`,
  because `<br>` is a void element.
- `<div><p></div>`, the `<p>` is implicitly closed when parsing `</div>`,
  as `<p>` is allowed to be closed by the closing of its parent element.
- `<ul><li>A <li>B</ul>`, the first `<li>` is implicitly closed when parsing
  the second `<li>`, whereas the second `<li>` would be implicitly closed when
  parsing the `</ul>`.

In all the cases above the parsed structure would be correct, however the source
span of the closing `</div>` would incorrectly be assigned to the element that
is implicitly closed. The problem was that closing an element would associate
the source span with the element at the top of the stack, however this may not
be the element that is actually being closed if some elements would be
implicitly closed.

This commit fixes the issue by assigning the end source span with the element
on the stack that is actually being closed. Any implicitly closed elements that
are popped off the stack will not be assigned an end source span, as the
implicit closing implies that no ending element is present.

Note that there is a difference between self-closed elements such as `<input/>`
and implicitly closed elements such as `<input>`. The former does have an end
source span (identical to its start source span) whereas the latter does not.

Fixes angular#36118
Resolves FW-2004

PR Close angular#38126
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants