Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not all valid attribute names can be parsed #252

Open
karfau opened this issue Jun 18, 2021 · 7 comments
Open

Not all valid attribute names can be parsed #252

karfau opened this issue Jun 18, 2021 · 7 comments
Labels
bug Something isn't working needs investigation Information is missing and needs to be researched spec:HTML spec:Namespaces in XML https://www.w3.org/TR/REC-xml-names/ and https://www.w3.org/TR/xml-names11/ spec:XML https://www.w3.org/TR/xml11/
Milestone

Comments

@karfau
Copy link
Member

karfau commented Jun 18, 2021

This is a continuation of jindw/xmldom#267 which already contains an elaborate discussion.

Two examples that are used in vue, are attributes starting with : or @.

According to the XML 1.1 spec https://www.w3.org/TR/xml11/#sec-common-syn:

The Namespaces in XML Recommendation [XML Names] assigns a meaning to names containing colon characters. Therefore, authors should not use the colon in XML names except for namespace purposes, but XML processors must accept the colon as a name character.

But according to the latest HTML spec, in which namespaces are not something to consider, the definition of what is allowed in attribute names is different: https://html.spec.whatwg.org/multipage/syntax.html#attributes-2

@karfau karfau added this to the after next release milestone Jun 18, 2021
@karfau karfau added bug Something isn't working spec:XML https://www.w3.org/TR/xml11/ spec:Namespaces in XML https://www.w3.org/TR/REC-xml-names/ and https://www.w3.org/TR/xml-names11/ labels Jun 18, 2021
@karfau karfau changed the title Not all valid HTML attribute names can be parsed Not all valid attribute names can be parsed Aug 21, 2021
@karfau karfau added the needs investigation Information is missing and needs to be researched label Aug 21, 2021
@karfau
Copy link
Member Author

karfau commented Aug 21, 2021

We will need to understand how this would impact the current parsing of attributes with a namespace prefix. Nothing should change when parsing XML

We also need to decide whether there should be a different behavior for XML vs HTML.
According to the next comment, there needs to be different behavior, so this is blocked by the upcoming solution for #203 which is why this issue is not considered a "next release priority" before 203 has been resolved and released.

Contributions to resolve this are of course welcome, either by discussing details here or (after the resolution of 203!) by providing a PR including tests.

@marrus-sh
Copy link

So, to add additional context to this, the mentioned Namespaces in XML Recommendation introduces a new concept known as “namespace-well-formed” which requires element and attribute names to match the QName production, and other names to match the NCName production, from that specification XML Names. QName notably forbids leading colons in attribute names, so while :foo is a well‐formed XML Name, it does not produce namespace‐well‐formed documents.

Because XML parsing for the DOM is namespace‐aware, documents need to be namespace‐well‐formed in addition to just being ordinarily well‐formed. So it should produce an error when parsing as XML.

HTML’s rules have been made considerably more lax in order to support data-* attributes and HTML custom elements. Importantly, it is entirely possible to write an HTML custom element with a :foo attribute. So ideally you would have different behaviour here.

@karfau
Copy link
Member Author

karfau commented Aug 23, 2021

@marrus-sh Thank you, that was very helpful. I updated my last comment accordingly.

@Kevin44
Copy link

Kevin44 commented Aug 30, 2021

It also appears that there is a problem with the attribute "constructor" also. Here is an example element with that attribute that it is failing on:

<java:new doc:name="Get date instance" doc:id="f5784e1d-4a0d-428f-9833-9f33694bbf10" class="java.util.Date" constructor="Date()" target="dateVar"/>

@marrus-sh
Copy link

This appears to be because the attributes are stored on an ordinary empty object {}, and attribute presence is checked using the in operator, which walks the prototype chain. "constructor" in {} returns true, because constructor is a property on Object.prototype. (A similar problem should exists with any other property defined on Object.prototype, like toString or valueOf.)

To fix this, one of the following approaches can be taken:

  • The attributes object can be created without a prototype (Object.create(null))
  • Attributes can be stored in a Map instead of an Object
  • Object.prototype.hasOwnProperty() can be used instead of in to check for property presence

@karfau
Copy link
Member Author

karfau commented Aug 31, 2021

Interesting finding and thx a lot for tge investigation.
The check using in was added very recently, so I would prefer to switch to hasOwnProperty.

This also means the issue about constructor is different from the one in the description of this issue, so we should split it out.
PRs welcome

dsimpsonOMF added a commit to dsimpsonOMF/xmldom that referenced this issue Sep 1, 2021
As noted in issue xmldom#252 some attributes conflict with prototype fields (such as an attribute `constructor`) thus we must remove the `in` operator and replace with `Object.prototype.hasOwnProperty()`.
@karfau
Copy link
Member Author

karfau commented Sep 2, 2021

FYI: The issue regarding attribute names conflicting with the prototype chain was solved and released as 0.7.4

@karfau karfau modified the milestone: before 1.0.0 Dec 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs investigation Information is missing and needs to be researched spec:HTML spec:Namespaces in XML https://www.w3.org/TR/REC-xml-names/ and https://www.w3.org/TR/xml-names11/ spec:XML https://www.w3.org/TR/xml11/
Projects
Status: Has Priority
Development

No branches or pull requests

3 participants