-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Normalize line breaks according to spec #307
Conversation
4801a70
to
02de7af
Compare
02de7af
to
d2fb058
Compare
> XML parsed entities are often stored in computer files which, for editing convenience, are organized into lines. These lines are typically separated by some combination of the characters CARRIAGE RETURN (#xD) and LINE FEED (#xA). > > To simplify the tasks of applications, the XML processor must behave as if it normalized all line breaks in external parsed entities (including the document entity) on input, before parsing, by translating both the two-character sequence #xD #xA and any #xD that is not followed by #xA to a single #xA character. Where `#xD` == `\r` and `#xA` == `\n`, so ` \r\n ` => ` \n ` ` \n\r ` => ` \n\n ` ` \n ` => ` \n ` ` \r ` => ` \n ` BREAKING CHANGE: Certain combination of line break characters are normalized before parsing takes place and will no longer be preserved. For details see https://www.w3.org/TR/xml/#sec-line-ends fixes #303
@brodybits I would like to land this one next, to avoid rebasing #291 over and over. |
'#x20#xa#xa', | ||
], | ||
])( | ||
'should normalize "\\r" not followed by "\\n" %s', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor nit: should this be tested in attribute as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can not easily, since the the algorithm in the other PR would replace it with a space.
But since we are replacing all occurrences in one go before parsing, I don't think it matters to much.
@@ -29,3 +29,57 @@ describe('errorHandle', () => { | |||
expect({ actual, ...errors }).toMatchSnapshot() | |||
}) | |||
}) | |||
|
|||
describe('whitespace', () => { | |||
const whitespaceToHex = (str) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: could this be a more general-purpose reusable function?
Or am I violating this by suggesting a too-hasty abstraction: https://kentcdodds.com/blog/aha-programming
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also thinking about it. I think as soon as it is needed elsewhere we will find a proper place for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks!
I think it would be nice to take care of the minor nit, don't think it needs to be blocking though.
according to XML1.1 spec. Fixes #49 Since #307 only implemented XML 1.0 spec https://www.w3.org/TR/xml/#sec-line-ends https://www.w3.org/TR/xml11/#sec-line-ends
according to XML1.1 spec. Fixes #49 Since #307 only implemented XML 1.0 spec https://www.w3.org/TR/xml/#sec-line-ends https://www.w3.org/TR/xml11/#sec-line-ends
according to XML1.1 spec. Fixes #49 Since #307 only implemented XML 1.0 spec https://www.w3.org/TR/xml/#sec-line-ends https://www.w3.org/TR/xml11/#sec-line-ends
For [0]. Any usage of @xmldom/xmldom >= 0.8.0 will normalize these. See [1] and [2]. The current xml-encryption (2.0.0) does not do this normalization, but will in 2.0.1 [3]. It's technically within the path of xmlenc.decrypt() [4], but this follows how assertions have been handled (not handling non-normalized whitespace). [0] https://github.com/Clever/saml2/blob/6da3e9c39c326a2f6793bb87c6d12c9ab4446585/lib/saml2.coffee#L242-L245 [1] xmldom/xmldom#307 [2] xmldom/xmldom#314 [3] auth0/node-xml-encryption#101 [4] https://github.com/auth0/node-xml-encryption/blob/291f3f10d5d1d571a3b6da2d411aa323398f5650/lib/xmlenc.js#L185
For [0]. Any usage of @xmldom/xmldom >= 0.8.0 will normalize these, see [1] and [2]. The current xml-encryption (2.0.0) does not do this normalization, but will in 2.0.1 [3]. It's technically within the path of xmlenc.decrypt() [4], but this follows how assertions have been handled (not handling non-normalized whitespace). For xml-crypto, this was changed in 3.0.0 with [5]. [0] https://github.com/Clever/saml2/blob/6da3e9c39c326a2f6793bb87c6d12c9ab4446585/lib/saml2.coffee#L242-L245 [1] xmldom/xmldom#307 [2] xmldom/xmldom#314 [3] auth0/node-xml-encryption#101 [4] https://github.com/auth0/node-xml-encryption/blob/291f3f10d5d1d571a3b6da2d411aa323398f5650/lib/xmlenc.js#L185 [5] node-saml/xml-crypto#261
For [0]. Any usage of @xmldom/xmldom >= 0.8.0 will normalize these, see [1] and [2]. The current xml-encryption (2.0.0) does not do this normalization, but will in 2.0.1 [3]. It's technically within the path of xmlenc.decrypt() [4], but this follows how assertions have been handled (not handling non-normalized whitespace). For xml-crypto, this was changed in 3.0.0 with [5]. [0] https://github.com/Clever/saml2/blob/6da3e9c39c326a2f6793bb87c6d12c9ab4446585/lib/saml2.coffee#L242-L245 [1] xmldom/xmldom#307 [2] xmldom/xmldom#314 [3] auth0/node-xml-encryption#101 [4] https://github.com/auth0/node-xml-encryption/blob/291f3f10d5d1d571a3b6da2d411aa323398f5650/lib/xmlenc.js#L185 [5] node-saml/xml-crypto#261
For [0]. Any usage of @xmldom/xmldom >= 0.8.0 will normalize these, see [1] and [2]. The current xml-encryption (2.0.0) does not do this normalization, but will in 2.0.1 [3]. It's technically within the path of xmlenc.decrypt() [4], but this follows how assertions have been handled (not handling non-normalized whitespace). For xml-crypto, this was changed in 3.0.0 with [5]. [0] https://github.com/Clever/saml2/blob/6da3e9c39c326a2f6793bb87c6d12c9ab4446585/lib/saml2.coffee#L242-L245 [1] xmldom/xmldom#307 [2] xmldom/xmldom#314 [3] auth0/node-xml-encryption#101 [4] https://github.com/auth0/node-xml-encryption/blob/291f3f10d5d1d571a3b6da2d411aa323398f5650/lib/xmlenc.js#L185 [5] node-saml/xml-crypto#261
add semgrep support uodate xmldom breaking changes in xmldom: xmldom/xmldom#310 xmldom/xmldom#314 xmldom/xmldom#307
add semgrep support uodate xmldom breaking changes in xmldom: xmldom/xmldom#310 xmldom/xmldom#314 xmldom/xmldom#307
Where
#xD
==\r
and#xA
==\n
, so\r\n
=>\n
\n\r
=>\n\n
\n
=>\n
\r
=>\n
BREAKING CHANGE: Certain combination of line break characters are normalized before parsing takes place and will no longer be preserved. For details see https://www.w3.org/TR/xml/#sec-line-ends
fixes #303