xmldom is not normalizing line endings before parsing #303

Closed
karfau opened this issue Aug 26, 2021 · 0 comments · Fixed by #307
Labels
breaking change: Something that requires a version bump due to breaking changes
bug: Something isn't working
spec:XML: https://www.w3.org/TR/xml11/
Milestone

Comments

Member

karfau commented Aug 26, 2021

Specs: https://www.w3.org/TR/xml/#sec-line-ends

2.11 End-of-Line Handling
XML parsed entities are often stored in computer files which, for editing convenience, are organized into lines. These lines are typically separated by some combination of the characters CARRIAGE RETURN (#xD) and LINE FEED (#xA).

To simplify the tasks of applications, the XML processor must behave as if it normalized all line breaks in external parsed entities (including the document entity) on input, before parsing, by translating both the two-character sequence #xD #xA and any #xD that is not followed by #xA to a single #xA character.
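
The rule quoted above can be sketched as a single replacement pass over the input string before it reaches the parser. This is only a minimal illustration: `normalizeLineEndings` is a hypothetical helper name, not xmldom's actual implementation, and it covers only the XML 1.0 rule (`#xD #xA` and lone `#xD`), not the additional line-end characters defined by XML 1.1.

```javascript
// Hypothetical helper (assumption, not xmldom's real code):
// applies XML 1.0 section 2.11 end-of-line normalization to a string.
function normalizeLineEndings(input) {
  // `\r\n?` matches a carriage return optionally followed by a line feed,
  // so both the pair `\r\n` and a lone `\r` become one `\n`.
  // A `\n` that is not preceded by `\r` is left untouched.
  return input.replace(/\r\n?/g, '\n');
}

// JSON.stringify makes the control characters visible in the output.
console.log(JSON.stringify(normalizeLineEndings('a\r\nb'))); // "a\nb"
console.log(JSON.stringify(normalizeLineEndings('a\n\rb'))); // "a\n\nb"
console.log(JSON.stringify(normalizeLineEndings('a\rb')));   // "a\nb"
```

Note that `\n\r` is not a combined line break under this rule: the `\n` is kept and the following lone `\r` is translated separately, which yields two line feeds.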

@karfau karfau changed the title xmldom is not normalizing line endings on the input xmldom is not normalizing line endings in input Aug 26, 2021
@karfau karfau changed the title xmldom is not normalizing line endings in input xmldom is not normalizing line endings before parsing Aug 26, 2021
@karfau karfau added bug Something isn't working breaking change Some thing that requires a version bump due to breaking changes spec:XML https://www.w3.org/TR/xml11/ labels Aug 26, 2021
karfau added a commit to karfau/xmldom that referenced this issue Aug 26, 2021
BREAKING CHANGE: Combined line breaks (`\r\n`) are no longer preserved but converted to `\n` before parsing takes place.

fixes xmldom#303
https://www.w3.org/TR/xml/#sec-line-ends
karfau added a commit that referenced this issue Aug 28, 2021
> XML parsed entities are often stored in computer files which, for editing convenience, are organized into lines. These lines are typically separated by some combination of the characters CARRIAGE RETURN (#xD) and LINE FEED (#xA).
>
> To simplify the tasks of applications, the XML processor must behave as if it normalized all line breaks in external parsed entities (including the document entity) on input, before parsing, by translating both the two-character sequence #xD #xA and any #xD that is not followed by #xA to a single #xA character.

Where `#xD` == `\r` and `#xA` == `\n`, so
` \r\n ` => ` \n `
` \n\r ` => ` \n\n `
` \n  ` => ` \n `
` \r  ` => ` \n `

BREAKING CHANGE: Certain combinations of line break characters are normalized before parsing takes place and will no longer be preserved. For details see https://www.w3.org/TR/xml/#sec-line-ends

fixes #303
@karfau karfau added this to the 0.8.0 milestone Aug 28, 2021
Projects
Status: Done
1 participant