We tried to configure JSoup as an XML parser and unparser. However, JSoup does not seem to generate valid output from XML containing an escaped entity for the escape character (ESC, U+001B).
JSoup parses <?xml version="1.1" encoding="UTF-8"?><SomeText>This is an escaped escape-character: </SomeText> correctly.
However, unparsing the parsed result returns <?xml version="1.1" encoding="UTF-8"?><SomeText>This is an escaped escape-character: ???</SomeText>
where ??? stands for a binary ESC.
Other parsers (e.g. behind a web service) may refuse to parse this again :-(
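To illustrate why a downstream parser may refuse such output, here is a minimal sketch that uses only the JDK's built-in XML parser, not JSoup. The class name `WellFormedCheck` and the method `isWellFormed` are hypothetical helpers introduced for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether the JDK's default XML parser accepts a document.
final class WellFormedCheck {
    private WellFormedCheck() {}

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document, e.g. because of an illegal character.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Unexpected parser setup or I/O failure", e);
        }
    }
}
```

Passing a document whose text contains a literal ESC, such as `"<SomeText>esc: \u001b</SomeText>"`, should make `isWellFormed` return false, because a raw U+001B is not a legal literal character in XML content. Other parsers may report the problem differently, so treat this as one data point rather than a general guarantee.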
Thus it would be nice if JSoup's Document.toString() returned valid XML. Simply escape everything below #x20 :-) (except tab, LF, CR)
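The suggestion above could be sketched roughly as follows. This is not JSoup's actual implementation; `ControlCharEscaper` and `escapeControlChars` are hypothetical names, and the sketch assumes the text has already had `&`, `<` and `>` escaped elsewhere. Note that numeric references to these control characters are only legal in XML 1.1 (the version declared in the report), not in XML 1.0.

```java
// Hypothetical sketch of the proposed escaping rule; not JSoup's real code.
final class ControlCharEscaper {
    private ControlCharEscaper() {}

    // Replaces every character below 0x20, except tab, LF and CR,
    // with a hexadecimal character reference such as &#x1b;.
    static String escapeControlChars(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean keepLiteral = c == '\t' || c == '\n' || c == '\r';
            if (c < 0x20 && !keepLiteral) {
                // Caveat: U+0000 is not legal in XML even as a reference;
                // this sketch does not treat it specially.
                out.append("&#x").append(Integer.toHexString(c)).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

For example, `ControlCharEscaper.escapeControlChars("esc: \u001b")` yields `esc: &#x1b;`, while tabs and line breaks pass through unchanged.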
jhy added the bug label (Confirmed bug that we should fix) and removed the needs-more-info label (More information is needed from the reporter to progress the issue) on Aug 12, 2021.
Thanks, fixed! I implemented the same escapes for both XML (where they are required) and HTML (where they are optional), since I think the output will be easier to read and less surprising with these characters escaped.
We tried to configure JSoup as an XML parser and unparser. However, JSoup does not seem to generate valid output from XML containing an escaped entity for the escape character (ESC, U+001B).
Find below a demonstration of the problem.
Best regards
Michael