xml serde roundtrip loses CR/LF encoding #670

Open
francisdb opened this issue Oct 19, 2023 · 5 comments
francisdb commented Oct 19, 2023

use pretty_assertions::assert_eq;
use serde::{Deserialize, Serialize};

#[derive(Debug, Deserialize, Serialize)]
struct Root {
    #[serde(rename = "@value")]
    value: String,
}

const XML: &str = r#"<root value="new&#xD;&#xA;line"/>"#;

#[test]
fn test_attribute_value_normalization() {
    let read: Root = quick_xml::de::from_str(XML).unwrap();
    assert_eq!("new\r\nline", read.value);
    let written = quick_xml::se::to_string(&read).unwrap();
    assert_eq!(r#"<root value="new&#xD;&#xA;line"/>"#, written);
}

yields

thread 'test_attribute_value_normalization' panicked at tests/atest.rs:18:5:
assertion failed: `(left == right)`

Diff < left / right > :
<<root value="new&#xD;&#xA;line"/>
><Root value="new
>line"/>

Any workaround possible for the serializer?

Is this related to #115?

Mingun added the serde label (Issues related to mapping from Rust types to XML) on Oct 19, 2023

Mingun (Collaborator) commented Oct 19, 2023

No, this is more related to #371. Attribute normalization is not yet handled correctly in either Writer or serde. The only workaround is to use #[serde(serialize_with)] and normalize the value yourself.

francisdb (Author) commented

@Mingun if I normalize to &#xA; in serialize_with, I end up with &amp;#xA;, probably because another escaping step runs afterwards. Any workaround for the workaround 😅?
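
To illustrate the double escaping described above, here is a minimal sketch. The `escape_attr` function is a hypothetical, simplified stand-in for whatever escaping the serializer applies to attribute values; it is not quick-xml's actual code. It only shows why a character reference produced in serialize_with does not survive: the `&` in it gets escaped like any other ampersand.

```rust
/// Hypothetical stand-in for a serializer's attribute escaping step.
/// Assumption: the real serializer escapes `&` in every string it is given,
/// including strings that already contain character references.
fn escape_attr(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for c in raw.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&apos;"),
            other => out.push(other),
        }
    }
    out
}

/// What a naive serialize_with helper might produce before escaping runs.
fn naive_normalize(value: &str) -> String {
    value.replace('\r', "&#xD;").replace('\n', "&#xA;")
}
```

Feeding the result of `naive_normalize` through `escape_attr` turns each `&#xA;` into `&amp;#xA;`, which matches the behaviour reported in this comment.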

Mingun (Collaborator) commented Oct 20, 2023

Uh... I didn't think about that. One possible workaround is two-step escaping:

  • In serialize_with, replace the characters that need escaping with markers (for example, \n -> \u0001#xA;).
  • In a custom implementation of std::fmt::Write, replace these markers with the actual escape codes (for example, \u0001 -> &). You can be sure that the whole marker will be written in one call to write_str, so multi-character markers are not a problem.
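
The two steps above could be sketched roughly as follows, using only the standard library. The names `MARKER`, `mark_newlines`, and `MarkerWriter` are hypothetical and chosen for illustration. The sketch assumes that U+0001 never occurs in real data and that the serializer passes it through unescaped. Wiring `mark_newlines` into a serialize_with function and handing `MarkerWriter` to the serializer is left out, since the exact serializer entry point depends on the quick-xml version.

```rust
use std::fmt::{self, Write};

/// Hypothetical marker standing in for `&`; assumed never to appear in real data.
const MARKER: char = '\u{1}';

/// Step 1: replace CR and LF with marker-prefixed character references.
/// Intended to be called from a serialize_with function on the attribute field.
fn mark_newlines(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for c in value.chars() {
        match c {
            '\r' => {
                out.push(MARKER);
                out.push_str("#xD;");
            }
            '\n' => {
                out.push(MARKER);
                out.push_str("#xA;");
            }
            other => out.push(other),
        }
    }
    out
}

/// Step 2: a fmt::Write adapter that turns every marker back into `&`
/// before forwarding the text to the wrapped writer.
struct MarkerWriter<W: Write> {
    inner: W,
}

impl<W: Write> Write for MarkerWriter<W> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let mut rest = s;
        while let Some(pos) = rest.find(MARKER) {
            // Forward the text before the marker, then emit `&` in its place.
            self.inner.write_str(&rest[..pos])?;
            self.inner.write_char('&')?;
            rest = &rest[pos + MARKER.len_utf8()..];
        }
        self.inner.write_str(rest)
    }
}
```

Because the marker here is a single character, the adapter works even if the serializer splits a value across several write_str calls. Note that U+0001 is not a legal XML 1.0 character, so the marker must never reach the final output; the adapter removes every occurrence it sees.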

Or modify the internals of quick-xml and use a patched version. If you feel that those modifications are not hackish, make a PR 😄.

francisdb (Author) commented

> Or modify the internals of quick-xml and use a patched version. If you feel that those modifications are not hackish, make a PR 😄.

Would that not duplicate the effort already made in #379?

dralley (Collaborator) commented Oct 20, 2023

At the end of the day, #379 just needs to be finished. I'll try to find time for it this weekend, but if anyone else wants to take it over, I'm also OK with that. Unfortunately, I've been extremely busy with other work.
