Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add additional tests for borrowed input #509

Merged
merged 1 commit into from Nov 14, 2022
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
136 changes: 123 additions & 13 deletions tests/serde-de.rs
Expand Up @@ -30,19 +30,6 @@ where
result
}

#[test]
fn string_borrow() {
#[derive(Debug, Deserialize, PartialEq)]
struct BorrowedText<'a> {
#[serde(rename = "$text")]
text: &'a str,
}

let borrowed_item: BorrowedText = from_str("<text>Hello world</text>").unwrap();

assert_eq!(borrowed_item.text, "Hello world");
}

/// Tests for deserializing into specially named field `$text` which represent
/// textual content of an XML element
mod text {
Expand Down Expand Up @@ -6067,3 +6054,126 @@ fn from_str_should_ignore_encoding() {
}
);
}

/// Checks that deserializer is able to borrow data from the input
mod borrow {
use super::*;

/// Struct that should borrow input to be able to deserialize successfully.
/// serde implicitly borrow `&str` and `&[u8]` even without `#[serde(borrow)]`
#[derive(Debug, Deserialize, PartialEq)]
struct BorrowedElement<'a> {
string: &'a str,
}

/// Struct that should borrow input to be able to deserialize successfully.
/// serde implicitly borrow `&str` and `&[u8]` even without `#[serde(borrow)]`
#[derive(Debug, Deserialize, PartialEq)]
struct BorrowedAttribute<'a> {
#[serde(rename = "@string")]
string: &'a str,
}

/// Deserialization of all XML's in that module expected to pass because
/// unescaping is not required, so deserialized `Borrowed*` types can hold
/// references to the original buffer with an XML text
mod non_escaped {
use super::*;
use pretty_assertions::assert_eq;

#[test]
fn top_level() {
let data: &str = from_str(r#"<root>without escape sequences</root>"#).unwrap();
assert_eq!(data, "without escape sequences",);
}

#[test]
fn element() {
let data: BorrowedElement = from_str(
r#"
<root>
<string>without escape sequences</string>
</root>"#,
)
.unwrap();
assert_eq!(
data,
BorrowedElement {
string: "without escape sequences",
}
);
}

#[test]
fn attribute() {
let data: BorrowedAttribute =
from_str(r#"<root string="without escape sequences"/>"#).unwrap();
assert_eq!(
data,
BorrowedAttribute {
string: "without escape sequences",
}
);
}
}

/// Deserialization of all XML's in that module expected to fail because
/// values requires unescaping that will lead to allocating an internal
/// buffer by deserializer, but the `Borrowed*` types couldn't take ownership
/// on it.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any concept of opportunistic zero-copy with Cow<&str>? As someone not terribly familiar with serde.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand you correctly, Cow<str> will borrow whenever possible i. e. when deserializer can provide data of underlying source by calling Visitor::visit_borrowed_str. The opposite case when the deserializer allocates String by itself and calls Visitor::visit_string. The intermediate case when deserializer provides borrow to the data, that it owns, via Visitor::visit_str.

In our case we can either provide access to the XML source by calling Visitor::visit_borrowed_str, or allocate a new String (because we in any case need to allocate to do unescaping and/or decoding) and give it to the Visitor. If XML in UTF-8 and does not contain escaped data or mixed text/CDATA or CDATA/CDATA content (after solving #474), the data will be borrowed

///
/// The same behavior demonstrates the `serde_json` crate
mod escaped {
use super::*;
use pretty_assertions::assert_eq;

#[test]
fn top_level() {
match from_str::<&str>(
r#"<root>with escape sequence: &lt;</root>"#,
) {
Err(DeError::Custom(reason)) => assert_eq!(
reason,
"invalid type: string \"with escape sequence: <\", expected a borrowed string"
),
e => panic!(
"Expected `Err(Custom(invalid type: string \"with escape sequence: <\", expected a borrowed string))`, but found {:?}",
e
),
}
}

#[test]
fn element() {
match from_str::<BorrowedElement>(
r#"
<root>
<string>with escape sequence: &lt;</string>
</root>"#,
) {
Err(DeError::Custom(reason)) => assert_eq!(
reason,
"invalid type: string \"with escape sequence: <\", expected a borrowed string"
),
e => panic!(
"Expected `Err(Custom(invalid type: string \"with escape sequence: <\", expected a borrowed string))`, but found {:?}",
e
),
}
}

#[test]
fn attribute() {
match from_str::<BorrowedAttribute>(r#"<root string="with &quot;escape&quot; sequences"/>"#) {
Err(DeError::Custom(reason)) => assert_eq!(
reason,
"invalid type: string \"with \\\"escape\\\" sequences\", expected a borrowed string"
),
e => panic!(
"Expected `Err(Custom(invalid type: string \"with \"escape\" sequences\", expected a borrowed string))`, but found {:?}",
e
),
}
}
}
}