diff --git a/src/de/mod.rs b/src/de/mod.rs
index 8fc70d84..743ba04b 100644
--- a/src/de/mod.rs
+++ b/src/de/mod.rs
@@ -1,5 +1,1306 @@
 //! Serde `Deserializer` module
 //!
+//! Due to the complexity of the XML standard and the fact that serde was developed
+//! with JSON in mind, not all serde concepts map smoothly onto XML. As a result,
+//! some XML concepts are inexpressible in terms of serde derives and you may
+//! need to implement deserialization manually.
+//!
+//! The most notable restriction is the distinction between _elements_ and
+//! _attributes_: no other serde format has such a concept.
+//!
+//! Because of that, the mapping is performed on a best-effort basis.
+//!
+//!
+//!
+//! Mapping XML to Rust types
+//! =========================
+//!
+//! - [Optional attributes and elements](#optional-attributes-and-elements)
+//! - [Choices (`xs:choice` XML Schema type)](#choices-xschoice-xml-schema-type)
+//! - [Sequences (`xs:all` and `xs:sequence` XML Schema types)](#sequences-xsall-and-xssequence-xml-schema-types)
+//!
+//! Type names are never considered when deserializing, so you can name your
+//! types as you wish. Other general rules:
+//! - a `struct` field name can be represented in XML only as an attribute name
+//!   or an element name;
+//! - an `enum` variant name can be represented in XML only as an attribute name
+//!   or an element name;
+//! - the unit struct, unit type `()` and unit enum variant can be deserialized
+//!   from any valid XML content:
+//!   - attribute and element names;
+//!   - attribute and element values;
+//!   - text or CDATA content (including mixed text + CDATA content).
+//!
+//!
To parse all of these XML documents... | ...use these Rust type(s) |
---|---|
+//! Content of attributes and text/CDATA content of elements (including mixed
+//! text and CDATA content):
+//!
+//! ```xml
+//! <... ...="content" />
+//! ```
+//! ```xml
+//! <...>content</...>
+//! ```
+//! ```xml
+//! <...><![CDATA[content]]></...>
+//! ```
+//! ```xml
+//! <...>text<![CDATA[cdata]]>text</...>
+//! ```
+//! |
+//!
+//!
+//! You can use any type that can be deserialized from an `&str`, for example:
+//! - [`String`] and [`&str`]
+//! - [`u32`], [`f32`] and other numeric types
+//! - `enum`s, like
+//! ```ignore
+//! // FIXME: #474, merging mixed text / CDATA
+//! // content does not work yet
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! enum Language {
+//! Rust,
+//! Cpp,
+//! #[serde(other)]
+//! Other,
+//! }
+//! # #[derive(Debug, PartialEq, Deserialize)]
+//! # struct X { #[serde(rename = "$text")] x: Language }
+//! # assert_eq!(X { x: Language::Rust }, quick_xml::de::from_str("
+//!
+//! NOTE: deserialization to non-owned types (i.e. types that borrow from the input),
+//! such as `&str`, is possible only if you parse a document in the UTF-8
+//! encoding, the content does not contain XML escape sequences, such as `&lt;`,
+//! or entity references, and the text content is not represented by mixed
+//! text / CDATA elements.
+//!
+//!
+//!
+//! Merging of the text / CDATA content is tracked in the issue [#474] and
+//! will be available in the next release.
+//!
+//!
+//! |
+//!
+//!
+//! Content of attributes and text / CDATA content of elements (including mixed
+//! text and CDATA content) that represent space-delimited lists, as
+//! specified in the XML Schema specification for [`xs:list`] `simpleType`:
+//!
+//! ```xml
+//! <... ...="element1 element2 ..." />
+//! ```
+//! ```xml
+//! <...>
+//!   element1
+//!   element2
+//!   ...
+//! </...>
+//! ```
+//! ```xml
+//! <...><![CDATA[
+//!   element1
+//!   element2
+//!   ...
+//! ]]></...>
+//! ```
+//! [`xs:list`]: https://www.w3.org/TR/xmlschema11-2/#list-datatypes
+//! |
+//!
+//!
+//! Use any type that is deserialized using a [`deserialize_seq()`] call, for example:
+//!
+//! ```
+//! // FIXME: #474, merging mixed text / CDATA
+//! // content does not work yet
+//! type List = Vec
+//!
+//! NOTE: according to the XML Schema restrictions, you cannot escape those
+//! white-space characters, so list elements will _never_ contain them.
+//! In practice you will usually use `xs:list`s for lists of numbers or enumerated
+//! values that look like identifiers in many languages, for example, `item`,
+//! `some_item` or `some-item`, so that shouldn't be a problem.
+//!
+//! NOTE: according to the XML Schema specification, list elements can be
+//! delimited only by spaces. Other delimiters (for example, commas) are not
+//! allowed.
+//!
+//!
+//!
+//!
+//! Merging of the text / CDATA content is tracked in the issue [#474] and
+//! will be available in the next release.
+//!
+//!
+//! [`deserialize_seq()`]: de::Deserializer::deserialize_seq
+//! |
+//!
+//! A typical XML with attributes. The root tag name does not matter:
+//!
+//! ```xml
+//! |
+//!
+//!
+//! A structure where each XML attribute is mapped to a field whose name
+//! starts with `@`. Because the `@` character is not allowed in Rust identifiers,
+//! you should use the `#[serde(rename = "@...")]` attribute to rename the field.
+//! The name of the struct itself does not matter:
+//!
+//! ```
+//! # use serde::Deserialize;
+//! # type T = ();
+//! # type U = ();
+//! // Get both attributes
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct AnyName {
+//! #[serde(rename = "@one")]
+//! one: T,
+//!
+//! #[serde(rename = "@two")]
+//! two: U,
+//! }
+//! # quick_xml::de::from_str::
+//!
+//! NOTE: XML allows you to have an attribute and an element with the same name
+//! inside the same element. quick-xml deals with that by prepending an `@` prefix
+//! to the names of attributes.
+//!
+//! |
+//!
+//! A typical XML with child elements. The root tag name does not matter:
+//!
+//! ```xml
+//! |
+//!
+//! A structure where each XML child element is mapped to a field.
+//! Each element name becomes the name of a field. The name of the struct itself
+//! does not matter:
+//!
+//! ```
+//! # use serde::Deserialize;
+//! # type T = ();
+//! # type U = ();
+//! // Get both elements
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct AnyName {
+//! one: T,
+//! two: U,
+//! }
+//! # quick_xml::de::from_str::
+//!
+//! NOTE: XML allows you to have an attribute and an element with the same name
+//! inside the same element. quick-xml deals with that by prepending an `@` prefix
+//! to the names of attributes.
+//!
+//! |
+//!
+//! An XML document with an attribute and a child element that share the same name:
+//!
+//! ```xml
+//! |
+//!
+//!
+//! You MUST specify `#[serde(rename = "@field")]` on a field that will be used
+//! for an attribute.
+//!
+//! ```
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # type T = ();
+//! # type U = ();
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct AnyName {
+//! #[serde(rename = "@field")]
+//! attribute: T,
+//! field: U,
+//! }
+//! # assert_eq!(
+//! # AnyName { attribute: (), field: () },
+//! # quick_xml::de::from_str(r#"
+//! # |
+//!
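As a rough illustration of the `xs:list` row above, the sketch below splits a list value on XML white space and parses each item. It uses only the standard library and is not how quick-xml implements `deserialize_seq()`; the helper name `parse_xs_list` is hypothetical.

```rust
use std::str::FromStr;

/// Hypothetical helper (not part of quick-xml): splits an `xs:list` value on
/// XML white space (space, tab, CR, LF) and parses every item as `T`.
fn parse_xs_list<T: FromStr>(content: &str) -> Result<Vec<T>, T::Err> {
    content
        .split(|c: char| matches!(c, ' ' | '\t' | '\r' | '\n'))
        // Runs of white space produce empty pieces; they are not list items.
        .filter(|item| !item.is_empty())
        .map(|item| item.parse::<T>())
        .collect()
}

fn main() {
    // The same list written as an attribute value and as multi-line text.
    let from_attribute: Vec<u32> = parse_xs_list("1 2 3").unwrap();
    let from_text: Vec<u32> = parse_xs_list("\n  1\n  2\n  3\n").unwrap();
    assert_eq!(from_attribute, from_text);

    // Other delimiters are not recognized: "1,2" is a single item here,
    // and that item is not a valid number.
    assert!(parse_xs_list::<u32>("1,2").is_err());

    println!("{:?}", from_attribute);
}
```

Because white space cannot be escaped inside a list, an item such as `some item` would always be read as two items, which is why identifier-like values work best.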
+//!
+//! ## Optional attributes and elements
+//!
+//! | |
To parse all of these XML documents... | ...use these Rust type(s) |
+//! An optional XML attribute that you want to capture.
+//! The root tag name does not matter.
+//!
+//! ```xml
+//! |
+//!
+//!
+//! A structure with an optional field, renamed according to the requirements
+//! for attributes:
+//!
+//! ```
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # type T = ();
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct AnyName {
+//! #[serde(rename = "@optional")]
+//! optional: Option |
+//!
+//! An optional XML element that you want to capture.
+//! The root tag name does not matter.
+//!
+//! ```xml
+//! |
+//!
+//!
+//! A structure with an optional field:
+//!
+//! ```
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # type T = ();
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct AnyName {
+//! optional: Option |
+//!
+//!
+//! ## Choices (`xs:choice` XML Schema type)
+//!
+//! | |
To parse all of these XML documents... | ...use these Rust type(s) |
+//! An XML with different root tag names:
+//!
+//! ```xml
+//! |
+//!
+//!
+//! An enum where each variant has the name of a possible root tag. The name of
+//! the enum itself does not matter.
+//!
+//! All these structs can be used to deserialize from any XML on the
+//! left side, depending on the amount of information that you want to get:
+//!
+//! ```
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # type T = ();
+//! # type U = ();
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! #[serde(rename_all = "snake_case")]
+//! enum AnyName {
+//! One { #[serde(rename = "@field1")] field1: T },
+//! Two { field2: U },
+//! }
+//! # assert_eq!(AnyName::One { field1: () }, quick_xml::de::from_str(r#" |
+//!
+//!
+//! ` |
+//!
+//!
+//! A structure with a field whose type is an `enum`.
+//!
+//! The names of the enum, the struct, and the struct field of `Choice` type do not matter.
+//!
+//! ```
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # type T = ();
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! #[serde(rename_all = "snake_case")]
+//! enum Choice {
+//! One,
+//! Two,
+//! }
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct AnyName {
+//! #[serde(rename = "@field")]
+//! field: T,
+//!
+//! #[serde(rename = "$value")]
+//! any_name: Choice,
+//! }
+//! # assert_eq!(
+//! # AnyName { field: (), any_name: Choice::One },
+//! # quick_xml::de::from_str(r#" |
+//!
+//!
+//! ` |
+//!
+//!
+//! A structure with a field whose type is an `enum`.
+//!
+//! The names of the enum, the struct, and the struct field of `Choice` type do not matter.
+//!
+//! ```
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # type T = ();
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! #[serde(rename_all = "snake_case")]
+//! enum Choice {
+//! One,
+//! Two,
+//! }
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct AnyName {
+//! field: T,
+//!
+//! #[serde(rename = "$value")]
+//! any_name: Choice,
+//! }
+//! # assert_eq!(
+//! # AnyName { field: (), any_name: Choice::One },
+//! # quick_xml::de::from_str(r#"
+//!
+//! NOTE: if your `Choice` enum would contain an `#[serde(other)]`
+//! variant, element `
+//!
+//! |
+//!
+//!
+//! ` |
+//!
+//!
+//! A structure with a field of an intermediate type that has one field of `enum` type.
+//! Strictly speaking, this example is not necessary, because you can construct it
+//! yourself using the composition rules described above. However, the XML construction
+//! described here is very common, so it is shown explicitly.
+//!
+//! The names of the enum and the structs do not matter.
+//!
+//! ```
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # type T = ();
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! #[serde(rename_all = "snake_case")]
+//! enum Choice {
+//! One,
+//! Two,
+//! }
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct Holder {
+//! #[serde(rename = "$value")]
+//! any_name: Choice,
+//! }
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct AnyName {
+//! #[serde(rename = "@field")]
+//! field: T,
+//!
+//! choice: Holder,
+//! }
+//! # assert_eq!(
+//! # AnyName { field: (), choice: Holder { any_name: Choice::One } },
+//! # quick_xml::de::from_str(r#" |
+//!
+//!
+//! ` |
+//!
+//!
+//! A structure with a field of an intermediate type that has one field of `enum` type.
+//! Strictly speaking, this example is not necessary, because you can construct it
+//! yourself using the composition rules described above. However, the XML construction
+//! described here is very common, so it is shown explicitly.
+//!
+//! The names of the enum and the structs do not matter.
+//!
+//! ```
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # type T = ();
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! #[serde(rename_all = "snake_case")]
+//! enum Choice {
+//! One,
+//! Two,
+//! }
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct Holder {
+//! #[serde(rename = "$value")]
+//! any_name: Choice,
+//! }
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct AnyName {
+//! field: T,
+//!
+//! choice: Holder,
+//! }
+//! # assert_eq!(
+//! # AnyName { field: (), choice: Holder { any_name: Choice::One } },
+//! # quick_xml::de::from_str(r#" |
+//!
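The name-based selection used throughout this table can be pictured as a plain match on the tag name. The sketch below is std-only and only mimics what the derived code does for `#[serde(rename_all = "snake_case")]` with an `#[serde(other)]` fallback; `choice_from_tag` is a hypothetical helper, not a quick-xml or serde API.

```rust
/// The possible choices, mirroring the `Choice` enums shown in the table.
#[derive(Debug, PartialEq)]
enum Choice {
    One,
    Two,
    /// Plays the role of a variant marked with `#[serde(other)]`.
    Other,
}

/// Hypothetical helper: picks a variant from an element name. Variant names
/// are compared in snake_case, as `rename_all = "snake_case"` would do.
fn choice_from_tag(tag: &str) -> Choice {
    match tag {
        "one" => Choice::One,
        "two" => Choice::Two,
        // Any unknown element name falls back to the catch-all variant.
        _ => Choice::Other,
    }
}

fn main() {
    for tag in ["one", "two", "three"] {
        println!("<{}/> is read as {:?}", tag, choice_from_tag(tag));
    }
}
```

Only the element name takes part in the selection; the name of the enum and the names of the surrounding struct fields never do.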
+//!
+//! ## Sequences (`xs:all` and `xs:sequence` XML Schema types)
+//!
+//! | |
To parse all of these XML documents... | ...use these Rust type(s) |
+//! A sequence inside a tag without a dedicated name:
+//!
+//! ```xml
+//! |
+//!
+//!
+//! A structure with a field that has a sequence type, for example, [`Vec`].
+//! Because XML syntax does not distinguish between empty sequences and missing
+//! elements, we should indicate that on the Rust side, because serde will require
+//! that the field `item` exists. You can do that in two possible ways:
+//!
+//! Use the `#[serde(default)]` attribute for a field or the entire struct:
+//! ```
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # type Item = ();
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct AnyName {
+//! #[serde(default)]
+//! item: Vec
+//!
+//! This bug is tracked in [#510]
+//!
+//! |
+//!
+//! A sequence with a strict order, possibly with mixed content
+//! (text/CDATA and tags):
+//!
+//! ```xml
+//!
+//!
+//! NOTE: this is just an example to show the mapping. XML does not allow
+//! multiple root tags, so you should wrap the sequence in a tag.
+//!
+//! |
+//!
+//!
+//! All elements are mapped to a heterogeneous sequential type: a tuple or a named tuple.
+//! Each element of the tuple should be deserializable from the nested
+//! element content (`...`), except for enum types, which are deserialized
+//! from the full element (`
+//!
+//! NOTE: consecutive text and CDATA nodes are merged into one text node,
+//! so you cannot have two adjacent string types in your sequence.
+//!
+//!
+//!
+//! Merging of the text / CDATA content is tracked in the issue [#474] and
+//! will be available in the next release.
+//!
+//!
+//! ```ignore
+//! // FIXME: #474
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # type One = ();
+//! # type Two = ();
+//! # /*
+//! type One = ...;
+//! type Two = ...;
+//! # */
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct AnyName(One, String, Two, One);
+//! # assert_eq!(
+//! # AnyName((), "text cdata".into(), (), ()),
+//! # quick_xml::de::from_str(r#" |
+//!
+//! A sequence with a non-strict order, possibly with mixed content
+//! (text/CDATA and tags).
+//!
+//! ```xml
+//! |
+//!
+//! A homogeneous sequence of elements with a fixed or dynamic size.
+//!
+//!
+//!
+//! NOTE: consecutive text and CDATA nodes are merged into one text node,
+//! so you cannot have two adjacent string types in your sequence.
+//!
+//!
+//!
+//! Merging of the text / CDATA content is tracked in the issue [#474] and
+//! will be available in the next release.
+//!
+//!
+//! ```ignore
+//! // FIXME: #474
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! #[serde(rename_all = "snake_case")]
+//! enum Choice {
+//! One,
+//! Two,
+//! #[serde(other)]
+//! Other,
+//! }
+//! type AnyName = [Choice; 4];
+//! # assert_eq!(
+//! # [Choice::One, Choice::Other, Choice::Two, Choice::One],
+//! # quick_xml::de::from_str:: |
+//!
+//! A sequence with a strict order, possibly with mixed content
+//! (text and tags), inside another element.
+//!
+//! ```xml
+//! |
+//!
+//!
+//! A structure where all child elements are mapped to one field that has
+//! a heterogeneous sequential type: a tuple or a named tuple. Each element of the
+//! tuple should be able to be deserialized from the full element (`
+//!
+//! NOTE: consecutive text and CDATA nodes are merged into one text node,
+//! so you cannot have two adjacent string types in your sequence.
+//!
+//!
+//!
+//! Merging of the text / CDATA content is tracked in the issue [#474] and
+//! will be available in the next release.
+//!
+//!
+//! ```ignore
+//! // FIXME: #474, Custom("duplicate field `$value`")
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # type One = ();
+//! # type Two = ();
+//! # /*
+//! type One = ...;
+//! type Two = ...;
+//! # */
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct AnyName {
+//! #[serde(rename = "@attribute")]
+//! # attribute: (),
+//! # /*
+//! attribute: ...,
+//! # */
+//! // Not (yet?) supported by serde
+//! // https://github.com/serde-rs/serde/issues/1905
+//! // #[serde(flatten)]
+//! #[serde(rename = "$value")]
+//! any_name: (One, String, Two, One),
+//! }
+//! # assert_eq!(
+//! # AnyName { attribute: (), any_name: ((), "text cdata".into(), (), ()) },
+//! # quick_xml::de::from_str("\
+//! # |
+//!
+//! A sequence with a non-strict order, possibly with mixed content
+//! (text/CDATA and tags), inside another element.
+//!
+//! ```xml
+//! |
+//!
+//!
+//! A structure where all child elements are mapped to one field that has
+//! a homogeneous sequential type: an array-like container. A container type `T`
+//! should be deserializable from the nested element content (`...`),
+//! unless it is an enum type, which is deserialized from the full
+//! element (`
+//!
+//! NOTE: consecutive text and CDATA nodes are merged into one text node,
+//! so you cannot have two adjacent string types in your sequence.
+//!
+//!
+//!
+//! Merging of the text / CDATA content is tracked in the issue [#474] and
+//! will be available in the next release.
+//!
+//!
+//! ```ignore
+//! // FIXME: Custom("unknown variant `text`, expected
+//! // one of `one`, `two`, `$value`")
+//! # use pretty_assertions::assert_eq;
+//! # use serde::Deserialize;
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! #[serde(rename_all = "snake_case")]
+//! enum Choice {
+//! One,
+//! Two,
+//! #[serde(rename = "$value")]
+//! Other(String),
+//! }
+//! # #[derive(Debug, PartialEq)]
+//! #[derive(Deserialize)]
+//! struct AnyName {
+//! #[serde(rename = "@attribute")]
+//! # attribute: (),
+//! # /*
+//! attribute: ...,
+//! # */
+//! // Not (yet?) supported by serde
+//! // https://github.com/serde-rs/serde/issues/1905
+//! // #[serde(flatten)]
+//! #[serde(rename = "$value")]
+//! any_name: [Choice; 4],
+//! }
+//! # assert_eq!(
+//! # AnyName { attribute: (), any_name: [
+//! # Choice::One,
+//! # Choice::Other("text cdata".into()),
+//! # Choice::Two,
+//! # Choice::One,
+//! # ] },
+//! # quick_xml::de::from_str("\
+//! # |
+//!
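The notes in this table say that consecutive text and CDATA nodes are merged into one text node (tracked in [#474]). The sketch below shows what such a merge means for a mixed-content sequence. It is a std-only illustration under assumed names: the `Node` type and `merge_text_nodes` are hypothetical and are not taken from quick-xml, and the input nodes are chosen to produce the `"text cdata"` value used in the examples above.

```rust
/// A simplified XML node, just enough for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Text(String),
    CData(String),
    Element(String),
}

/// Hypothetical helper: concatenates each run of adjacent text and CDATA
/// nodes into a single `Text` node and leaves elements untouched.
fn merge_text_nodes(nodes: Vec<Node>) -> Vec<Node> {
    let mut merged: Vec<Node> = Vec::new();
    for node in nodes {
        match node {
            Node::Text(chunk) | Node::CData(chunk) => {
                if let Some(Node::Text(previous)) = merged.last_mut() {
                    // Continue the text run that is already open.
                    previous.push_str(&chunk);
                } else {
                    // Start a new text run.
                    merged.push(Node::Text(chunk));
                }
            }
            element => merged.push(element),
        }
    }
    merged
}

fn main() {
    // An element, a text node, a CDATA node, and two more elements.
    let nodes = vec![
        Node::Element("one".into()),
        Node::Text("text ".into()),
        Node::CData("cdata".into()),
        Node::Element("two".into()),
        Node::Element("one".into()),
    ];
    let merged = merge_text_nodes(nodes);

    // Five nodes became four: the text and CDATA nodes are now one string.
    assert_eq!(merged.len(), 4);
    assert_eq!(merged[1], Node::Text("text cdata".into()));
}
```

After merging, a sequence can never contain two neighbouring text items, which is why two adjacent string types in a tuple cannot both be filled.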