Fixes in documentation + address some clippy warns + const fn #467

Merged 6 commits on Aug 26, 2022
15 changes: 11 additions & 4 deletions Cargo.toml
@@ -59,11 +59,17 @@ async-tokio = ["tokio"]
## [standard compliant]: https://www.w3.org/TR/xml11/#charencoding
encoding = ["encoding_rs"]

-## Enables support for recognizing all [HTML 5 entities](https://dev.w3.org/html5/html-author/charref)
+## Enables support for recognizing all [HTML 5 entities] in [`unescape`] and
+## [`unescape_with`] functions. The full list of entities also can be found in
+## <https://html.spec.whatwg.org/entities.json>.
+##
+## [HTML 5 entities]: https://dev.w3.org/html5/html-author/charref
+## [`unescape`]: crate::escape::unescape
+## [`unescape_with`]: crate::escape::unescape_with
escape-html = []

-## This feature enables support for deserializing lists where tags are overlapped
-## with tags that do not correspond to the list.
+## This feature for a serde deserializer that enables support for deserializing
+## lists where tags are overlapped with tags that do not correspond to the list.
##
## When this feature is enabled, the XML:
## ```xml
@@ -75,7 +81,8 @@ escape-html = []
## </any-name>
## ```
## could be deserialized to a struct:
-## ```ignore
+## ```no_run
+## # use serde::Deserialize;
## #[derive(Deserialize)]
## #[serde(rename_all = "kebab-case")]
## struct AnyName {
20 changes: 20 additions & 0 deletions Changelog.md
@@ -41,6 +41,23 @@
- [#455]: Change return type of all `read_to_end*` methods to return a span between tags
- [#455]: Added `Reader::read_text` method to return a raw content (including markup) between tags
- [#459]: Added a `Writer::write_bom()` method for inserting a Byte-Order-Mark into the document.
+- [#467]: The following functions made `const`:
+  - `Attr::key`
+  - `Attr::value`
+  - `Attributes::html`
+  - `Attributes::new`
+  - `BytesDecl::from_start`
+  - `Decoder::encoding`
+  - `LocalName::into_inner`
+  - `Namespace::into_inner`
+  - `Prefix::into_inner`
+  - `QName::into_inner`
+  - `Reader::buffer_position`
+  - `Reader::decoder`
+  - `Reader::get_ref`
+  - `Serializer::new`
+  - `Serializer::with_root`
+  - `Writer::new`
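
As a note on what the list above enables: a `const fn` can be called in `const` and `static` initializers, so constructors and getters marked this way become usable at compile time. The sketch below uses a hypothetical `Wrapper` type as a stand-in; it is not one of quick-xml's types, and the names `Wrapper`, `NAME`, and `RAW` are assumptions made for illustration only.

```rust
/// Hypothetical stand-in for a thin wrapper type; not part of quick-xml.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Wrapper<'a>(&'a [u8]);

impl<'a> Wrapper<'a> {
    /// `const` constructor: callable in `const` and `static` initializers.
    pub const fn new(bytes: &'a [u8]) -> Self {
        Wrapper(bytes)
    }

    /// `const` getter that returns the wrapped bytes.
    pub const fn into_inner(self) -> &'a [u8] {
        self.0
    }
}

// Both calls below are evaluated at compile time because the functions are `const`.
pub const NAME: Wrapper<'static> = Wrapper::new(b"tag");
pub const RAW: &[u8] = NAME.into_inner();
```

Without the `const` qualifier on `new` and `into_inner`, the two `const` items would not compile; callers that only use the functions at run time are unaffected by the change.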

### Bug Fixes

@@ -183,6 +200,8 @@
- [#459]: Made the `Writer::write()` method non-public as writing random bytes to a document is not generally useful or desirable.
- [#459]: BOM bytes are no longer emitted as `Event::Text`. To write a BOM, use `Writer::write_bom()`.

+- [#467]: Removed `Deserializer::new` because it cannot be used outside of the quick-xml crate

### New Tests

- [#9]: Added tests for incorrect nested tags in input
@@ -227,6 +246,7 @@
[#455]: https://github.com/tafia/quick-xml/pull/455
[#456]: https://github.com/tafia/quick-xml/pull/456
[#459]: https://github.com/tafia/quick-xml/pull/459
+[#467]: https://github.com/tafia/quick-xml/pull/467

## 0.23.0 -- 2022-05-08

96 changes: 0 additions & 96 deletions README.md
@@ -110,102 +110,6 @@ assert_eq!(result, expected.as_bytes());

When using the `serialize` feature, quick-xml can be used with serde's `Serialize`/`Deserialize` traits.

Here is an example deserializing crates.io source:

```rust
// Cargo.toml
// [dependencies]
// serde = { version = "1.0", features = [ "derive" ] }
// quick-xml = { version = "0.22", features = [ "serialize" ] }
use serde::Deserialize;
use quick_xml::de::{from_str, DeError};

#[derive(Debug, Deserialize, PartialEq)]
struct Link {
rel: String,
href: String,
sizes: Option<String>,
}

#[derive(Debug, Deserialize, PartialEq)]
#[serde(rename_all = "lowercase")]
enum Lang {
En,
Fr,
De,
}

#[derive(Debug, Deserialize, PartialEq)]
struct Head {
title: String,
#[serde(rename = "link", default)]
links: Vec<Link>,
}

#[derive(Debug, Deserialize, PartialEq)]
struct Script {
src: String,
integrity: String,
}

#[derive(Debug, Deserialize, PartialEq)]
struct Body {
#[serde(rename = "script", default)]
scripts: Vec<Script>,
}

#[derive(Debug, Deserialize, PartialEq)]
struct Html {
lang: Option<String>,
head: Head,
body: Body,
}

fn crates_io() -> Result<Html, DeError> {
let xml = "<!DOCTYPE html>
<html lang=\"en\">
<head>
<meta charset=\"utf-8\">
<meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\">
<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">

<title>crates.io: Rust Package Registry</title>


<!-- EMBER_CLI_FASTBOOT_TITLE --><!-- EMBER_CLI_FASTBOOT_HEAD -->
<link rel=\"manifest\" href=\"/manifest.webmanifest\">
<link rel=\"apple-touch-icon\" href=\"/cargo-835dd6a18132048a52ac569f2615b59d.png\" sizes=\"227x227\">

<link rel=\"stylesheet\" href=\"/assets/vendor-8d023d47762d5431764f589a6012123e.css\" integrity=\"sha256-EoB7fsYkdS7BZba47+C/9D7yxwPZojsE4pO7RIuUXdE= sha512-/SzGQGR0yj5AG6YPehZB3b6MjpnuNCTOGREQTStETobVRrpYPZKneJwcL/14B8ufcvobJGFDvnTKdcDDxbh6/A==\" >
<link rel=\"stylesheet\" href=\"/assets/cargo-cedb8082b232ce89dd449d869fb54b98.css\" integrity=\"sha256-S9K9jZr6nSyYicYad3JdiTKrvsstXZrvYqmLUX9i3tc= sha512-CDGjy3xeyiqBgUMa+GelihW394pqAARXwsU+HIiOotlnp1sLBVgO6v2ZszL0arwKU8CpvL9wHyLYBIdfX92YbQ==\" >


<link rel=\"shortcut icon\" href=\"/favicon.ico\" type=\"image/x-icon\">
<link rel=\"icon\" href=\"/cargo-835dd6a18132048a52ac569f2615b59d.png\" type=\"image/png\">
<link rel=\"search\" href=\"/opensearch.xml\" type=\"application/opensearchdescription+xml\" title=\"Cargo\">
</head>
<body>
<!-- EMBER_CLI_FASTBOOT_BODY -->
<noscript>
<div id=\"main\">
<div class='noscript'>
This site requires JavaScript to be enabled.
</div>
</div>
</noscript>

<script src=\"/assets/vendor-bfe89101b20262535de5a5ccdc276965.js\" integrity=\"sha256-U12Xuwhz1bhJXWyFW/hRr+Wa8B6FFDheTowik5VLkbw= sha512-J/cUUuUN55TrdG8P6Zk3/slI0nTgzYb8pOQlrXfaLgzr9aEumr9D1EzmFyLy1nrhaDGpRN1T8EQrU21Jl81pJQ==\" ></script>
<script src=\"/assets/cargo-4023b68501b7b3e17b2bb31f50f5eeea.js\" integrity=\"sha256-9atimKc1KC6HMJF/B07lP3Cjtgr2tmET8Vau0Re5mVI= sha512-XJyBDQU4wtA1aPyPXaFzTE5Wh/mYJwkKHqZ/Fn4p/ezgdKzSCFu6FYn81raBCnCBNsihfhrkb88uF6H5VraHMA==\" ></script>

</body>
</html>
}";
let html: Html = from_str(xml)?;
assert_eq!(&html.head.title, "crates.io: Rust Package Registry");
Ok(html)
}
```

### Credits

This has largely been inspired by [serde-xml-rs](https://github.com/RReverser/serde-xml-rs).
2 changes: 1 addition & 1 deletion examples/custom_entities.rs
@@ -33,7 +33,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
loop {
match reader.read_event() {
Ok(Event::DocType(ref e)) => {
-for cap in entity_re.captures_iter(&e) {
+for cap in entity_re.captures_iter(e) {
custom_entities.insert(
reader.decoder().decode(&cap[1])?.into_owned(),
reader.decoder().decode(&cap[2])?.into_owned(),
2 changes: 1 addition & 1 deletion src/de/escape.rs
@@ -25,7 +25,7 @@ pub struct EscapedDeserializer<'a> {
}

impl<'a> EscapedDeserializer<'a> {
-pub fn new(escaped_value: Cow<'a, [u8]>, decoder: Decoder, escaped: bool) -> Self {
+pub const fn new(escaped_value: Cow<'a, [u8]>, decoder: Decoder, escaped: bool) -> Self {
EscapedDeserializer {
decoder,
escaped_value,
4 changes: 2 additions & 2 deletions src/de/map.rs
@@ -621,13 +621,13 @@ where
break match self.map.de.peek()? {
// If we see a tag that we not interested, skip it
#[cfg(feature = "overlapped-lists")]
-DeEvent::Start(e) if !self.filter.is_suitable(&e, decoder)? => {
+DeEvent::Start(e) if !self.filter.is_suitable(e, decoder)? => {
self.map.de.skip()?;
continue;
}
// Stop iteration when list elements ends
#[cfg(not(feature = "overlapped-lists"))]
-DeEvent::Start(e) if !self.filter.is_suitable(&e, decoder)? => Ok(None),
+DeEvent::Start(e) if !self.filter.is_suitable(e, decoder)? => Ok(None),

// Stop iteration after reaching a closing tag
DeEvent::End(e) if e.name() == self.map.start.name() => Ok(None),
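
The `&e` to `e` changes in this hunk look like the usual needless-borrow cleanup: when a `match` scrutinee is already a reference, the bindings inside the pattern are references too, so adding another `&` only creates a reference to a reference that the compiler then has to dereference again. A minimal sketch of that situation follows; `Event`, `is_suitable`, and `first_suitable` are hypothetical stand-ins, not the crate's real types or functions.

```rust
/// Hypothetical event type standing in for the deserializer's event enum.
#[derive(Debug, PartialEq, Eq)]
enum Event {
    Start(String),
    End(String),
}

/// Hypothetical filter that takes the tag name by reference.
fn is_suitable(name: &str) -> bool {
    name != "skip"
}

/// Returns the name of the first start tag that passes the filter.
fn first_suitable(events: &[Event]) -> Option<&str> {
    for event in events {
        // `event` is `&Event`, so `name` is bound as `&String`.
        match event {
            // Passing `name` directly is enough; `&name` would be `&&String`
            // and would only compile thanks to automatic dereferencing.
            Event::Start(name) if is_suitable(name) => return Some(name.as_str()),
            _ => continue,
        }
    }
    None
}
```

Both spellings behave the same at run time, which is why the change is safe to make mechanically.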
111 changes: 3 additions & 108 deletions src/de/mod.rs
@@ -1,109 +1,4 @@
//! Serde `Deserializer` module
//!
//! # Examples
//!
//! Here is a simple example parsing [crates.io](https://crates.io/) source code.
//!
//! ```
//! // Cargo.toml
//! // [dependencies]
//! // serde = { version = "1.0", features = [ "derive" ] }
//! // quick-xml = { version = "0.22", features = [ "serialize" ] }
//! # use pretty_assertions::assert_eq;
//! use serde::Deserialize;
//! use quick_xml::de::{from_str, DeError};
//!
//! #[derive(Debug, Deserialize, PartialEq)]
//! struct Link {
//! rel: String,
//! href: String,
//! sizes: Option<String>,
//! }
//!
//! #[derive(Debug, Deserialize, PartialEq)]
//! #[serde(rename_all = "lowercase")]
//! enum Lang {
//! En,
//! Fr,
//! De,
//! }
//!
//! #[derive(Debug, Deserialize, PartialEq)]
//! struct Head {
//! title: String,
//! #[serde(rename = "link", default)]
//! links: Vec<Link>,
//! }
//!
//! #[derive(Debug, Deserialize, PartialEq)]
//! struct Script {
//! src: String,
//! integrity: String,
//! }
//!
//! #[derive(Debug, Deserialize, PartialEq)]
//! struct Body {
//! #[serde(rename = "script", default)]
//! scripts: Vec<Script>,
//! }
//!
//! #[derive(Debug, Deserialize, PartialEq)]
//! struct Html {
//! lang: Option<String>,
//! head: Head,
//! body: Body,
//! }
//!
//! fn crates_io() -> Result<Html, DeError> {
//! let xml = "<!DOCTYPE html>
//! <html lang=\"en\">
//! <head>
//! <meta charset=\"utf-8\">
//! <meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\">
//! <meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">
//!
//! <title>crates.io: Rust Package Registry</title>
//!
//!
//! <meta name=\"cargo/config/environment\" content=\"%7B%22modulePrefix%22%3A%22cargo%22%2C%22environment%22%3A%22production%22%2C%22rootURL%22%3A%22%2F%22%2C%22locationType%22%3A%22router-scroll%22%2C%22historySupportMiddleware%22%3Atrue%2C%22EmberENV%22%3A%7B%22FEATURES%22%3A%7B%7D%2C%22EXTEND_PROTOTYPES%22%3A%7B%22Date%22%3Afalse%7D%7D%2C%22APP%22%3A%7B%22name%22%3A%22cargo%22%2C%22version%22%3A%22b7796c9%22%7D%2C%22fastboot%22%3A%7B%22hostWhitelist%22%3A%5B%22crates.io%22%2C%7B%7D%2C%7B%7D%5D%7D%2C%22ember-cli-app-version%22%3A%7B%22version%22%3A%22b7796c9%22%7D%2C%22ember-cli-mirage%22%3A%7B%22usingProxy%22%3Afalse%2C%22useDefaultPassthroughs%22%3Atrue%7D%2C%22exportApplicationGlobal%22%3Afalse%7D\" />
//! <!-- EMBER_CLI_FASTBOOT_TITLE --><!-- EMBER_CLI_FASTBOOT_HEAD -->
//! <link rel=\"manifest\" href=\"/manifest.webmanifest\">
//! <link rel=\"apple-touch-icon\" href=\"/cargo-835dd6a18132048a52ac569f2615b59d.png\" sizes=\"227x227\">
//! <meta name=\"theme-color\" content=\"#f9f7ec\">
//! <meta name=\"apple-mobile-web-app-capable\" content=\"yes\">
//! <meta name=\"apple-mobile-web-app-title\" content=\"crates.io: Rust Package Registry\">
//! <meta name=\"apple-mobile-web-app-status-bar-style\" content=\"default\">
//!
//! <link rel=\"stylesheet\" href=\"/assets/vendor-8d023d47762d5431764f589a6012123e.css\" integrity=\"sha256-EoB7fsYkdS7BZba47+C/9D7yxwPZojsE4pO7RIuUXdE= sha512-/SzGQGR0yj5AG6YPehZB3b6MjpnuNCTOGREQTStETobVRrpYPZKneJwcL/14B8ufcvobJGFDvnTKdcDDxbh6/A==\" >
//! <link rel=\"stylesheet\" href=\"/assets/cargo-cedb8082b232ce89dd449d869fb54b98.css\" integrity=\"sha256-S9K9jZr6nSyYicYad3JdiTKrvsstXZrvYqmLUX9i3tc= sha512-CDGjy3xeyiqBgUMa+GelihW394pqAARXwsU+HIiOotlnp1sLBVgO6v2ZszL0arwKU8CpvL9wHyLYBIdfX92YbQ==\" >
//!
//!
//! <link rel=\"shortcut icon\" href=\"/favicon.ico\" type=\"image/x-icon\">
//! <link rel=\"icon\" href=\"/cargo-835dd6a18132048a52ac569f2615b59d.png\" type=\"image/png\">
//! <link rel=\"search\" href=\"/opensearch.xml\" type=\"application/opensearchdescription+xml\" title=\"Cargo\">
//! </head>
//! <body>
//! <!-- EMBER_CLI_FASTBOOT_BODY -->
//! <noscript>
//! <div id=\"main\">
//! <div class='noscript'>
//! This site requires JavaScript to be enabled.
//! </div>
//! </div>
//! </noscript>
//!
//! <script src=\"/assets/vendor-bfe89101b20262535de5a5ccdc276965.js\" integrity=\"sha256-U12Xuwhz1bhJXWyFW/hRr+Wa8B6FFDheTowik5VLkbw= sha512-J/cUUuUN55TrdG8P6Zk3/slI0nTgzYb8pOQlrXfaLgzr9aEumr9D1EzmFyLy1nrhaDGpRN1T8EQrU21Jl81pJQ==\" ></script>
//! <script src=\"/assets/cargo-4023b68501b7b3e17b2bb31f50f5eeea.js\" integrity=\"sha256-9atimKc1KC6HMJF/B07lP3Cjtgr2tmET8Vau0Re5mVI= sha512-XJyBDQU4wtA1aPyPXaFzTE5Wh/mYJwkKHqZ/Fn4p/ezgdKzSCFu6FYn81raBCnCBNsihfhrkb88uF6H5VraHMA==\" ></script>
//!
//!
//! </body>
//! </html>
//! }";
//! let html: Html = from_str(xml)?;
//! assert_eq!(&html.head.title, "crates.io: Rust Package Registr");
//! Ok(html)
//! }
//! ```

// Macros should be defined before the modules that using them
// Also, macros should be imported before using them
@@ -234,7 +129,7 @@ pub(crate) const UNFLATTEN_PREFIX: &str = "$unflatten=";
pub(crate) const PRIMITIVE_PREFIX: &str = "$primitive=";

/// Simplified event which contains only these variants that used by deserializer
-#[derive(Debug, PartialEq)]
+#[derive(Debug, PartialEq, Eq)]
pub enum DeEvent<'a> {
/// Start tag (with attributes) `<tag attr="value">`.
Start(BytesStart<'a>),
@@ -361,7 +256,7 @@ where
///
/// - [`Deserializer::from_str`]
/// - [`Deserializer::from_reader`]
-pub fn new(reader: R) -> Self {
+fn new(reader: R) -> Self {
Deserializer {
reader,

@@ -448,7 +343,7 @@ where
self.read.push_front(self.reader.next()?);
}
if let Some(event) = self.read.front() {
-return Ok(&event);
+return Ok(event);
}
// SAFETY: `self.read` was filled in the code above.
// NOTE: Can be replaced with `unsafe { std::hint::unreachable_unchecked() }`
4 changes: 2 additions & 2 deletions src/de/simple_type.rs
@@ -546,7 +546,7 @@ impl<'de, 'a> SimpleTypeDeserializer<'de, 'a> {

/// Constructor for tests
#[inline]
-fn new(content: CowRef<'de, 'a>, escaped: bool, decoder: Decoder) -> Self {
+const fn new(content: CowRef<'de, 'a>, escaped: bool, decoder: Decoder) -> Self {
Self {
content,
escaped,
@@ -1218,7 +1218,7 @@ mod tests {
fn to_utf16(string: &str) -> Vec<u8> {
let mut bytes = Vec::new();
for ch in string.encode_utf16() {
-bytes.extend(&ch.to_le_bytes());
+bytes.extend_from_slice(&ch.to_le_bytes());
}
bytes
}
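
For reference, the helper touched in this hunk can be reproduced outside the crate with the standard library alone. The version below is a sketch under the assumption that the test helper produces UTF-16LE without a byte-order mark; the name `to_utf16_le` and the capacity hint are additions for illustration and are not in the PR.

```rust
/// Encodes `string` as UTF-16LE bytes without a byte-order mark.
fn to_utf16_le(string: &str) -> Vec<u8> {
    // Each UTF-16 code unit takes 2 bytes, and a string never has more
    // UTF-16 code units than UTF-8 bytes, so this is an upper bound.
    let mut bytes = Vec::with_capacity(string.len() * 2);
    for unit in string.encode_utf16() {
        // `to_le_bytes` returns `[u8; 2]`; `extend_from_slice` appends it
        // as a slice instead of going through a by-reference iterator.
        bytes.extend_from_slice(&unit.to_le_bytes());
    }
    bytes
}
```

`extend(&array)` and `extend_from_slice(&array)` append the same bytes here; the second form states the intent more directly for a slice of `Copy` values.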
2 changes: 1 addition & 1 deletion src/encoding.rs
@@ -90,7 +90,7 @@ impl Decoder {
/// This encoding will be used by [`decode`].
///
/// [`decode`]: Self::decode
-pub fn encoding(&self) -> &'static Encoding {
+pub const fn encoding(&self) -> &'static Encoding {
self.encoding
}
