Getting "invalid type: map, expected a sequence" when deserialising #390

Closed
davystrong opened this issue May 29, 2022 · 16 comments
Labels: arrays (issues related to mapping XML content onto arrays using serde), serde (issues related to mapping from Rust types to XML), wontfix

Comments

@davystrong

Hello. First of all, thanks for all the great work on this crate!

I'm trying to use the serde implementation to deserialise some XML that looks like this:

<Attribute Name="Example">
    <Array>
        <DataObject ObjectType="TestObject">A</DataObject>
        <DataObject ObjectType="TestObject">B</DataObject>
        <DataObject ObjectType="TestObject">C</DataObject>
    </Array>
</Attribute>

I would like to read this as:

struct Example {
    value: Vec<TestObject>,
}

Reading the DataObject as a TestObject is easy since all DataObjects in this context will be TestObjects. The Example part was harder, but I got it working using an enum and serde's tag. However, I can't get the Vec to read correctly. I'm getting the error Custom("invalid type: map, expected a sequence").

To make sure there wasn't anything more complex causing a problem, I also created an Array object which only contains the Vec, but still get the same error.

If I remove the parent Example object and just try to deserialise the Array, it works as expected.

I also tried serialising the data structure I want, and I get exactly the output I expect, but deserialising this output fails with the error. This is strange since (as far as I can tell) my serialisation logic is identical to my deserialisation logic.

I also have some code here that reproduces this last example:

Example
use std::{error::Error, fmt::Display};

use serde::{Serialize, Deserialize};
use serde_with::serde_as;

#[derive(Serialize, Deserialize, Debug)]
struct Array<T> {
    #[serde(rename = "DataObject")]
    items: Vec<T>,
}

#[derive(Serialize, Deserialize, Debug, Default)]
struct DataObject {
    #[serde(rename = "ObjectType")]
    object_type: String,

    #[serde(rename = "Attribute", default)]
    attributes: Vec<Attribute>,

    #[serde(rename = "$value")]
    data: String,
}

impl From<TestObject> for DataObject {
    fn from(value: TestObject) -> Self {
        Self { object_type: stringify!(TestObject).into(), attributes: Vec::new(), data: value.data }
    }
}

#[derive(Debug)]
struct BadObjectTypeError {
    expected_object_type: String,
    actual_object_type: String,
}

impl Display for BadObjectTypeError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "Bad object type. Expected {}, found {}", self.expected_object_type, self.actual_object_type)
    }
}

impl Error for BadObjectTypeError {}

impl TryFrom<DataObject> for TestObject {
    type Error = BadObjectTypeError;

    fn try_from(value: DataObject) -> Result<Self, Self::Error> {
        if "TestObject" == value.object_type {
            Ok(Self {
                data: value.data,
            })
        } else {
            Err(BadObjectTypeError {
                expected_object_type: std::any::type_name::<Self>().into(),
                actual_object_type: value.object_type,
            })
        }
    }
}

#[derive(Serialize, Clone, Deserialize, Debug)]
#[serde(try_from = "DataObject", into = "DataObject")]
struct TestObject {
    #[serde(rename = "$value")]
    data: String,
}

#[serde_as]
#[derive(Serialize, Deserialize, Debug)]
#[serde(tag = "Name", rename_all = "SCREAMING_SNAKE_CASE")]
enum Attribute {
    Example {
        #[serde(rename = "Array")]
        value: Array<TestObject>
    }
}

fn main() {
    let attribute = Attribute::Example { value: Array { items: vec![TestObject { data: "A".into() }, TestObject { data: "B".into() }, TestObject { data: "C".into() }] } };
    let xml = quick_xml::se::to_string(&attribute).unwrap();
    println!("{}", &xml);

    let xd = &mut quick_xml::de::Deserializer::from_str(&xml);
    let import: Attribute = serde_path_to_error::deserialize(xd).unwrap();
    dbg!(&import);
}

Would really appreciate any guidance with this!

@Mingun
Collaborator

Mingun commented May 29, 2022

Your XML can be mapped to these Rust types:

#[serde(rename_all = "PascalCase")]
struct Attribute {
  name: String,
  array: Array,
}
#[serde(rename_all = "PascalCase")]
struct Array {
  data_object: Vec<DataObject>,
}
struct DataObject;

These are the basic structs; you can use some of the techniques described in #365 (comment) to get rid of the intermediate Array struct in your final types.

Using an enum for Attribute is probably not possible because of serde-rs/serde#1183. We have a test for your situation:

quick-xml/tests/serde-de.rs

Lines 1101 to 1136 in 532990d

mod struct_ {
    use super::*;
    use pretty_assertions::assert_eq;

    #[test]
    #[ignore = "Prime cause: deserialize_any under the hood + https://github.com/serde-rs/serde/issues/1183"]
    fn elements() {
        let data: Node = from_str(
            r#"<root><tag>Struct</tag><float>42</float><string>answer</string></root>"#,
        )
        .unwrap();
        assert_eq!(
            data,
            Node::Struct {
                float: "42".into(),
                string: "answer".into()
            }
        );
    }

    #[test]
    fn attributes() {
        let data: Node = from_str(
            // Comment for prevent unnecessary formatting - we use the same style in all tests
            r#"<root tag="Struct" float="42" string="answer"/>"#,
        )
        .unwrap();
        assert_eq!(
            data,
            Node::Struct {
                float: "42".into(),
                string: "answer".into()
            }
        );
    }
}

@davystrong
Author

@Mingun wow, thanks for getting back so quickly! Your code works, though as you say a simple adaptation to an enum with tag fails. I assume a custom deserialiser can't solve this problem?

Also, obviously the code I posted was a simplified example. I had been working with enums with custom tags before, but each item only held the $value field (text). Can you explain why this worked? Is this because it needs a buffer (of some sort) to use tag and the assumptions made by base serde happen to work with simple strings?

@Mingun
Collaborator

Mingun commented May 30, 2022

A custom Deserialize implementation for Attribute can solve this. The problem is that serde first reads all fields into an internal buffer and then tries to deserialize your struct from that buffer. When serde fills the buffer, it requests deserializer-provided types from the deserializer using deserialize_any. The XML deserializer cannot distinguish between a lone item and a sequence without a hint (that is, a call to deserialize_seq / deserialize_tuple / deserialize_tuple_struct), so

<Array>
  <DataObject .../>
  <DataObject .../>
  <DataObject .../><!-- only this element will be buffered, overriding the previous two -->
</Array>

is always represented in the buffer as something like

struct AnyName {
  DataObject: Map<serde::Value, serde::Value>,
}

So, when you try to deserialize

struct Array<T> {
    #[serde(rename = "DataObject")]
    items: Vec<T>,
}

from that content, serde tries to deserialize Vec<T> from the Map<serde::Value, serde::Value> by calling Map::deserialize_seq, but that implementation cannot provide the data and returns the error invalid type: map, expected a sequence.
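A toy, std-only model of the buffering described above may make the failure easier to follow. It is only a sketch: the `Buffered` enum and the helpers `data_object`, `buffer_children`, `expect_seq` and `buffered_array` are hypothetical names invented for illustration, not serde's or quick-xml's real internals. The model assumes, as the explanation says, that each child name gets a single slot and that the buffer can hold only strings and maps.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a buffered value (hypothetical, not serde's real type).
/// Without a type hint XML yields only strings and maps, so there is
/// deliberately no sequence variant.
#[derive(Debug, Clone, PartialEq)]
enum Buffered {
    Text(String),
    Map(BTreeMap<String, Buffered>),
}

/// Buffered form of `<DataObject ObjectType="...">text</DataObject>`.
fn data_object(object_type: &str, text: &str) -> Buffered {
    let mut fields = BTreeMap::new();
    fields.insert("ObjectType".to_string(), Buffered::Text(object_type.to_string()));
    fields.insert("$value".to_string(), Buffered::Text(text.to_string()));
    Buffered::Map(fields)
}

/// Buffers the children of one element without any hint: every child name
/// gets one slot, so a repeated name overwrites the earlier entry.
fn buffer_children(children: Vec<(&str, Buffered)>) -> BTreeMap<String, Buffered> {
    let mut slots = BTreeMap::new();
    for (name, value) in children {
        slots.insert(name.to_string(), value);
    }
    slots
}

/// What a `Vec<T>` field asks of the buffer: a sequence. The toy buffer
/// never holds one, so the request always fails.
fn expect_seq(value: &Buffered) -> Result<Vec<Buffered>, String> {
    let found = match value {
        Buffered::Text(_) => "string",
        Buffered::Map(_) => "map",
    };
    Err(format!("invalid type: {found}, expected a sequence"))
}

/// The `<Array>` element from the issue, after hint-less buffering.
fn buffered_array() -> BTreeMap<String, Buffered> {
    buffer_children(vec![
        ("DataObject", data_object("TestObject", "A")),
        ("DataObject", data_object("TestObject", "B")),
        ("DataObject", data_object("TestObject", "C")),
    ])
}

fn main() {
    let array = buffered_array();
    println!("buffered slots: {}", array.len());
    println!("{:?}", expect_seq(&array["DataObject"]));
}
```

In this model only the last DataObject survives buffering, and asking the remaining map for a sequence produces the same message as the reported error.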

Can you explain why this worked?

Internally, XML has only two types, string and map, and only those two types can be deserialized after passing through serde's buffering (i.e. after calling deserialize_any on the deserializer). All other types are supported only because our deserializer understands the type hints that the deserialized types provide to it through the various deserialize_* methods. The general-purpose deserializer from serde cannot do that, of course.
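The role of type hints can be sketched with two std-only helper functions. This is an illustration under simplifying assumptions, not quick-xml's implementation: `Child`, `ARRAY`, `read_any` and `read_seq` are hypothetical names, and each child element is reduced to a (tag name, text) pair.

```rust
/// One parsed child element as (tag name, text content). Hypothetical
/// simplification; attributes are ignored.
type Child<'a> = (&'a str, &'a str);

/// The three `<DataObject>` children of `<Array>` from the issue.
const ARRAY: [Child<'static>; 3] = [
    ("DataObject", "A"),
    ("DataObject", "B"),
    ("DataObject", "C"),
];

/// No hint: the caller gets a single value per name, so only the last
/// element with that name is visible.
fn read_any<'a>(children: &[Child<'a>], name: &str) -> Option<&'a str> {
    children
        .iter()
        .rev()
        .find(|(n, _)| *n == name)
        .map(|(_, text)| *text)
}

/// Sequence hint: the caller said it expects a list, so every element
/// with that name is collected in document order.
fn read_seq<'a>(children: &[Child<'a>], name: &str) -> Vec<&'a str> {
    children
        .iter()
        .filter(|(n, _)| *n == name)
        .map(|(_, text)| *text)
        .collect()
}

fn main() {
    println!("without hint: {:?}", read_any(&ARRAY, "DataObject"));
    println!("with seq hint: {:?}", read_seq(&ARRAY, "DataObject"));
}
```

The input is identical in both calls; only the expectation passed in by the caller changes what comes back, which is the information that gets lost once values pass through a generic buffer.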

@Mingun Mingun added wontfix serde Issues related to mapping from Rust types to XML labels May 30, 2022
@davystrong
Author

@Mingun thanks, that makes sense! I can't test this right now but I'll implement a custom deserialiser this evening. Thanks for your help! I'll close this issue.

@BratSinot

Greetings!

So, no way to deserialize such XML into array without custom deserializer?

@davystrong
Author

I didn't even manage it with a custom deserialiser (though there may be some way that I missed). I ended up using quick-xml directly without the serde interface. Not as nice, since you have to implement serialisation and deserialisation separately, but raw quick-xml is actually very nice to use.

@Mingun
Collaborator

Mingun commented Jul 21, 2022

@BratSinot, as I already said, you can use deserialize_with, but you still need an intermediate struct (but it can be defined inside the helper function)

@BratSinot

BratSinot commented Jul 22, 2022

@BratSinot, as I already said, you can use deserialize_with, but you still need an intermediate struct (but it can be defined inside the helper function)

But I don't see how it would help in a case like this:

use serde::{Deserialize};

type BoxedStr = Box<str>;
type BoxedArray<T> = Box<[T]>;

#[derive(Debug, Deserialize)]
struct Data {
    #[serde(rename = "Value")]
    value: Value,
}

#[derive(Debug, Deserialize)]
struct Value {
    #[serde(flatten)]
    data: Actions,
}

#[derive(Debug, Deserialize)]
enum Actions {
    Foo(Foo)
}

#[derive(Debug, Deserialize)]
struct Foo {
    #[serde(rename = "Bar")]
    bar: Bar,
}

#[derive(Debug, Deserialize)]
struct Bar {
    #[serde(rename = "list")]
    lists: BoxedArray<List>,
}

#[derive(Debug, Deserialize)]
pub struct List {
    key: BoxedStr,
    /*#[serde(flatten)]
    values: BoxedArray<Value>*/
}

fn main() {
    let xml_str = r#"<Data>
    <Value>
            <Foo>
                <Bar>
                    <list key="section0">
                        <str value="value00" key="name00"/>
                        <str value="value01" key="name01"/>
                    </list>
                    <list key="section1">
                        <str value="value10" key="name10"/>
                        <str value="value11" key="name11"/>
                    </list>
                </Bar>
            </Foo>
    </Value>
</Data>"#;

    let data = quick_xml::de::from_str::<Data>(&xml_str).unwrap();

    println!("{data:?}");
}

@Mingun
Collaborator

Mingun commented Jul 22, 2022

@BratSinot, could you explain what your final struct should look like? The topic starter had a problem with removing intermediate structs, but in your case I don't understand which intermediate structs you want to eliminate.

@BratSinot

BratSinot commented Jul 22, 2022

Topicstarter had a problem with removing intermediate structs

Oh, sorry, I missed that part. I thought I had found a problem similar to mine =/

The final result should look like this:

#[derive(Debug, Deserialize)]
struct Bar {
    list: BoxedArray<ListItem>,
}

#[derive(Debug, Deserialize)]
pub struct ListItem {
    key: BoxedStr,
    values: BoxedArray<Value>
}

#[derive(Debug, Deserialize)]
pub enum Value {
    #[serde(rename = "str")]
    Str {
        key: BoxedStr,
        value: BoxedStr,
    },
    
    #[serde(rename = "str")]
    Int {
        key: BoxedStr,
        value: u64,
    },
}

@Mingun
Collaborator

Mingun commented Jul 22, 2022

Change ListItem to

#[derive(Debug, Deserialize)]
pub struct ListItem {
    key: BoxedStr,
    #[serde(rename = "$value")]
    values: BoxedArray<Value>
}

and it should work on master, thanks to #387.

Minor question: why Box<str> and Box<[T]> instead of String and Vec<T>? quick_xml does not support no_std mode where these types could be useful

@BratSinot

Minor question: why Box<str> and Box<[T]> instead of String and Vec<T>? quick_xml does not support no_std mode where these types could be useful

Because I don't need mutability, and they use less memory. Box<str> and Box<[T]> are smaller than String and Vec<T> by 4 or 8 bytes (depending on the CPU architecture) because they do not store a capacity counter.
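The size difference can be checked with `std::mem::size_of`. This is a small sketch, assuming the layout used by current Rust compilers (pointer, capacity and length for `String` / `Vec<T>`; pointer and length for the boxed forms); the language does not formally guarantee these layouts. The helpers `saved_per_string` and `saved_per_vec` are hypothetical names.

```rust
use std::mem::size_of;

/// Bytes saved per value by storing `Box<str>` instead of `String`.
fn saved_per_string() -> usize {
    size_of::<String>() - size_of::<Box<str>>()
}

/// Bytes saved per value by storing `Box<[T]>` instead of `Vec<T>`.
fn saved_per_vec<T>() -> usize {
    size_of::<Vec<T>>() - size_of::<Box<[T]>>()
}

fn main() {
    println!("usize:        {} bytes", size_of::<usize>());
    println!("String:       {} bytes", size_of::<String>());
    println!("Box<str>:     {} bytes", size_of::<Box<str>>());
    println!("saved (str):  {} bytes", saved_per_string());
    println!("saved (u64s): {} bytes", saved_per_vec::<u64>());

    // Converting drops any spare capacity as well as the capacity field.
    let mut owned = String::with_capacity(64);
    owned.push_str("section0");
    let boxed: Box<str> = owned.into_boxed_str();
    println!("{boxed} ({} bytes of text)", boxed.len());
}
```

The saving is one machine word per value in the containing struct; `into_boxed_str` and `into_boxed_slice` additionally release unused heap capacity, which may involve a reallocation.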

@BratSinot

BratSinot commented Jul 22, 2022

Change ListItem to

#[derive(Debug, Deserialize)]
pub struct ListItem {
    key: BoxedStr,
    #[serde(rename = "$value")]
    values: BoxedArray<Value>
}

and it should work on master, thanks to #387.

The problem comes before ListItem:

#[derive(Debug, Deserialize)]
struct Bar {
    #[serde(rename = "$value")]
    list: BoxedArray<ListItem>,
}

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom("missing field `$value`")', src/main.rs:60:58
stack backtrace:
quick-xml = { git = "https://github.com/tafia/quick-xml.git", rev = "ebbcce0bf7ce4ebff6d50c79dd7dc5545b813a9e", features = ["serialize"] }

@Mingun
Collaborator

Mingun commented Jul 22, 2022

In Bar you don't need the special name $value; it is needed only when you have a list of enums. A list of structs should be parsed without this rename. See the supported variants in the seq tests.

@BratSinot

In Bar you don't need to use a special name $value, it is needed only when you have a list of enums.

In that case, I get the same error as the topic starter:

#[derive(Debug, Deserialize)]
struct Bar {
    list: BoxedArray<ListItem>,
}

#[derive(Debug, Deserialize)]
pub struct ListItem {
    key: BoxedStr,
    /*#[serde(flatten)]
    values: BoxedArray<Value>*/
}

Custom("invalid type: map, expected a sequence")

@BratSinot

BratSinot commented Jul 22, 2022

In Bar you don't need to use a special name $value, it is needed only when you have a list of enums. List of structs should be parsed without this rename. See the supported variants in the seq tests

Okay, it seems I found the problem.

#[derive(Debug, Deserialize)]
struct Value {
-    #[serde(flatten)]
+    #[serde(rename = "$value")]
    data: Actions,
}

It seems my next step will be to use this in my actual code =)

Thanks a lot for your time and help!
P.S. Yeah, I finally got what I wanted on the actual data, big, big thanks! ^_^

@Mingun Mingun added the arrays Issues related to mapping XML content onto arrays using serde label Aug 25, 2022