Repeated choice plus other field #683

Open
Gaelan opened this issue Nov 17, 2023 · 6 comments · May be fixed by #702 or #736
Labels
arrays (Issues related to mapping XML content onto arrays using serde), bug, help wanted, namespaces (Issues related to namespaces support), serde (Issues related to mapping from Rust types to XML)

Comments

@Gaelan

Gaelan commented Nov 17, 2023

I've got some XML that looks like this:

<outer>
  <one />
  <two />
  <three />
  <one />
  <field>5</field> <!-- optional -->
</outer>

Which I want to parse with something like:

use serde::Deserialize;

#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum Choice { One, Two, Three }

#[derive(Deserialize)]
struct Outer {
  // `$value` collects the child elements that are not matched by another field
  #[serde(rename = "$value")]
  choices: Vec<Choice>,
  field: Option<u32>,
}

But this fails, with:

unknown variant `field`, expected one of `One`, `Two`, `Three`

Is there a trick to making this work? I could just put field into the Choice enum, but that doesn't really make sense semantically, so I'd rather not.

@Mingun
Collaborator

Mingun commented Nov 17, 2023

Could you post the complete code that reproduces the problem (ideally as a Rust test case)? We already have a test case for exactly this scenario:

/// In those tests non-sequential field is defined in the struct
/// after sequential, so it will be deserialized after the list.
/// That struct should be deserialized from the XML where these
/// fields comes in an arbitrary order
mod field_after_list {
    use super::*;
    use pretty_assertions::assert_eq;

    #[derive(Debug, PartialEq, Deserialize)]
    struct Root {
        #[serde(rename = "$value")]
        item: Vec<Choice>,
        node: (),
    }
    // ... (rest of the module is not shown in this excerpt)
}

Mingun added the serde and arrays labels on Nov 17, 2023
@Gaelan
Author

Gaelan commented Nov 17, 2023

Huh, strange.

The data in question is from the UK's train status API.

#[derive(Deserialize, Debug)]
enum ScheduleLocation {
    #[serde(rename = "OR")]
    Origin,
    #[serde(rename = "OPOR")]
    OperationalOrigin,
    #[serde(rename = "PP")]
    PassingPoint,
    #[serde(rename = "IP")]
    IntermediatePoint,
    #[serde(rename = "OPIP")]
    OperationalIntermediatePoint,
    #[serde(rename = "DT")]
    Destination,
    #[serde(rename = "OPDT")]
    OperationalDestination,
}

#[derive(Deserialize, Debug)]
struct Schedule {
    #[serde(rename = "@rid")]
    rid: String,
    cancelReason: Option<u32>,
    #[serde(rename = "$value")]
    locations: Vec<ScheduleLocation>,
}

#[derive(Deserialize, Debug)]
enum Event {
    #[serde(rename = "schedule")]
    Schedule(Schedule),
    #[serde(rename = "association")]
    Association,
    #[serde(rename = "TS")]
    TrainStatus,
    #[serde(rename = "trainOrder")]
    TrainOrder,
    #[serde(rename = "OW")]
    StationMessage,
    #[serde(rename = "trainAlert")]
    TrainAlert,
    #[serde(rename = "trackingID")]
    TrackingId,
    #[serde(rename = "alarm")]
    Alarm,
    #[serde(rename = "scheduleFormations")]
    ScheduleFormations,
    #[serde(rename = "formationLoading")]
    FormationLoading,
}

#[derive(Deserialize, Debug)]
struct UpdateInner {
    #[serde(rename = "$value")]
    events: Vec<Event>,
}

#[derive(Deserialize, Debug)]
enum UpdateType {
    // updates are contained in either an sR or uR element; we don't actually care which
    #[serde(rename = "sR")]
    Sr(UpdateInner),
    #[serde(rename = "uR")]
    Ur(UpdateInner),
}

#[derive(Deserialize, Debug)]
struct PportUpdate {
    #[serde(rename = "$value")]
    inner: UpdateType,
}

#[test]
fn quick_xml_weirdness() {
    let xml = r#"
        <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
        <Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v16" xmlns:ns2="http://www.thalesgroup.com/rtti/PushPort/Schedules/v3" xmlns:ns3="http://www.thalesgroup.com/rtti/PushPort/Schedules/v2" xmlns:ns4="http://www.thalesgroup.com/rtti/PushPort/Formations/v2" xmlns:ns5="http://www.thalesgroup.com/rtti/PushPort/Forecasts/v3" xmlns:ns6="http://www.thalesgroup.com/rtti/PushPort/Formations/v1" xmlns:ns7="http://www.thalesgroup.com/rtti/PushPort/StationMessages/v1" xmlns:ns8="http://www.thalesgroup.com/rtti/PushPort/TrainAlerts/v1" xmlns:ns9="http://www.thalesgroup.com/rtti/PushPort/TrainOrder/v1" xmlns:ns10="http://www.thalesgroup.com/rtti/PushPort/TDData/v1" xmlns:ns11="http://www.thalesgroup.com/rtti/PushPort/Alarms/v1" xmlns:ns12="http://thalesgroup.com/RTTI/PushPortStatus/root_1" ts="2023-11-16T21:14:11.2579109Z" version="16.0">
            <sR>
                <schedule rid="202311168055118" uid="P55118" trainId="1C61" ssd="2023-11-16" toc="NT" trainCat="XX">
                    <ns2:OR wtd="19:29" tpl="MNCRIAP" act="TB" can="true" ptd="19:29"/>
                    <ns2:PP wtp="19:30:30" tpl="HLDGWJ" can="true"/>
                    <ns2:PP wtp="19:31:30" tpl="HLDG" can="true"/>
                    <ns2:IP wta="19:34" wtd="19:34:30" tpl="GATLEY" act="T " can="true" pta="19:34" ptd="19:34"/>
                    <ns2:PP wtp="19:36" tpl="EDIDBRY" can="true"/>
                    <ns2:PP wtp="19:37" tpl="BAGE" can="true"/>
                    <ns2:PP wtp="19:38" tpl="MLDTHRD" can="true"/>
                    <ns2:PP wtp="19:40:30" tpl="SLDLJN" can="true"/>
                    <ns2:PP wtp="19:44" tpl="ARDWCKJ" can="true"/>
                    <ns2:IP wta="19:47" wtd="19:50" tpl="MNCRPIC" act="T " can="true" pta="19:47" ptd="19:50"/>
                    <ns2:IP wta="19:51:30" wtd="19:53:30" tpl="MNCROXR" act="T " can="true" pta="19:52" ptd="19:53"/>
                    <ns2:IP wta="19:55" wtd="19:56" tpl="MNCRDGT" act="T " can="true" pta="19:55" ptd="19:56"/>
                    <ns2:PP wtp="19:57" tpl="WATSTJN" can="true"/>
                    <ns2:PP wtp="19:58" tpl="ORDSLLJ" can="true"/>
                    <ns2:PP wtp="19:59" tpl="SLFDCT" can="true"/>
                    <ns2:IP wta="20:07:30" wtd="20:08:30" tpl="BOLTON" act="T " can="true" pta="20:08" ptd="20:08"/>
                    <ns2:PP wtp="20:12" tpl="LOSTCKJ" can="true"/>
                    <ns2:PP wtp="20:13" tpl="HORWICH" can="true"/>
                    <ns2:PP wtp="20:15" tpl="BLRD" can="true"/>
                    <ns2:PP wtp="20:16" tpl="ADNL" can="true"/>
                    <ns2:IP wta="20:19:30" wtd="20:20:30" tpl="CHORLEY" act="T " can="true" pta="20:20" ptd="20:20"/>
                    <ns2:PP wtp="20:24" tpl="CHORBUK" can="true"/>
                    <ns2:PP wtp="20:26" tpl="EUXTONJ" can="true"/>
                    <ns2:PP wtp="20:30" tpl="PRSTRJN" can="true"/>
                    <ns2:IP wta="20:31:30" wtd="20:33:30" tpl="PRST" act="T " can="true" pta="20:32" ptd="20:33"/>
                    <ns2:PP wtp="20:34:30" tpl="PRSTNFJ" can="true"/>
                    <ns2:PP wtp="20:38" tpl="BBGHGL" can="true"/>
                    <ns2:PP wtp="20:41" tpl="GSTANG" can="true"/>
                    <ns2:IP wta="20:50" wtd="20:51" tpl="LANCSTR" act="T " can="true" pta="20:50" ptd="20:51"/>
                    <ns2:PP wtp="20:53" tpl="MORCMSJ" can="true"/>
                    <ns2:PP wtp="20:58" tpl="CRNFNJN" can="true"/>
                    <ns2:IP wta="20:59:30" wtd="21:00:30" tpl="CRNF" act="T " can="true" pta="21:00" ptd="21:00"/>
                    <ns2:IP wta="21:05" wtd="21:06" tpl="SDAL" act="T " can="true" pta="21:05" ptd="21:06"/>
                    <ns2:IP wta="21:09:30" wtd="21:10:30" tpl="ARNSIDE" act="T " can="true" pta="21:10" ptd="21:10"/>
                    <ns2:IP wta="21:15" wtd="21:16" tpl="GOVS" act="T " can="true" pta="21:15" ptd="21:16"/>
                    <ns2:IP wta="21:19" wtd="21:20" tpl="KTBK" act="T " can="true" pta="21:19" ptd="21:20"/>
                    <ns2:IP wta="21:23:30" wtd="21:24" tpl="CARK" act="T " can="true" pta="21:24" ptd="21:24"/>
                    <ns2:IP wta="21:30:30" wtd="21:31:30" tpl="ULVRSTN" act="T " can="true" pta="21:31" ptd="21:31"/>
                    <ns2:IP wta="21:39" wtd="21:40" tpl="DALTON" act="T " can="true" pta="21:39" ptd="21:40"/>
                    <ns2:PP wtp="21:41:30" tpl="DALTONJ" can="true"/>
                    <ns2:IP wta="21:45:30" wtd="21:46:30" tpl="ROOSE" act="T " can="true" pta="21:46" ptd="21:46"/>
                    <ns2:DT wta="21:53" pta="21:53" tpl="BAROW" act="TF" can="true"/>
                    <ns2:cancelReason>918</ns2:cancelReason>
                </schedule>
            </sR>
        </Pport>"#;
    let result = quick_xml::de::from_str::<PportUpdate>(xml);
    dbg!(&result);
    assert!(result.is_ok());
}

Results in:

running 1 test
test quick_xml_weirdness ... FAILED

failures:

---- quick_xml_weirdness stdout ----
[src/main.rs:139] &result = Err(
    Custom(
        "unknown variant `cancelReason`, expected one of `OR`, `OPOR`, `PP`, `IP`, `OPIP`, `DT`, `OPDT`",
    ),
)
thread 'quick_xml_weirdness' panicked at 'assertion failed: result.is_ok()', src/main.rs:140:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    quick_xml_weirdness

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

@Mingun
Collaborator

Mingun commented Nov 17, 2023

The problem is the namespace prefix. Namespaces are not yet properly supported by quick-xml's serde deserializer. If you remove the ns2: prefix from cancelReason in the following test, it passes:

#[test]
fn issue683() {
    #[derive(Deserialize, Debug, PartialEq)]
    enum ScheduleLocation {
        #[serde(rename = "DT")]
        Destination,
    }

    #[derive(Deserialize, Debug, PartialEq)]
    struct Schedule {
        cancelReason: Option<u32>,
        #[serde(rename = "$value")]
        locations: Vec<ScheduleLocation>,
    }
    let xml = r#"
        <schedule xmlns:ns2="http://www.thalesgroup.com/rtti/PushPort/Schedules/v3">
            <ns2:DT/>
            <ns2:cancelReason>918</ns2:cancelReason>
        </schedule>"#;
    let result = quick_xml::de::from_str::<Schedule>(xml);
    dbg!(&result);
    assert_eq!(result.unwrap(), Schedule {
        cancelReason: Some(918),
        locations: vec![ScheduleLocation::Destination],
    });
}

Mingun added the bug, help wanted and namespaces labels on Nov 17, 2023
@Mingun
Collaborator

Mingun commented Nov 17, 2023

Because your XML uses namespaces, take a look at xmlserde, which should support namespaces better. It is based on quick-xml, so it should be as fast as quick-xml's deserializer.

@Gaelan
Author

Gaelan commented Nov 17, 2023

Thanks!

(I don't actually care about the namespaces, so I'd be equally happy just stripping them; but there doesn't seem to be an easy way to do this.)
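For anyone who only needs the prefixes gone, one possible stopgap is to strip them from element names before handing the text to the deserializer. The following is a minimal sketch using only the standard library; `strip_element_prefixes` is a hypothetical helper, not part of quick-xml. It assumes well-formed input without a DOCTYPE internal subset, leaves attributes (including the `xmlns:*` declarations) untouched, and will merge elements that differ only by prefix.

```rust
/// Hypothetical helper: removes namespace prefixes from element names,
/// so `<ns2:PP .../>` becomes `<PP .../>`. Attributes are copied unchanged.
fn strip_element_prefixes(xml: &str) -> String {
    let mut out = String::with_capacity(xml.len());
    let mut rest = xml;
    while let Some(lt) = rest.find('<') {
        // Copy the text content before the next piece of markup verbatim.
        out.push_str(&rest[..lt]);
        rest = &rest[lt..];

        // Comments, CDATA sections, processing instructions and simple
        // `<!...>` declarations are copied as-is.
        let verbatim_end = if rest.starts_with("<!--") {
            Some("-->")
        } else if rest.starts_with("<![CDATA[") {
            Some("]]>")
        } else if rest.starts_with("<?") {
            Some("?>")
        } else if rest.starts_with("<!") {
            Some(">")
        } else {
            None
        };
        if let Some(end) = verbatim_end {
            let stop = rest.find(end).map_or(rest.len(), |i| i + end.len());
            out.push_str(&rest[..stop]);
            rest = &rest[stop..];
            continue;
        }

        // Start or end tag: emit `<` or `</`, then the name without its prefix.
        let open_len = if rest.starts_with("</") { 2 } else { 1 };
        out.push_str(&rest[..open_len]);
        rest = &rest[open_len..];
        let name_len = rest
            .find(|c: char| c.is_whitespace() || c == '/' || c == '>')
            .unwrap_or(rest.len());
        let name = &rest[..name_len];
        out.push_str(name.rsplit(':').next().unwrap_or(name));
        rest = &rest[name_len..];

        // Copy the remainder of the tag, tracking quotes so that a `>`
        // inside an attribute value does not end the tag early.
        let mut quote: Option<char> = None;
        let mut tag_len = rest.len();
        for (i, c) in rest.char_indices() {
            match quote {
                Some(q) if c == q => quote = None,
                Some(_) => {}
                None if c == '"' || c == '\'' => quote = Some(c),
                None if c == '>' => {
                    tag_len = i + 1;
                    break;
                }
                None => {}
            }
        }
        out.push_str(&rest[..tag_len]);
        rest = &rest[tag_len..];
    }
    out.push_str(rest);
    out
}

fn main() {
    let xml = r#"<schedule xmlns:ns2="http://www.thalesgroup.com/rtti/PushPort/Schedules/v3"><ns2:DT/><ns2:cancelReason>918</ns2:cancelReason></schedule>"#;
    // The cleaned string could then be passed to the deserializer in place of `xml`.
    println!("{}", strip_element_prefixes(xml));
}
```

This costs an extra pass and an extra allocation per document, so it is a workaround rather than a replacement for real namespace support.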

@leftmostcat

I'm also experiencing this issue and, as with Gaelan, would be happy if namespaces could simply be ignored by the deserializer. xmlserde requires being significantly more explicit about structure, which in my case is impractical.

Changing https://github.com/tafia/quick-xml/blob/master/src/de/map.rs#L792 to use local_name() instead of name() solves the issue for me and all serde_de(_.*)? tests remain passing. Is that a reasonable solution, or is there a situation I'm missing?
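To make the trade-off concrete, here is a small standard-library sketch of the difference between matching on the qualified name and on the local name. It assumes, without having verified the linked line, that the check in question decides whether an element belongs to a named struct field or falls through to the `$value` field. Both `local_name` and `is_known_field` below are hypothetical stand-ins, not quick-xml APIs.

```rust
/// Returns the part of a qualified XML name after the prefix, if there is one.
fn local_name(qname: &str) -> &str {
    match qname.split_once(':') {
        Some((_prefix, local)) => local,
        None => qname,
    }
}

/// Hypothetical stand-in for the "is this element one of the struct's
/// named fields?" check. Elements that are not known fields would be
/// routed to the `$value` field instead.
fn is_known_field(fields: &[&str], element: &str, use_local_name: bool) -> bool {
    let key = if use_local_name { local_name(element) } else { element };
    fields.contains(&key)
}

fn main() {
    let fields = ["cancelReason", "$value"];

    // Matching on the qualified name: the element is not recognised as a
    // field, so it would be handed to the `$value` enum, which would explain
    // an "unknown variant `cancelReason`" error.
    assert!(!is_known_field(&fields, "ns2:cancelReason", false));

    // Matching on the local name: the element is recognised as the field.
    assert!(is_known_field(&fields, "ns2:cancelReason", true));

    // The cost: elements from different namespaces become indistinguishable.
    assert_eq!(local_name("ns2:id"), local_name("ns3:id"));
}
```

The last assertion is the situation to watch for: once prefixes are ignored, two elements that share a local name but live in different namespaces map to the same field or variant.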

3 participants