Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unknown fields are denied for tagged newtype variant with unit struct or unit type #2304

Open
tage64 opened this issue Oct 21, 2022 · 6 comments

Comments

@tage64
Copy link
Contributor

tage64 commented Oct 21, 2022

This issue is closely related to #2303 .

Let say we have the following unit struct and internally tagged enum:

#[derive(Deserialize, Debug)]
struct ErrorStruct;

#[derive(Deserialize, Debug)]
#[serde(tag = "result")]
enum Response {
    Success(()),
    Error(ErrorStruct),
}

Here, {"result": "Success"} can be deserialized into Response::Success(()) and {"result": "Error"} into Response::Error(ErrorStruct). But if we add more unknown fields it doesn't work. We cannot for example deserialize {"result": "Error", "msg": "Something really bad"} into Response::Error(ErrorStruct).

This should indeed be the behaviour if the deny_unknown_fields attribute is present, but I think that extra unknown fields should be allowed otherwise. Or does someone think differently?

I would be willing to make a PR if this feature is accepted.

@tage64
Copy link
Contributor Author

tage64 commented Oct 23, 2022

First I must correct myself. The example enum in my previous comment should of course not has the deny_unknown_fields attribute. That was the entire point. (EDIT: This is corrected now.)

But I have thought a bit more about this. The reason it feels natural to allow unknown fields for internally tagged newtype variants over unit or unit structs is because we are deserializing/parsing from a map structure. The default behaviour if we deserialize into a struct type from a map is indeed to allow unknown fields, now we're still deserializing from a map so I think it makes alot sense to accept unknown fields here as well. Unless the deny_unknown_fields attribute is set of course.

But that makes us think, in what other cases do we deserialize something from a map and want to allow/deny unknown fields? And it turns out that internally tagged unit variants is such a case. However here it seems that unknown fields are allowed even though deny_unknown_fields is set. I.E. let's say we have the following enum:

#[derive(Deserialize, Debug)]
#[serde(tag = "result")]
enum AllowUnknownFields {
    Success,
}

#[derive(Deserialize, Debug)]
#[serde(tag = "result", deny_unknown_fields)]
enum DenyUnknownFields {
    Success,
}

Here both AllowUnknownFields::Success and DenyUnknownFields::Success can be deserialized from {"result": "success", "msg": "abc"}. But the intuition tells at least me that only AllowUnknownFields should be deserialized successfully from this input. Because msg is indeed an unknown field.

I split this into two features:

  1. Allow unknown fields when deserializing internally tagged newtype variants over unit and unit struct unless `deny_unknown_fields is set.
  2. Deny unknown fields when deserializing internally tagged unit variants if deny_unknown_fields is set.

I think that 1. is very easy to fix. I have the code in a fork here (on the master branch) and I can make a PR from that.

Number 2 is however a bit more contraversal since it might break existing code. I've also implemented this (just for fun) and you can find the code on my fork on the unit_variant_allow_unknown_fields branch.

I think that both 1 and 2 together is most consistant, but I'm not sure whether the possible breakage that might be introduced by 2 is acceptable.

@kangalio
Copy link

First I must correct myself. The example enum in my previous comment should of course not has the deny_unknown_fields attribute. That was the entire point.

Can you edit the original post to fix that mistake? The code snippet is quite confusing to read, currently

@kangalio
Copy link

kangalio commented Oct 31, 2022

#[derive(Deserialize, Debug)]
struct ErrorStruct;

#[derive(Deserialize, Debug)]
#[serde(tag = "result")]
enum Response {
    Success(()),
    Error(ErrorStruct),
}

Here, {"result": "Success"} can be deserialized into Response::Success(()) and {"result": "Error"} into Response::Error(ErrorStruct). But if we add more unknown fields it doesn't work. We cannot for example deserialize {"result": "Error", "msg": "Something really bad"} into Response::Error(ErrorStruct).

This issue is unrelated to enums. The fundamental reason is that serde parses JSON objects {}) into C-style structs struct Foo {} only. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=85abdc5bedcfb8ff0e0b9c4cfbfb9080

If you change the code to

#[derive(Deserialize, Debug)]
struct ErrorStruct {}

#[derive(Deserialize, Debug)]
#[serde(tag = "result")]
enum Response {
    Success(()),
    Error(ErrorStruct),
}

deserializing from {"result": "Error", "msg": "Something really bad"} works fine.

@tage64
Copy link
Contributor Author

tage64 commented Nov 16, 2022

Can you edit the original post to fix that mistake? The code snippet is quite confusing to read, currently

The original post is fixed now.

@tage64
Copy link
Contributor Author

tage64 commented Nov 16, 2022

This issue is unrelated to enums. The fundamental reason is that serde parses JSON objects {}) into C-style structs struct Foo {} only. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=85abdc5bedcfb8ff0e0b9c4cfbfb9080

If you change the code to

#[derive(Deserialize, Debug)]
struct ErrorStruct {}

#[derive(Deserialize, Debug)]
#[serde(tag = "result")]
enum Response {
    Success(()),
    Error(ErrorStruct),
}

deserializing from {"result": "Error", "msg": "Something really bad"} works fine.

But then unknown fields should neither be allowed on unit variants.

#[derive(Deserialize, Debug)]
#[serde(tag = "result")]
enum Response {
    Success,
    Error(()),
}

Now {"result": "Success", "msg": "abc"} can be deserialized into Response::Success but {"result": "Error", "msg": "abc"} cannot be deserialized into Response::Error(()). That just feels inconsistent to me.

@Mingun
Copy link
Contributor

Mingun commented Aug 11, 2023

It seems that this is a duplicate of #2294 or at least the reasons the same

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants