Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop using #[serde(untagged)] for Message enum #206

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

ryo33
Copy link
Contributor

@ryo33 ryo33 commented Feb 14, 2024

This is a draft because this uses not merged change from #204, which parses only textual message. it's merged and now this is ready for reviews

Related to:

#[serde(untagged)] does not work correctly with serde_json and some numeric types in the schema, without activating the arbitrary_precision feature of serde_json and using serde_json::Number instead of all usage of f64 or else.
In this change, I replaced the derived Deserialize with an implementation of FromStr by using serde_json in it.

Also, with this change, the error message will be more detailed compared to just "did not match any variant of untagged enum". Currently, it reports an error only about T for Message::Event.

src/conn.rs Outdated
Ok(msg)
}
Err(err) => {
tracing::debug!(target: "chromiumoxide::conn::raw_ws::parse_errors", msg = text, "Failed to parse raw WS message");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tracing::debug!(target: "chromiumoxide::conn::raw_ws::parse_errors", msg = text, "Failed to parse raw WS message");
tracing::trace!(target: "chromiumoxide::conn::raw_ws::parse_errors", msg = text, "Failed to parse raw WS message");

This can potentially be a large steam of data, let's downgrade the log level here?

Copy link
Contributor

@MOZGIII MOZGIII left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@Sytten
Copy link
Contributor

Sytten commented Feb 14, 2024

Why not implement a custom serde deserializer instead? It would be more idiomatic than going through a FromStr.
Granted this is currently annoying to do without a clone because ContentRefDeserializer is private (serde-rs/serde#1947).

I would maybe throw the first error that is recorded. If we hit an "invalid" Response, it would be nicer to have a serde error of it instead of the a generic error because it tried to deserialize a Response into an Event (but that would mean an "invalid" event would get a generic error, so not super either). Something like:

let mut error = None;
match serde_json::from_str::<Response>(s) {
  Ok(v) => return Ok(v),
  Err(e) => { error = Some(e); }
}
match serde_json::from_str::<Event>(s) {
  Ok(v) => Ok(v),
  Err(e) => Err(error.unwrap_or(e))
}

@ryo33
Copy link
Contributor Author

ryo33 commented Feb 15, 2024

As the mention on the issue, I should provide a test data that emits error without this changes and ok with this change.
I prefer the FromStr one, because Message never be parsed as other format than JSON in real usage, and the implementation way that uses serde_json::Value supports only JSON while it looks like supporting many formats. But I'm not sure it's whether a conventional design choice in Rust or not. Also, I've not considered using ContentRefDeserializer or something works like it yet, and I may need to learn more about it, but I believe this change is sufficient as a permanent fix.

@MOZGIII
Copy link
Contributor

MOZGIII commented Feb 15, 2024

I see this state as the result of the following potential process:

  1. Implement the deserializer
  2. Se it clones, see issues with ContentRefDeserializer
  3. Implement FromStr as a workaround
  4. Now the deserializer is not used, drop it

So, it looks fine to me, although some additional TODOs and comments explaining this design might be useful.

The proper implementation, in my view, should use the deserializer, because then it is simply a lower level compared to the FromStr - but the FromStr could still be present and implemented atop the deserializer. But this is the ideal world so to speak.

@ryo33 ryo33 marked this pull request as ready for review February 15, 2024 01:55
@ryo33
Copy link
Contributor Author

ryo33 commented Feb 15, 2024

I've marked this as non-draft since the base change is merged now.

I've tried to reproduce #167 (comment) but failed, and I've not found any specific test case that ensures that this change fixes it yet.

I would maybe throw the first error that is recorded. If we hit an "invalid" Response, it would be nicer to have a serde error of it instead of the generic error because it tried to deserialize a Response into an Event (but that would mean an "invalid" event would get a generic error, so not super either).

I forgot to answer that.
IMO there are many options for reporting an error.

  1. Returns both with custom error type.
  2. Always return the Response one (proposed one)
  3. Always return the Event one (the current implementation in this pull request)
  4. Return a custom error says "did not match any variant of untagged enum" (like the derived deserializer on the main branch)
  5. Return one that reached to end of the input more than another, using line() and column() method on serde_json::Error.
  6. Use heuristics like returning if s.contains("\"method\"") { event_error } else { response_error }

I prefer the 3 of always returning the Event error, as I suppose it never fails to parse a JSON that is sent from Chromium and intended as a response, not an event, that means I can approximate "failure on parsing a Message equals to failure on parsing an Event". The reasoning is that the schema of Response is simple and clear, so it seems not to have any fragile part that source of unexpected compatibility issues.

The main branch is almost the same as the 4, so it should return a more detailed error after this was merged.

@ryo33
Copy link
Contributor Author

ryo33 commented Feb 15, 2024

I've implemented Deserialize without additional allocation using the RawValue feature in serde_json. I left the FromStr implementation because it's better in performance and error reporting than the Deserialize impl.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants