Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement human-readable serde for Witness #1068

Merged
merged 6 commits into from Jul 18, 2022
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
91 changes: 79 additions & 12 deletions src/blockdata/witness.rs
Expand Up @@ -3,12 +3,13 @@
//! This module contains the [`Witness`] struct and related methods to operate on it
//!

use secp256k1::ecdsa;

use crate::blockdata::transaction::EcdsaSighashType;
use crate::consensus::encode::{Error, MAX_VEC_SIZE};
use crate::consensus::{Decodable, Encodable, WriteExt};
use crate::io::{self, Read, Write};
use crate::prelude::*;
use secp256k1::ecdsa;
use crate::VarInt;

/// The Witness is the data used to unlock bitcoins since the [segwit upgrade](https://github.com/bitcoin/bips/blob/master/bip-0143.mediawiki)
Expand Down Expand Up @@ -122,7 +123,6 @@ impl Encodable for Witness {
}

impl Witness {

/// Create a new empty [`Witness`]
pub fn new() -> Self {
Witness::default()
Expand Down Expand Up @@ -279,25 +279,77 @@ impl serde::Serialize for Witness {
where
S: serde::Serializer,
{
use serde::ser::SerializeSeq;
use hashes::hex::ToHex;
use serde::ser::SerializeSeq;
tcharding marked this conversation as resolved.
Show resolved Hide resolved

let human_readable = serializer.is_human_readable();
let mut seq = serializer.serialize_seq(Some(self.witness_elements))?;

for elem in self.iter() {
seq.serialize_element(&elem)?;
if human_readable {
seq.serialize_element(&elem.to_hex())?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allocates, however if we use collect_str it will be slower for default implementations because we can't reserve(). So I suggest not to worry about it at least for now.

} else {
seq.serialize_element(&elem)?;
}
}

seq.end()
}
}

#[cfg(feature = "serde")]
impl<'de> serde::Deserialize<'de> for Witness {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: serde::Deserializer<'de>,
{
let vec: Vec<Vec<u8>> = serde::Deserialize::deserialize(deserializer)?;
Ok(Witness::from_vec(vec))
struct Visitor; // Human-readable visitor.
impl<'de> serde::de::Visitor<'de> for Visitor
{
type Value = Witness;

fn expecting(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
write!(f, "a sequence of hex arrays")
}

fn visit_seq<A: serde::de::SeqAccess<'de>>(self, mut a: A) -> Result<Self::Value, A::Error>
{
use hashes::hex::FromHex;
use hashes::hex::Error::*;
use serde::de::{self, Unexpected};

let mut ret = match a.size_hint() {
Some(len) => Vec::with_capacity(len),
None => Vec::new(),
};

while let Some(elem) = a.next_element::<String>()? {
let vec = Vec::<u8>::from_hex(&elem).map_err(|e| {
match e {
InvalidChar(b) => {
match core::char::from_u32(b.into()) {
Some(c) => de::Error::invalid_value(Unexpected::Char(c), &"a valid hex character"),
None => de::Error::invalid_value(Unexpected::Unsigned(b.into()), &"a valid hex character")
}
}
OddLengthString(len) => de::Error::invalid_length(len, &"an even length string"),
InvalidLength(expected, got) => {
let exp = format!("expected length: {}", expected);
de::Error::invalid_length(got, &exp.as_str())
}
}
})?;
ret.push(vec);
}
Ok(Witness::from_vec(ret))
}
}

if deserializer.is_human_readable() {
deserializer.deserialize_seq(Visitor)
} else {
let vec: Vec<Vec<u8>> = serde::Deserialize::deserialize(deserializer)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could avoid allocations if we used custom visitor. That's why I originally suggested to call the one above HRVisitor so that the other one can be called BinaryVisitor (suggest better name please).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Righto, I'll have a play with it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must not be fully understanding where all the allocations are occurring. I came up with the code below but it seems to me that this code is going to contain the exact same number of allocations as line 343 above (one for each element, and the one outer vector).

            fn visit_seq<A: serde::de::SeqAccess<'de>>(self, mut a: A) -> Result<Self::Value, A::Error>
            {
                let mut ret = Vec::new();
                while let Some(elem) = a.next_element::<Vec<u8>>()? {
                    ret.push(elem);
                }
                Ok(Witness::from_vec(ret))
            }

What am I missing?

Copy link
Collaborator

@Kixunil Kixunil Jun 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant to build Witness as non-nested Vec directly. You need to use next_element_seed and a bit more machinery to support it. This example looks very similar to what we have here, you can pretty much copy it minus some generics (we have u8). Note that that example forgot to call reserve() in front of the inner for loop. Should have this:

if let Some(size_hint) = seq.size_hint() {
    self.0.reserve(size_hint);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a good play with this, was educational. Its separate from this PR because its an optimisation, right? I think the correct way to go about it is merge this naive version (fixes the ugly serialization) and then create an separate issue/PR to do it as in the example you link. Would that be fair?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't care if it's merged in this one or a separate one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I'll do it as a separate PR. Cheers.

Ok(Witness::from_vec(vec))
}
}
}

Expand Down Expand Up @@ -422,20 +474,35 @@ mod test {

#[cfg(feature = "serde")]
#[test]
fn test_serde() {
use serde_json;
fn test_serde_bincode() {
use bincode;

let old_witness_format = vec![vec![0u8], vec![2]];
let new_witness_format = Witness::from_vec(old_witness_format.clone());

let old = serde_json::to_string(&old_witness_format).unwrap();
let new = serde_json::to_string(&new_witness_format).unwrap();
let old = bincode::serialize(&old_witness_format).unwrap();
let new = bincode::serialize(&new_witness_format).unwrap();

assert_eq!(old, new);

let back = serde_json::from_str(&new).unwrap();
let back: Witness = bincode::deserialize(&new).unwrap();
assert_eq!(new_witness_format, back);
}

#[cfg(feature = "serde")]
#[test]
fn test_serde_human() {
use serde_json;

let witness = Witness::from_vec(vec![vec![0u8, 123, 75], vec![2u8, 6, 3, 7, 8]]);

let json = serde_json::to_string(&witness).unwrap();

assert_eq!(json, r#"["007b4b","0206030708"]"#);

let back: Witness = serde_json::from_str(&json).unwrap();
assert_eq!(witness, back);
}
}


Expand Down