Static Metadata Validation #478

Merged
merged 173 commits into from Apr 28, 2022

Conversation

lexnv
Contributor

@lexnv lexnv commented Mar 16, 2022

Table of Contents

Static Metadata Validation

There are cases when a customer generates the Runtime API from the static metadata of one node and then attempts to interact with another node.

This API proposal adds two validation points to capture any differences in the underlying metadata.

Call / Constant / Storage Validation

Nodes can register pallets in arbitrary order. As a result, two nodes can offer similar functionality while representing their pallets differently inside the metadata.

Each pallet is composed of calls, constants, and storage.

Each call, constant, and storage entry is inspected recursively to obtain a unique, deterministic representation. The representation is obtained via the subxt-metadata crate.

The resulting hash is embedded statically inside the generated pallet.

The static information is checked against the runtime information obtained from the Client object.
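
A minimal sketch of why hashing a pallet's content, rather than its position, tolerates different registration orders. The `Pallet` struct, `pallet_hash`, and the FNV-1a stand-in for SHA-256 are assumptions made for illustration and are not the subxt-metadata implementation:

```rust
/// Stand-in for SHA-256 so the sketch needs no external crates (FNV-1a, 64-bit).
fn fnv1a(state: u64, bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(state, |h, b| (h ^ u64::from(*b)).wrapping_mul(0x0000_0100_0000_01b3))
}

const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;

/// Hypothetical, heavily simplified pallet description.
pub struct Pallet {
    /// Position assigned by the node; deliberately NOT part of the hash.
    pub index: u8,
    pub name: &'static str,
    pub calls: Vec<&'static str>,
}

/// Hash only what defines the pallet's behaviour: its name and call signatures.
pub fn pallet_hash(pallet: &Pallet) -> u64 {
    let mut h = fnv1a(FNV_OFFSET, pallet.name.as_bytes());
    for call in &pallet.calls {
        h = fnv1a(h, &[0xff]); // separator so ["ab", "c"] differs from ["a", "bc"]
        h = fnv1a(h, call.as_bytes());
    }
    h
}
```

Two nodes that register the same `Balances` pallet at different indices would produce the same hash here, while a changed call signature would produce a different one.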

API Changes

By default, the API will validate that a given call / constant / storage is compatible between the static and dynamic metadata.

    let hash = api
        .tx()
        .balances()
        .transfer(dest, 123_456_789_012_345)?
        .sign_and_submit_default(&signer)
        .await?;
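
The check behind a call like `transfer` can be pictured roughly as follows. This is a sketch under assumptions: `RuntimeHashes`, `ValidationError`, `insert_call`, and `validate_call` are hypothetical names, and the real generated code and error types may look different:

```rust
use std::collections::HashMap;

/// Hypothetical error type for this sketch (not subxt's real `MetadataError`).
#[derive(Debug, PartialEq)]
pub enum ValidationError {
    CallNotFound,
    IncompatibleMetadata,
}

/// Hashes computed from the metadata downloaded from the node at runtime.
#[derive(Default)]
pub struct RuntimeHashes {
    calls: HashMap<String, HashMap<String, [u8; 32]>>,
}

impl RuntimeHashes {
    pub fn insert_call(&mut self, pallet: &str, call: &str, hash: [u8; 32]) {
        self.calls
            .entry(pallet.to_string())
            .or_default()
            .insert(call.to_string(), hash);
    }

    /// Compare the hash baked into the generated code with the node's hash.
    pub fn validate_call(
        &self,
        pallet: &str,
        call: &str,
        static_hash: [u8; 32],
    ) -> Result<(), ValidationError> {
        let runtime_hash = self
            .calls
            .get(pallet)
            .and_then(|calls| calls.get(call))
            .ok_or(ValidationError::CallNotFound)?;
        if *runtime_hash == static_hash {
            Ok(())
        } else {
            Err(ValidationError::IncompatibleMetadata)
        }
    }
}
```

Returning a `Result` from the check is what makes the `?` after `.transfer(...)` necessary in the example above.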

Full Metadata Validation

The metadata validation is implemented similarly to pallet validation.

Unlike the pallet validation, the full metadata check is not performed by default. This is because even small changes result in different metadata, and because leaving the check optional keeps the API simpler to interact with.

The customer can call the validation method for the full metadata check at any time:

    // Validate full metadata compatibility.
    api.validate_metadata()?;

The MetadataError::IncompatiblePalletMetadata error is returned when the static metadata is incompatible with the metadata of the node.

Note that the hashes are compared only for the subset of pallets that are present in the statically generated API.
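
One way to picture the subset rule, as a sketch only: `metadata_hash` below is a hypothetical helper that combines per-pallet hashes in a fixed (sorted) order and ignores pallets the static API does not know about. FNV-1a stands in for SHA-256, and treating a missing pallet as `None` is an assumption:

```rust
use std::collections::BTreeMap;

/// Stand-in for SHA-256 so the sketch needs no external crates (FNV-1a, 64-bit).
fn fnv1a(state: u64, bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(state, |h, b| (h ^ u64::from(*b)).wrapping_mul(0x0000_0100_0000_01b3))
}

const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;

/// Combine per-pallet hashes into one metadata hash, considering only the
/// pallets named in `static_pallets` (those present in the generated API).
pub fn metadata_hash(runtime: &BTreeMap<&str, u64>, static_pallets: &[&str]) -> Option<u64> {
    let mut names: Vec<&str> = static_pallets.to_vec();
    names.sort_unstable(); // fixed order, so registration order cannot matter
    names.dedup();

    let mut h = FNV_OFFSET;
    for name in names {
        // Assumption: a pallet missing on the node means there is no valid hash.
        let pallet_hash = runtime.get(name)?;
        h = fnv1a(h, name.as_bytes());
        h = fnv1a(h, &pallet_hash.to_le_bytes());
    }
    Some(h)
}
```

A node that carries extra pallets still matches as long as the pallets used by the generated API hash identically.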

Static Metadata Crate

The subxt-metadata crate exposes the core implementation of static metadata validation.
The implementation recursively follows the metadata types, flattening them into a deterministic representation.
The resulting representation of the metadata is then hashed with SHA-256.
A future implementation could lift the hash-only restriction and provide human-readable String representations.
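
To illustrate the flattening step, here is a minimal sketch over a toy type registry. `TypeDef` and `flatten` are hypothetical and far simpler than the real metadata types; the sketch writes a string purely for readability and assumes the registry contains no cycles:

```rust
use std::collections::HashMap;

/// Hypothetical, simplified type definitions; real metadata has many more variants.
pub enum TypeDef {
    Primitive(&'static str),
    /// Element type id.
    Sequence(u32),
    /// (field name, field type id) pairs.
    Composite(Vec<(&'static str, u32)>),
}

/// Follow type ids recursively and write a flat description that contains no ids.
/// Panics if an id is missing from the registry.
pub fn flatten(registry: &HashMap<u32, TypeDef>, id: u32, out: &mut String) {
    match &registry[&id] {
        TypeDef::Primitive(name) => out.push_str(name),
        TypeDef::Sequence(elem) => {
            out.push_str("Vec<");
            flatten(registry, *elem, out);
            out.push('>');
        }
        TypeDef::Composite(fields) => {
            out.push('{');
            for (name, ty) in fields {
                out.push_str(name);
                out.push(':');
                flatten(registry, *ty, out);
                out.push(';');
            }
            out.push('}');
        }
    }
}
```

Because the output never mentions a type id, two registries that assign different ids to the same types flatten to the same text, and therefore to the same hash.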

API

The following are exposed for customer usage:

  • get_metadata_hash()
  • get_metadata_per_pallet_hash()
  • get_pallet_hash()
  • get_call_hash()
  • get_constant_hash()
  • get_storage_hash()

The cache implementation is left entirely in the customer's hands.
For an example, see subxt/metadata/hash_cache.rs.
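
A caller-side cache could look roughly like the sketch below. `HashCache` and `get_or_insert` are hypothetical names chosen for illustration; this is not the contents of hash_cache.rs:

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Caller-side cache keyed by (pallet, item) name.
#[derive(Default)]
pub struct HashCache {
    inner: RwLock<HashMap<(String, String), [u8; 32]>>,
}

impl HashCache {
    /// Return the cached hash, or compute it with `f`, store it, and return it.
    pub fn get_or_insert<F>(&self, pallet: &str, item: &str, f: F) -> [u8; 32]
    where
        F: FnOnce() -> [u8; 32],
    {
        let key = (pallet.to_string(), item.to_string());

        // Fast path: shared read lock, released at the end of this statement.
        let cached = self.inner.read().unwrap().get(&key).copied();
        if let Some(hash) = cached {
            return hash;
        }

        // Slow path: compute outside any lock, then store. If another thread
        // stored a value in the meantime, keep and return that one.
        let hash = f();
        let mut map = self.inner.write().unwrap();
        let stored = *map.entry(key).or_insert(hash);
        stored
    }
}
```

The closure runs only on a cache miss, so repeated validation of the same call does not re-hash the metadata.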

Considerations

At the moment, the crate does not provide inner caching but does provide the ability to skip pallet hashes when the metadata hash is constructed first.

However, the crate does not cache the inner metadata types obtained from the internal get_type_hash.
This is intentional for the time being, because it keeps the resolution of recursive types deterministic.

struct A { b: struct B }
struct B { a: struct A }

When registering the previous types to the metadata, there are two possibilities, depending on the pallet registration order: metadata_types: { { "id": 0, A } , { "id": 1, B } } or metadata_types: { { "id": 0, B } , { "id": 1, A } }.

If intermediate caching is provided, resolving ID 0 from the first example would result in 0 = (A B A), 1 = (B A), while resolving ID 0 from the second example would result in 0 = (B A B), 1 = (A B).

The internal caching of types should be considered carefully, as part of a separate issue.
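
The uncached behaviour can be sketched as follows. `Ty`, `walk`, and `resolve` are hypothetical; the point is only that each top-level resolution starts with an empty visited set, so type A expands to the same sequence regardless of which id it was registered under:

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical registry entry: a type name plus the ids of its field types.
pub struct Ty {
    pub name: &'static str,
    pub fields: Vec<u32>,
}

fn walk(
    reg: &HashMap<u32, Ty>,
    id: u32,
    visited: &mut HashSet<u32>,
    out: &mut Vec<&'static str>,
) {
    let ty = &reg[&id];
    out.push(ty.name);
    // Stop when a type repeats on this walk; otherwise the recursion never ends.
    if !visited.insert(id) {
        return;
    }
    for field in &ty.fields {
        walk(reg, *field, visited, out);
    }
}

/// Resolve one type from scratch: the visited set is never shared between calls.
pub fn resolve(reg: &HashMap<u32, Ty>, id: u32) -> String {
    let mut out = Vec::new();
    walk(reg, id, &mut HashSet::new(), &mut out);
    out.join(" ")
}
```

With both registration orders from above, resolving A gives "A B A" and resolving B gives "B A B". A cache shared across resolutions would shorten whichever type is resolved second, which is the order dependence described above.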

Subxt Metadata CLI

The subxt CLI is extended with a compatibility subcommand for validating metadata compatibility between multiple nodes.
The CLI uses the implementation described in the previous section to group compatible nodes, either by full metadata validation or by pallet validation.

Usage

USAGE:
    subxt-cli compatibility [OPTIONS]

OPTIONS:
        --nodes <nodes>...    Urls of the substrate nodes to verify for metadata compatibility
        --pallet <pallet>     Check the compatibility of metadata for a particular pallet

Full Metadata Validation

$> subxt-cli compatibility --nodes "http://localhost:9933","http://localhost:64550","http://localhost:64695"

Node "http://localhost:9933/" has metadata hash "6bdc100c38c2ac86aac57b2d27d1c07f4600be94632ca97e73d90cf39c74a52d"
Node "http://localhost:64550/" has metadata hash "af4236a2ef434e8eb9e83c41112dc7cd98be189d39efa9527658751377014218"
Node "http://localhost:64695/" has metadata hash "6bdc100c38c2ac86aac57b2d27d1c07f4600be94632ca97e73d90cf39c74a52d"

Compatible nodes
{
  "af4236a2ef434e8eb9e83c41112dc7cd98be189d39efa9527658751377014218": [
    "http://localhost:64550/"
  ],
  "6bdc100c38c2ac86aac57b2d27d1c07f4600be94632ca97e73d90cf39c74a52d": [
    "http://localhost:9933/",
    "http://localhost:64695/"
  ]
}
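
The grouping shown in this output amounts to collecting node URLs under the hash each node reported. A minimal sketch, assuming a hypothetical `group_by_hash` helper (the CLI's actual code may differ, and a `BTreeMap` prints keys in sorted order rather than the order shown above):

```rust
use std::collections::BTreeMap;

/// Group node URLs by the metadata hash each node reported.
/// Input items are `(url, hash)` pairs.
pub fn group_by_hash(nodes: &[(&str, &str)]) -> BTreeMap<String, Vec<String>> {
    let mut groups: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (url, hash) in nodes {
        groups
            .entry(hash.to_string())
            .or_default()
            .push(url.to_string());
    }
    groups
}
```

Nodes that end up in the same group are compatible with each other for the purposes of this check.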

Pallet Metadata Validation

$> subxt-cli compatibility --nodes "http://localhost:9933","http://localhost:64550","http://localhost:64695" --pallet Balances

Node "http://localhost:9933/" has pallet metadata hash "9f4a5ce4e35d25ad0934cefdc63c2260d88b910bdf408f6033a144ebadcd6124"
Node "http://localhost:64550/" has pallet metadata hash "00d7cf32d52a4a3db221a6b82edfbaac474a1e7139dbd362a1eabe6e60d61f15"
Node "http://localhost:64695/" has pallet metadata hash "9f4a5ce4e35d25ad0934cefdc63c2260d88b910bdf408f6033a144ebadcd6124"

Compatible nodes by pallet
{
  "palletPresent": {
    "00d7cf32d52a4a3db221a6b82edfbaac474a1e7139dbd362a1eabe6e60d61f15": [
      "http://localhost:64550/"
    ],
    "9f4a5ce4e35d25ad0934cefdc63c2260d88b910bdf408f6033a144ebadcd6124": [
      "http://localhost:9933/",
      "http://localhost:64695/"
    ]
  },
  "palletNotFound": []
}

Next Steps

  • Add ability to skip pallet validation (add skip_pallet_validation() methods)
  • Avoid recursion in MetadataHashable
  • Better caching for determined types
  • Add full metadata validation
  • Add tests
  • Move MetadataHashable to dedicated crate
  • Create CLI for comparing the metadata of nodes
  • API changes to avoid results when calling skip_validation functionality
  • Client persistent cache
  • (@jsdw) Investigate extension for per call inside pallet
  • (@jsdw) Remove extra Arc on client's metadata
  • (@jsdw) Add caching in subxt metadata

Future Work

  • Extend validation for events if required by customer use cases
  • Inner caching for subxt-metadata

Closes #398

lexnv added 16 commits March 14, 2022 17:27
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
@lexnv lexnv requested review from ascjones, jsdw and dvdplm and removed request for ascjones March 16, 2022 17:01
Collaborator

@jsdw jsdw left a comment

A good start! I've only skimmed through so far, and I'll have to take a more thorough look tomorrow, but I left some comments on the way!

I think ultimately I'd lean towards not having a MetadataHashable struct at all, and instead just having a set of functions which take a reference to metadata (or just the type registry if they don't need anything from Metadata) and return, at the end, a hash.

Most of the functions probably only need access to the type registry and some starting type ID I'd imagine, and from there they can recurse through the types in the registry and spit out a hash at the end.

Caching recursive IDs may give some weight to having a struct at some point (although I wouldn't expect this struct to live longer than it takes to hash the pallet/metadata).

We may then choose to cache hashes (or not) as a separate concern (for example by having said cache inside a mutex in Metadata or something), to avoid re-hashing things each time we call them.

I also wonder whether std::hash::{ Hash, Hasher} might be useful at all (https://doc.rust-lang.org/std/hash/index.html) :)
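
For reference, a rough sketch of what using std::hash could look like. `CallShape` and `digest` are hypothetical names. One caveat: the algorithm behind `DefaultHasher` is not specified and its output should not be relied upon across Rust releases, so this would suit in-process comparisons rather than hashes baked into generated code:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical flattened description of a call.
#[derive(Hash)]
pub struct CallShape {
    pub pallet: &'static str,
    pub name: &'static str,
    pub arg_types: Vec<&'static str>,
}

/// Feed any `Hash` value into a hasher and return the 64-bit digest.
pub fn digest<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}
```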

@jsdw
Collaborator

jsdw commented Apr 13, 2022

I had a pass over this, and think we're good to go to try and get this merged into Subxt.

The main features:

  • When you call tx/storage/constants, we'll do a validation check of that specific call to check that the types line up with the node being queried.
  • No more recursive Call handling stuff; if we try calling eg "sudo_sudo" on two nodes with different pallet orders, the validation check will fail on that call; our statically generated Call enum is a different shape, so it won't encode in a compatible way and must fail.
  • Metadata crate to expose the validation functions independently (I'd continue to consider this API unstable), and an extra CLI command which runs metadata or pallet checks.
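
To see why a different pallet order makes an encoded call incompatible, here is a simplified sketch. It assumes a call encodes as a pallet index byte, then a call index byte, then the arguments, and that a pallet's index is its position in the registration order; `encode_call` and `pallet_index` are hypothetical helpers, not subxt functions:

```rust
/// Encode a call as `[pallet_index, call_index, args...]`.
pub fn encode_call(pallet_index: u8, call_index: u8, args: &[u8]) -> Vec<u8> {
    let mut out = vec![pallet_index, call_index];
    out.extend_from_slice(args);
    out
}

/// Simplification: a pallet's index is its position in the registration order.
pub fn pallet_index(order: &[&str], pallet: &str) -> Option<u8> {
    (0u8..=u8::MAX)
        .zip(order)
        .find(|(_, p)| **p == pallet)
        .map(|(i, _)| i)
}
```

The same `Balances` call encoded for a node that registers `["System", "Balances", "Sudo"]` starts with byte 1, while for `["System", "Sudo", "Balances"]` it starts with byte 2. A call nested inside another call (as with sudo) carries those bytes too, which is why the outer validation has to fail.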

Member

@ascjones ascjones left a comment

Overall looks good, missing some top-level integration tests for the validation failed cases.

This is a good improvement for safety and usability. However, reading over the hashing code I wonder whether we could make it even better for calls/events/storages by comparing some kind of "intermediate representation" (IR) instead of hashes. This would allow providing information about what has actually changed in those entities instead of just the fact that something has changed. The code for translating the metadata to the IR could then be shared with the codegen, avoiding duplicate code for traversing the metadata. This is probably a bit more involved so I am happy to accept the hashing solution; it was just an idea that came to mind as I was looking at this.
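
A very rough sketch of what comparing an IR could report, purely to make the idea concrete. `CallIr`, `Change`, and `diff` are hypothetical and model a call as nothing more than a map from argument name to type name:

```rust
use std::collections::BTreeMap;

/// Hypothetical IR: argument name -> type name for a single call.
pub type CallIr = BTreeMap<&'static str, &'static str>;

#[derive(Debug, PartialEq)]
pub enum Change {
    Added(&'static str),
    Removed(&'static str),
    TypeChanged {
        arg: &'static str,
        from: &'static str,
        to: &'static str,
    },
}

/// Report what differs between two versions of a call, not just that they differ.
pub fn diff(old: &CallIr, new: &CallIr) -> Vec<Change> {
    let mut changes = Vec::new();
    for (&arg, &old_ty) in old {
        match new.get(arg) {
            None => changes.push(Change::Removed(arg)),
            Some(&new_ty) if new_ty != old_ty => changes.push(Change::TypeChanged {
                arg,
                from: old_ty,
                to: new_ty,
            }),
            Some(_) => {}
        }
    }
    for &arg in new.keys() {
        if !old.contains_key(arg) {
            changes.push(Change::Added(arg));
        }
    }
    changes
}
```

A hash comparison would only say that the two calls differ; the list of `Change` values says where.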

.get(name)
.ok_or_else(|| MetadataError::PalletNotFound(name.to_string()))
Member

Isn't it useful to know in the error which pallet was not found? Or am I missing something?

Collaborator

We know which pallet you're looking for when you call the function, so if an error happens you already have that information (and there was this weird thing where some errors took String and some took &'static str and actually just removing them, since the info is there, felt cleaner than adding Cows or something :)

) -> Result<&PalletConstantMetadata<PortableForm>, MetadataError> {
self.constants
.get(key)
.ok_or(MetadataError::ConstantNotFound(key))
Member

Same here, isn't it useful to know in the error which constant failed? Or are we expecting the caller to figure this out anyway?

Collaborator

Same as above I think; we know what constant we're asking for when we call this function, so returning it in the error isn't necessary. (I have no objection to adding it back if we are consistent about String vs &'static str in our errors, but it felt easier to remove them and ignore that :))

Member

@niklasad1 niklasad1 left a comment

Looks clean to me, I would prefer parking_lot::RwLock over StdRwLock though.

lexnv added 12 commits April 26, 2022 13:33
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Member

@ascjones ascjones left a comment

LGTM!

Comment on lines 115 to 116
let metadata = rpc.metadata().await;
metadata?
Member

Suggested change
let metadata = rpc.metadata().await;
metadata?
rpc.metadata().await?

/// Set the metadata.
///
/// *Note:* Metadata will no longer be downloaded from the runtime node.
pub fn set_metadata(mut self, metadata: Metadata) -> Self {
Member

Suggested change
pub fn set_metadata(mut self, metadata: Metadata) -> Self {
#[cfg(test)]
pub fn set_metadata(mut self, metadata: Metadata) -> Self {

Contributor Author

Unfortunately, cfg(test) is not active for integration tests. I would place this under integration_tests feature and follow up with #515 😄 .

@jsdw
Collaborator

jsdw commented Apr 28, 2022

However reading over the hashing code I wonder whether we could make it even better for calls/events/storages by comparing some kind of "intermediate representation" (IR) instead of hashes.

Thanks for your feedback @ascjones! Just to weigh in quickly; I think this IR approach is potentially very useful!

A possible downside of it for this validation stuff is that it is likely much more allocation heavy to produce an IR; the nice thing about the hashing approach is that we can have almost no allocations.

That said, we may end up wanting an IR for producing nice human readable diffs between metadata crates (or at least, I figure we'd either flatten the metadata types out into some enum that contained all of the relevant info itself and then compare those with each other, or skip this and directly compare metadatas, depending on what works out to be simpler).

We'll always have room to re-visit the underlying validation approach we use here though, if we find a better way to do it, so I think of this as a starting point more than anything :)

@lexnv
Contributor Author

lexnv commented Apr 28, 2022

Thanks a lot for your feedback @jsdw @ascjones and your help in getting the PR merged! 🎉

I do like the idea of intermediate representations for flattening out the metadata.
That would come in handy when extending the CLI with #522. Although it would be allocation-heavy, it would benefit greatly from human-readable diffs.

@lexnv lexnv merged commit 1fd1eee into master Apr 28, 2022
@lexnv lexnv deleted the 398_static_md_check branch April 28, 2022 09:37