Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] User-defined validation for the format keyword #158

Closed
Stranger6667 opened this issue Dec 16, 2020 · 16 comments · Fixed by #244
Closed

[FEATURE] User-defined validation for the format keyword #158

Stranger6667 opened this issue Dec 16, 2020 · 16 comments · Fixed by #244

Comments

@Stranger6667
Copy link
Owner

Stranger6667 commented Dec 16, 2020

At the moment, jsonschema supports the format keyword for formats defined in Drafts 4, 6, 7. However, JSON Schema allows custom values for the format keyword.

It will be awesome to provide a way for the end-user to validate their own formats.

As all validators are "compiled" to some structs (to avoid some run-time overhead e.g., map lookup, etc.), I think that the public API might look like this:

use jsonschema;
use std::collections::HashMap;
use serde_json::{json, Value};

struct CustomFormatValidator {}

// New trait somehow derived from `Validate`?
impl jsonschema::SomeNewTrait for CustomFormatValidator {
    fn is_valid(&self, _: &jsonschema::JSONSchema, instance: &Value) -> bool {
        if let Value::String(item) = instance {
            item == "bar"  // Format matching logic goes here
        } else {
            true
        }
    }
}

fn main() -> Result<(), jsonschema::CompilationError> {
    let schema = json!({"format": "custom"});
    let instance = json!("foo");
    let mut formats = HashMap::new();
    formats.insert(
        "custom": CustomFormatValidator
    );
    // Shortcuts
    jsonschema::is_valid(&schema, &instance, Some(formats))
    // Compiled schema
    let compiled = jsonschema::JSONSchema::options()
        .formats(formats)
        .compile(&schema)
        .expect("Invalid schema");
    compiled.is_valid(&instance)
}

It will be nice to avoid the boilerplate that extracts the actual string from Value::String, so the user won't write this code for each format.

Anticipated changes to implement this:

  • Take extra argument for the formats map in jsonschema::is_valid;
  • Extend CompilationOptions builder to handle formats;
  • Store the formats map as a part of CompilationContext;
  • Inside format.rs take a look at this mapping before going to take a look at the built-in validators;
  • Make the Validate trait public and implement some trait on top of it to avoid boilerplate?

For now, I think it is OK to keep the scope on the Rust code only. A similar feature for bindings could be implemented later.

@Stranger6667 Stranger6667 changed the title Custom format validators [FEATURE] User-defined validation for the format keyword Apr 28, 2021
@JCBurnside
Copy link

I will attempt to work on this on my own fork. Any comments on how you want the api to look? or is the proposed api acceptable?

@Stranger6667
Copy link
Owner Author

Stranger6667 commented Apr 29, 2021

Hi @JCBurnside !

Thank you for taking a step! :) I think that the most important measure is ergonomics and I'll be happy to discuss any other API design options :) Probably the draft above could be a good start for experimenting with the API design.

The end goal is probably to let the user specify only the validation function (maybe wrapped in some proc macro) and put it somehow to the compiled schema. So the user won't need to create a struct, implement trait, etc. Something like this maybe?

#[jsonschema::format]
fn custom(value: &str) -> bool {
    value == "bar"
}

fn main() {
    ...
    let compiled = jsonschema::JSONSchema::options()
        .format("custom", custom)
        .compile(&schema)
        .expect("Invalid schema");
}

@JCBurnside
Copy link

Hi,

Thanks for the quick response. That does seem like an elegant solution but shouldn't it take a value: &serde_json::Value not just a &str so they can also validate on types of value?

@Stranger6667
Copy link
Owner Author

Stranger6667 commented Apr 29, 2021

The thing is that the format validator applies only to strings, and other types are not affected by this validator at all. For example, the current boilerplate we have in formats.rs goes this way:

format_validator!(DateTimeValidator, "date-time");
impl Validate for DateTimeValidator {
    validate!("date-time");
    fn is_valid(&self, _: &JSONSchema, instance: &Value) -> bool {
        if let Value::String(item) = instance {
            DateTime::parse_from_rfc3339(item).is_ok()
        } else {
            true
        }
    }
}

All format validators return true there on all non Value::String cases.

So, as the type checking is generally redundant for this case, I think it should not be exposed to the end-user.

@JCBurnside
Copy link

I see. I will consider it.

@JCBurnside
Copy link

Problem with proc macros is to export them it must be the only type of thing you export at the moment. I could create a secondary crate that adds the feature?

@Stranger6667
Copy link
Owner Author

Yep, I think it could be jsonschema_derive and follow the same project structure as in, e.g. askama - a local dependency.

@JCBurnside
Copy link

Though do we even need the attribute? .format("name",function) could just accept name:&str, validation:Fn(&str)->bool

@Stranger6667
Copy link
Owner Author

Stranger6667 commented Apr 29, 2021

The main reason to have it is to avoid overhead (minor, though) at runtime. The underlying compilation process relies on structs that implement the Validate trait - they should be known at compile-time. However, it is possible to have such a built-in "container" struct for all user-defined format validators. But it will require a map lookup in runtime to determine which format function should be called. The general intuition I follow here is - if something could be done during the schema compilation step to avoid runtime cost, then take this direction unless the implementation is overcomplicated and hard to maintain.

Btw, with such an attribute, it will be possible to avoid tons of boilerplate in formats.rs.

@JCBurnside
Copy link

Problem i see with using the proc macro is instead of passing .validate the function then we have to pass it the generated struct. It should be possible I will have to digging. Perhaps the function ident could be used as the generated struct/const ident then the function could be embeded into the is_valid definition.

@Stranger6667
Copy link
Owner Author

Yep, it sounds like it could be solved this way, indeed.

@Stranger6667
Copy link
Owner Author

Stranger6667 commented Apr 29, 2021

For people who are coming here from "This Week In Rust 388" (as I submitted only this exact issue):

If you feel like you'd like to work on some issue - feel free to take a look at:

Or any other in the issue tracker, I'd be happy to provide support :)

@Stranger6667
Copy link
Owner Author

An extra thought about storing things in this snippet. I think that the macro can additionally generate a compilation function (similar to this one, but maybe simpler). Then it should be alright to store some format name -> compile function mapping

@Stranger6667
Copy link
Owner Author

Hi @JCBurnside ! Let me know if I can help here :) I saw your fork, it looks good to me :)

@JCBurnside
Copy link

I am not quite sure how to get what you suggested to work. I am struggling to understand how the system works as it is.

@Stranger6667
Copy link
Owner Author

@JCBurnside
Sorry for not getting back to you. Indeed, the option I proposed initially won't work (I tried to play with your commit and failed) :(
I implemented a simpler approach in #244, similar to what we have for contentEncoding.

I want to thank you for your efforts and sorry again for giving misleading advice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants