Python: Support for `validate_all` function #606

HaloWorld · 2022-06-21T11:34:42Z

This change implements the validate_all function through the ast module in the python standard library, which allows PGV to return all validate error messages at once.

//entities.proto
message Person {
  uint64 id    = 1 [(validate.rules).uint64.gt    = 999];

  string email = 2 [(validate.rules).string.email = true];

  string name  = 3 [(validate.rules).string = {
                      pattern:   "^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$",
                      max_bytes: 256,
                   }];

  Location home = 4 [(validate.rules).message.required = true];

  message Location {
    double lat = 1 [(validate.rules).double = { gte: -90,  lte: 90 }];
    double lng = 2 [(validate.rules).double = { gte: -180, lte: 180 }];
  }
}

from entities_pb2 import Person
from protoc_gen_validate.validator import validate, ValidationFailed, print_validate, validate_all


p = Person(id=1000, name="Foo", email=".bar@bad.co", home=Person.Location(lat=0, lng=-999))
try:
    validate(p)
except ValidationFailed as err:
    print("validate(p):", err)
print("=" * 79)
try:
    validate_all(p)
except ValidationFailed as err:
    print("validate_all(p):", err)

print("=" * 79)
print(print_validate())

Result:

validate(p): p.email is not a valid email
===============================================================================
validate_all(p): 
p.email is not a valid email
p.name pattern does not match ^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$
p.lng is not in range (180.0, -180.0)
===============================================================================
# Validates Person
def generate_validate(p):
    if p.id <= 999:
        raise ValidationFailed("p.id is not greater than 999")
    if not _validateEmail(p.email):
        raise ValidationFailed("p.email is not a valid email")
    if byte_len(p.name) > 256:
        raise ValidationFailed("p.name length is greater than 256")
    if re.search(r'^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$', p.name) is None:
        raise ValidationFailed("p.name pattern does not match ^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$")
    if not _has_field(p, "home"):
        raise ValidationFailed("home is required.")
    if _has_field(p, "home"):
        embedded = validate(p.home)
        if embedded is not None:
            return embedded
    return None
# Validates Person All
def generate_validate_all(p):
    err = ''
    if p.id <= 999:
        err += '\np.id is not greater than 999'
    if not _validateEmail(p.email):
        err += '\np.email is not a valid email'
    if byte_len(p.name) > 256:
        err += '\np.name length is greater than 256'
    if re.search('^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$', p.name) is None:
        err += '\np.name pattern does not match ^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$'
    if not _has_field(p, 'home'):
        err += '\nhome is required.'
    if _has_field(p, 'home'):
        err += _validate_all(p.home)
    return err
# Validates Location All
def generate_validate_all(p):
    err = ''
    if p.lat < -90.0 or p.lat > 90.0:
        err += '\np.lat is not in range (90.0, -90.0)'
    if p.lng < -180.0 or p.lng > 180.0:
        err += '\np.lng is not in range (180.0, -180.0)'
    return err

HaloWorld · 2022-07-12T04:14:34Z

Hi team.
It Looks like the May update of Protocol Buffers (Changes made in May 2022) broke python tests in ci, according to the description:

Version 4.21.0 is a new major version, following 3.20.1. The new version is based on the upb library

Python upb requires generated code that has been generated from protoc 3.19.0 or newer.

Maybe set the python protobuf dependency version to protobuf>=3.6.1,<3.20?

elliotmjackson

Thanks for your contribution @HaloWorld, I'm quite happy to get this into the main branch for release - i would like to see this put through the test harness if possible

CLAassistant · 2022-10-06T02:28:11Z

All committers have signed the CLA.

HaloWorld · 2022-10-07T03:06:51Z

A check for validate_all has been added to harness, which I have tested locally and passed, and currently has two checks:

Ensure that the result of validate_all is the same as validate
When validation fails, ensure that the first error message of validate_all is the same as the error message of validate

Also, since the ast.unparse function was not officially introduced until python 3.9, I had to bring in a third-party library (astunparse) to ensure compatibility.

HaloWorld · 2022-10-07T03:13:24Z

Here is an example of the functions currently generated by validate_all.

proto(tests/harness/cases/kitchen_sink.proto)

syntax = "proto3";

package tests.harness.cases;
option go_package = "github.com/envoyproxy/protoc-gen-validate/tests/harness/cases/go;cases";
import "validate/validate.proto";

import "google/protobuf/wrappers.proto";
import "google/protobuf/duration.proto";
import "google/protobuf/timestamp.proto";
import "google/protobuf/any.proto";

enum ComplexTestEnum {
    ComplexZero = 0;
    ComplexONE  = 1;
    ComplexTWO  = 2;
}

message ComplexTestMsg {
    string                              const  = 1 [(validate.rules).string.const = "abcd"];
    ComplexTestMsg                      nested = 2;
    int32                               int_const = 3 [(validate.rules).int32.const = 5];
    bool                                bool_const = 4 [(validate.rules).bool.const = false];
    google.protobuf.FloatValue          float_val = 5 [(validate.rules).float.gt = 0];
    google.protobuf.Duration            dur_val = 6 [(validate.rules).duration.lt = {seconds: 17}, (validate.rules).duration.required = true];
    google.protobuf.Timestamp           ts_val = 7 [(validate.rules).timestamp.gt = {seconds: 7}];
    ComplexTestMsg                      another = 8;
    float                               float_const = 9 [(validate.rules).float.lt = 8];
    double                              double_in = 10 [(validate.rules).double = {in: [456.789, 123]}];
    ComplexTestEnum                     enum_const = 11 [(validate.rules).enum.const = 2];
    google.protobuf.Any                 any_val = 12 [(validate.rules).any = {in: ["type.googleapis.com/google.protobuf.Duration"]}];
    repeated google.protobuf.Timestamp  rep_ts_val = 13 [(validate.rules).repeated = { items { timestamp { gte { nanos: 1000000}}}}];
    map<sint32, string>                 map_val = 14 [(validate.rules).map.keys.sint32.lt = 0];
    bytes                               bytes_val = 15 [(validate.rules).bytes.const = "\x00\x99"];
    oneof o {
        option (validate.required) = true;

        string       x = 16;
        int32        y = 17;
    }
}

message KitchenSinkMessage { ComplexTestMsg val = 1; }

functions genereated by validate

# Validates KitchenSinkMessage
def generate_validate(p):
    if _has_field(p, "val"):
        embedded = validate(p.val)
        if embedded is not None:
            return embedded
    return None
# Validates ComplexTestMsg
def generate_validate(p):
    present = False
    if _has_field(p, "x"):
        present = True
    if _has_field(p, "y"):
        present = True
    if not present:
        raise ValidationFailed("Oneof o is required")
    if p.const != "abcd":
        raise ValidationFailed("p.const not equal to abcd")
    if _has_field(p, "nested"):
        embedded = validate(p.nested)
        if embedded is not None:
            return embedded
    if p.int_const != 5:
        raise ValidationFailed("p.int_const not equal to 5")
    if p.bool_const != False:
        raise ValidationFailed("p.bool_const not equal to False")
    if p.HasField("float_val"):
        if p.float_val.value <= 0.0:
            raise ValidationFailed("p.float_val.value is not greater than 0.0")
    if not _has_field(p, "dur_val"):
        raise ValidationFailed("p.dur_val is required.")
    if _has_field(p, "dur_val"):
        dur = p.dur_val.seconds + round((10**-9 * p.dur_val.nanos), 9)
        lt = 17.0
        if dur >= lt:
            raise ValidationFailed("p.dur_val is not lesser than 17.0")
    if _has_field(p, "ts_val"):
        ts = p.ts_val.seconds + round((10**-9 * p.ts_val.nanos), 9)
        gt = 7.0
        if ts <= gt:
            raise ValidationFailed("p.ts_val is not greater than 7.0")
    if _has_field(p, "another"):
        embedded = validate(p.another)
        if embedded is not None:
            return embedded
    if p.float_const >= 8.0:
        raise ValidationFailed("p.float_const is not lesser than 8.0")
    if p.double_in not in [456.789, 123.0]:
        raise ValidationFailed("p.double_in not in [456.789, 123.0]")
    if p.enum_const != 2:
        raise ValidationFailed("p.enum_const not equal to 2")
    if _has_field(p, "any_val"):
        if p.any_val.type_url not in ['type.googleapis.com/google.protobuf.Duration']:
            raise ValidationFailed("p.any_val not in ['type.googleapis.com/google.protobuf.Duration']")
    for item in p.rep_ts_val:
        validate(item)
    for item in p.rep_ts_val:    
        if item:
            ts = item.seconds + round((10**-9 * item.nanos), 9)
            gte = 0.001
            if ts < gte:
                raise ValidationFailed("item is not greater than or equal to 0.001")
        pass
    for key in p.map_val:    
        if key >= 0:
            raise ValidationFailed("key is not lesser than 0")
        pass
    if p.bytes_val != b'\x00\x99':
        raise ValidationFailed("p.bytes_val not equal to b'\x00\x99'")
    return None

functions generated by validate_all

# Validates KitchenSinkMessage All
def generate_validate_all(p):
    err = []
    if _has_field(p, 'val'):
        err += _validate_all(p.val)
    return err
# Validates ComplexTestMsg All
def generate_validate_all(p):
    err = []
    present = False
    if _has_field(p, 'x'):
        present = True
    if _has_field(p, 'y'):
        present = True
    if (not present):
        err.append('Oneof o is required')
    if (p.const != 'abcd'):
        err.append('p.const not equal to abcd')
    if _has_field(p, 'nested'):
        err += _validate_all(p.nested)
    if (p.int_const != 5):
        err.append('p.int_const not equal to 5')
    if (p.bool_const != False):
        err.append('p.bool_const not equal to False')
    if p.HasField('float_val'):
        if (p.float_val.value <= 0.0):
            err.append('p.float_val.value is not greater than 0.0')
    if (not _has_field(p, 'dur_val')):
        err.append('p.dur_val is required.')
    if _has_field(p, 'dur_val'):
        dur = (p.dur_val.seconds + round(((10 ** (- 9)) * p.dur_val.nanos), 9))
        lt = 17.0
        if (dur >= lt):
            err.append('p.dur_val is not lesser than 17.0')
    if _has_field(p, 'ts_val'):
        ts = (p.ts_val.seconds + round(((10 ** (- 9)) * p.ts_val.nanos), 9))
        gt = 7.0
        if (ts <= gt):
            err.append('p.ts_val is not greater than 7.0')
    if _has_field(p, 'another'):
        err += _validate_all(p.another)
    if (p.float_const >= 8.0):
        err.append('p.float_const is not lesser than 8.0')
    if (p.double_in not in [456.789, 123.0]):
        err.append('p.double_in not in [456.789, 123.0]')
    if (p.enum_const != 2):
        err.append('p.enum_const not equal to 2')
    if _has_field(p, 'any_val'):
        if (p.any_val.type_url not in ['type.googleapis.com/google.protobuf.Duration']):
            err.append("p.any_val not in ['type.googleapis.com/google.protobuf.Duration']")
    for item in p.rep_ts_val:
        err += _validate_all(item)
    for item in p.rep_ts_val:
        if item:
            ts = (item.seconds + round(((10 ** (- 9)) * item.nanos), 9))
            gte = 0.001
            if (ts < gte):
                err.append('item is not greater than or equal to 0.001')
        pass
    for key in p.map_val:
        if (key >= 0):
            err.append('key is not lesser than 0')
        pass
    if (p.bytes_val != b'\x00\x99'):
        err.append("p.bytes_val not equal to b'\x00\x99'")
    return err

HaloWorld

A check for validate_all has been added to harness, which I have tested locally and passed, and currently has two checks:

Ensure that the result of validate_all is the same as validate
When validation fails, ensure that the first error message of validate_all is the same as the error message of validate

Also, since the ast.unparse function was not officially introduced until python 3.9, I had to bring in a third-party library (astunparse) to ensure compatibility.

elliotmjackson · 2022-10-18T21:28:43Z

@HaloWorld, CI is unhappy with

ERROR: /go/src/github.com/envoyproxy/protoc-gen-validate/python/protoc_gen_validate/validator.py Imports are incorrectly sorted and/or formatted.

I'm interested in improving the dev experience on this repo, just out of interest did this not get picked up on your local?

HaloWorld

fix isort lint

HaloWorld · 2022-10-19T08:53:21Z

Sorry, I was previously unaware to the point that I could run ci directly locally with the make ci command, so I missed the lint check. Now I have fixed this issue and passed the ci test locally.

I'm relying on the Development section in README, I don't have bazel locally and I don't want to install bazel, so the only option for me is to build a docker image for testing.

docker build -t lyft/protoc-gen-validate .

Then I get to the container shell with the following command:

docker run --rm -ti -v $(pwd):/go/src/github.com/envoyproxy/protoc-gen-validate --entrypoint=/bin/bash lyft/protoc-gen-validate

Then I ran the make build command and got the following result:

I followed the prompt and added the environment variable GOFLAGS="-buildvcs=false" with the following command:

docker run --rm -ti -e GOFLAGS="-buildvcs=false" -v $(pwd):/go/src/github.com/envoyproxy/protoc-gen-validate --entrypoint=/bin/bash lyft/protoc-gen-validate

Then ran make build again, after which I ran the python tests with:

bazel test //tests/harness/executor:executor_python_test

After the test passed, I committed and pushed the code.

There are two things that I find most troublesome in the whole development process.

finding the entry for python test. I am not very familiar with bazel and go syntax, so this took me some time. Maybe explain how to run different tests for developers of different languages? If I just modified the python code, I don't think running the whole test with bazel is necessary, even if I need to run the whole test, I can run it before committing the PR.
add test cases against validate_all. Honestly, the current test set is not complete, it only guarantees that validate_all behaves the same as validate, but it does not guarantee that validate_all can find all validation exceptions. I know that the go version of PGV implements validateAll, but I can only see the first error message compared in the go test code(harnesss.go):

Signed-off-by: puyj <pyjapple@gmail.com>

* Refactor _Transformer to make the code structure more intuitive Split `_Transformer` into 7 sub-Transformers (in execution order). When the python version is lower than 3.9, use the `unparse` function of the third-party library `astunparse` to replace the `unparse` function of the standard library `ast`

* Add test for validate_all to Python harness The result of `validate` is guaranteed to be the same as the result of `validate_all`. When validation fails, the error message of `validate` is guaranteed to be the same as the first error message of `validate_all`.

elliotmjackson

LGTM thanks for your contribution @HaloWorld

@HaloWorld

Highlighted by our valued contributor @HaloWorld in #606, the `development` section of our readme needs some changes. This really does need a lot more work but should do the trick for now.

dkunitsk · 2022-11-15T20:47:59Z

Wanted to chime in to say thank you. This is a great change.

HaloWorld force-pushed the main branch from 0b68f46 to f8ec5f6 Compare June 21, 2022 11:48

HaloWorld changed the title ~~[Python] Support for validate_all function~~ Python: Support for validate_all function Jul 12, 2022

elliotmjackson added Enhancement Extend or improve functionality Python Python language support labels Sep 15, 2022

elliotmjackson suggested changes Sep 19, 2022

View reviewed changes

HaloWorld requested a review from elliotmjackson October 7, 2022 02:47

HaloWorld force-pushed the main branch from a44fcc2 to ddcd965 Compare October 17, 2022 03:00

HaloWorld commented Oct 18, 2022

View reviewed changes

HaloWorld commented Oct 19, 2022

View reviewed changes

elliotmjackson mentioned this pull request Oct 19, 2022

Improve Development section of readme #689

Merged

HaloWorld added 6 commits October 20, 2022 10:23

[Python] Implement validate_all function

2c1652b

Signed-off-by: puyj <pyjapple@gmail.com>

add _validate_all for embedded messages

4ac34bf

Signed-off-by: puyj <pyjapple@gmail.com>

Update example

0984e7e

fix lint

ef4f7b2

HaloWorld force-pushed the main branch from 9837ffb to ef4f7b2 Compare October 20, 2022 02:23

elliotmjackson enabled auto-merge (squash) October 25, 2022 15:52

elliotmjackson approved these changes Oct 25, 2022

View reviewed changes

elliotmjackson merged commit 65fff73 into bufbuild:main Oct 25, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python: Support for `validate_all` function #606

Python: Support for `validate_all` function #606

HaloWorld commented Jun 21, 2022

HaloWorld commented Jul 12, 2022

elliotmjackson left a comment

CLAassistant commented Oct 6, 2022 •

edited

HaloWorld commented Oct 7, 2022

HaloWorld commented Oct 7, 2022 •

edited

HaloWorld left a comment •

edited

elliotmjackson commented Oct 18, 2022

HaloWorld left a comment

HaloWorld commented Oct 19, 2022

elliotmjackson left a comment

dkunitsk commented Nov 15, 2022

Python: Support for validate_all function #606

Python: Support for validate_all function #606

Conversation

HaloWorld commented Jun 21, 2022

HaloWorld commented Jul 12, 2022

elliotmjackson left a comment

Choose a reason for hiding this comment

CLAassistant commented Oct 6, 2022 • edited

HaloWorld commented Oct 7, 2022

HaloWorld commented Oct 7, 2022 • edited

HaloWorld left a comment • edited

Choose a reason for hiding this comment

elliotmjackson commented Oct 18, 2022

HaloWorld left a comment

Choose a reason for hiding this comment

HaloWorld commented Oct 19, 2022

elliotmjackson left a comment

Choose a reason for hiding this comment

dkunitsk commented Nov 15, 2022

Python: Support for `validate_all` function #606

Python: Support for `validate_all` function #606

CLAassistant commented Oct 6, 2022 •

edited

HaloWorld commented Oct 7, 2022 •

edited

HaloWorld left a comment •

edited