Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python: Support for validate_all function #606

Merged
merged 6 commits into from Oct 25, 2022

Conversation

HaloWorld
Copy link
Contributor

This change implements the validate_all function through the ast module in the python standard library, which allows PGV to return all validate error messages at once.

//entities.proto
message Person {
  uint64 id    = 1 [(validate.rules).uint64.gt    = 999];

  string email = 2 [(validate.rules).string.email = true];

  string name  = 3 [(validate.rules).string = {
                      pattern:   "^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$",
                      max_bytes: 256,
                   }];

  Location home = 4 [(validate.rules).message.required = true];

  message Location {
    double lat = 1 [(validate.rules).double = { gte: -90,  lte: 90 }];
    double lng = 2 [(validate.rules).double = { gte: -180, lte: 180 }];
  }
}
from entities_pb2 import Person
from protoc_gen_validate.validator import validate, ValidationFailed, print_validate, validate_all


p = Person(id=1000, name="Foo", email=".bar@bad.co", home=Person.Location(lat=0, lng=-999))
try:
    validate(p)
except ValidationFailed as err:
    print("validate(p):", err)
print("=" * 79)
try:
    validate_all(p)
except ValidationFailed as err:
    print("validate_all(p):", err)

print("=" * 79)
print(print_validate())

Result:

validate(p): p.email is not a valid email
===============================================================================
validate_all(p): 
p.email is not a valid email
p.name pattern does not match ^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$
p.lng is not in range (180.0, -180.0)
===============================================================================
# Validates Person
def generate_validate(p):
    if p.id <= 999:
        raise ValidationFailed("p.id is not greater than 999")
    if not _validateEmail(p.email):
        raise ValidationFailed("p.email is not a valid email")
    if byte_len(p.name) > 256:
        raise ValidationFailed("p.name length is greater than 256")
    if re.search(r'^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$', p.name) is None:
        raise ValidationFailed("p.name pattern does not match ^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$")
    if not _has_field(p, "home"):
        raise ValidationFailed("home is required.")
    if _has_field(p, "home"):
        embedded = validate(p.home)
        if embedded is not None:
            return embedded
    return None
# Validates Person All
def generate_validate_all(p):
    err = ''
    if p.id <= 999:
        err += '\np.id is not greater than 999'
    if not _validateEmail(p.email):
        err += '\np.email is not a valid email'
    if byte_len(p.name) > 256:
        err += '\np.name length is greater than 256'
    if re.search('^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$', p.name) is None:
        err += '\np.name pattern does not match ^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$'
    if not _has_field(p, 'home'):
        err += '\nhome is required.'
    if _has_field(p, 'home'):
        err += _validate_all(p.home)
    return err
# Validates Location All
def generate_validate_all(p):
    err = ''
    if p.lat < -90.0 or p.lat > 90.0:
        err += '\np.lat is not in range (90.0, -90.0)'
    if p.lng < -180.0 or p.lng > 180.0:
        err += '\np.lng is not in range (180.0, -180.0)'
    return err

@HaloWorld
Copy link
Contributor Author

Hi team.
It Looks like the May update of Protocol Buffers (Changes made in May 2022) broke python tests in ci, according to the description:

  • Version 4.21.0 is a new major version, following 3.20.1. The new version is based on the upb library
  • Python upb requires generated code that has been generated from protoc 3.19.0 or newer.

image

Maybe set the python protobuf dependency version to protobuf>=3.6.1,<3.20?

@HaloWorld HaloWorld changed the title [Python] Support for validate_all function Python: Support for validate_all function Jul 12, 2022
@elliotmjackson elliotmjackson added Enhancement Extend or improve functionality Python Python language support labels Sep 15, 2022
Copy link
Contributor

@elliotmjackson elliotmjackson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution @HaloWorld, I'm quite happy to get this into the main branch for release - i would like to see this put through the test harness if possible

@CLAassistant
Copy link

CLAassistant commented Oct 6, 2022

CLA assistant check
All committers have signed the CLA.

@HaloWorld
Copy link
Contributor Author

A check for validate_all has been added to harness, which I have tested locally and passed, and currently has two checks:

  1. Ensure that the result of validate_all is the same as validate
  2. When validation fails, ensure that the first error message of validate_all is the same as the error message of validate

Also, since the ast.unparse function was not officially introduced until python 3.9, I had to bring in a third-party library (astunparse) to ensure compatibility.

@HaloWorld
Copy link
Contributor Author

HaloWorld commented Oct 7, 2022

Here is an example of the functions currently generated by validate_all.

proto(tests/harness/cases/kitchen_sink.proto)

syntax = "proto3";

package tests.harness.cases;
option go_package = "github.com/envoyproxy/protoc-gen-validate/tests/harness/cases/go;cases";
import "validate/validate.proto";

import "google/protobuf/wrappers.proto";
import "google/protobuf/duration.proto";
import "google/protobuf/timestamp.proto";
import "google/protobuf/any.proto";

enum ComplexTestEnum {
    ComplexZero = 0;
    ComplexONE  = 1;
    ComplexTWO  = 2;
}

message ComplexTestMsg {
    string                              const  = 1 [(validate.rules).string.const = "abcd"];
    ComplexTestMsg                      nested = 2;
    int32                               int_const = 3 [(validate.rules).int32.const = 5];
    bool                                bool_const = 4 [(validate.rules).bool.const = false];
    google.protobuf.FloatValue          float_val = 5 [(validate.rules).float.gt = 0];
    google.protobuf.Duration            dur_val = 6 [(validate.rules).duration.lt = {seconds: 17}, (validate.rules).duration.required = true];
    google.protobuf.Timestamp           ts_val = 7 [(validate.rules).timestamp.gt = {seconds: 7}];
    ComplexTestMsg                      another = 8;
    float                               float_const = 9 [(validate.rules).float.lt = 8];
    double                              double_in = 10 [(validate.rules).double = {in: [456.789, 123]}];
    ComplexTestEnum                     enum_const = 11 [(validate.rules).enum.const = 2];
    google.protobuf.Any                 any_val = 12 [(validate.rules).any = {in: ["type.googleapis.com/google.protobuf.Duration"]}];
    repeated google.protobuf.Timestamp  rep_ts_val = 13 [(validate.rules).repeated = { items { timestamp { gte { nanos: 1000000}}}}];
    map<sint32, string>                 map_val = 14 [(validate.rules).map.keys.sint32.lt = 0];
    bytes                               bytes_val = 15 [(validate.rules).bytes.const = "\x00\x99"];
    oneof o {
        option (validate.required) = true;

        string       x = 16;
        int32        y = 17;
    }
}

message KitchenSinkMessage { ComplexTestMsg val = 1; }

functions genereated by validate

# Validates KitchenSinkMessage
def generate_validate(p):
    if _has_field(p, "val"):
        embedded = validate(p.val)
        if embedded is not None:
            return embedded
    return None
# Validates ComplexTestMsg
def generate_validate(p):
    present = False
    if _has_field(p, "x"):
        present = True
    if _has_field(p, "y"):
        present = True
    if not present:
        raise ValidationFailed("Oneof o is required")
    if p.const != "abcd":
        raise ValidationFailed("p.const not equal to abcd")
    if _has_field(p, "nested"):
        embedded = validate(p.nested)
        if embedded is not None:
            return embedded
    if p.int_const != 5:
        raise ValidationFailed("p.int_const not equal to 5")
    if p.bool_const != False:
        raise ValidationFailed("p.bool_const not equal to False")
    if p.HasField("float_val"):
        if p.float_val.value <= 0.0:
            raise ValidationFailed("p.float_val.value is not greater than 0.0")
    if not _has_field(p, "dur_val"):
        raise ValidationFailed("p.dur_val is required.")
    if _has_field(p, "dur_val"):
        dur = p.dur_val.seconds + round((10**-9 * p.dur_val.nanos), 9)
        lt = 17.0
        if dur >= lt:
            raise ValidationFailed("p.dur_val is not lesser than 17.0")
    if _has_field(p, "ts_val"):
        ts = p.ts_val.seconds + round((10**-9 * p.ts_val.nanos), 9)
        gt = 7.0
        if ts <= gt:
            raise ValidationFailed("p.ts_val is not greater than 7.0")
    if _has_field(p, "another"):
        embedded = validate(p.another)
        if embedded is not None:
            return embedded
    if p.float_const >= 8.0:
        raise ValidationFailed("p.float_const is not lesser than 8.0")
    if p.double_in not in [456.789, 123.0]:
        raise ValidationFailed("p.double_in not in [456.789, 123.0]")
    if p.enum_const != 2:
        raise ValidationFailed("p.enum_const not equal to 2")
    if _has_field(p, "any_val"):
        if p.any_val.type_url not in ['type.googleapis.com/google.protobuf.Duration']:
            raise ValidationFailed("p.any_val not in ['type.googleapis.com/google.protobuf.Duration']")
    for item in p.rep_ts_val:
        validate(item)
    for item in p.rep_ts_val:    
        if item:
            ts = item.seconds + round((10**-9 * item.nanos), 9)
            gte = 0.001
            if ts < gte:
                raise ValidationFailed("item is not greater than or equal to 0.001")
        pass
    for key in p.map_val:    
        if key >= 0:
            raise ValidationFailed("key is not lesser than 0")
        pass
    if p.bytes_val != b'\x00\x99':
        raise ValidationFailed("p.bytes_val not equal to b'\x00\x99'")
    return None

functions generated by validate_all

# Validates KitchenSinkMessage All
def generate_validate_all(p):
    err = []
    if _has_field(p, 'val'):
        err += _validate_all(p.val)
    return err
# Validates ComplexTestMsg All
def generate_validate_all(p):
    err = []
    present = False
    if _has_field(p, 'x'):
        present = True
    if _has_field(p, 'y'):
        present = True
    if (not present):
        err.append('Oneof o is required')
    if (p.const != 'abcd'):
        err.append('p.const not equal to abcd')
    if _has_field(p, 'nested'):
        err += _validate_all(p.nested)
    if (p.int_const != 5):
        err.append('p.int_const not equal to 5')
    if (p.bool_const != False):
        err.append('p.bool_const not equal to False')
    if p.HasField('float_val'):
        if (p.float_val.value <= 0.0):
            err.append('p.float_val.value is not greater than 0.0')
    if (not _has_field(p, 'dur_val')):
        err.append('p.dur_val is required.')
    if _has_field(p, 'dur_val'):
        dur = (p.dur_val.seconds + round(((10 ** (- 9)) * p.dur_val.nanos), 9))
        lt = 17.0
        if (dur >= lt):
            err.append('p.dur_val is not lesser than 17.0')
    if _has_field(p, 'ts_val'):
        ts = (p.ts_val.seconds + round(((10 ** (- 9)) * p.ts_val.nanos), 9))
        gt = 7.0
        if (ts <= gt):
            err.append('p.ts_val is not greater than 7.0')
    if _has_field(p, 'another'):
        err += _validate_all(p.another)
    if (p.float_const >= 8.0):
        err.append('p.float_const is not lesser than 8.0')
    if (p.double_in not in [456.789, 123.0]):
        err.append('p.double_in not in [456.789, 123.0]')
    if (p.enum_const != 2):
        err.append('p.enum_const not equal to 2')
    if _has_field(p, 'any_val'):
        if (p.any_val.type_url not in ['type.googleapis.com/google.protobuf.Duration']):
            err.append("p.any_val not in ['type.googleapis.com/google.protobuf.Duration']")
    for item in p.rep_ts_val:
        err += _validate_all(item)
    for item in p.rep_ts_val:
        if item:
            ts = (item.seconds + round(((10 ** (- 9)) * item.nanos), 9))
            gte = 0.001
            if (ts < gte):
                err.append('item is not greater than or equal to 0.001')
        pass
    for key in p.map_val:
        if (key >= 0):
            err.append('key is not lesser than 0')
        pass
    if (p.bytes_val != b'\x00\x99'):
        err.append("p.bytes_val not equal to b'\x00\x99'")
    return err

Copy link
Contributor Author

@HaloWorld HaloWorld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A check for validate_all has been added to harness, which I have tested locally and passed, and currently has two checks:

  • Ensure that the result of validate_all is the same as validate
  • When validation fails, ensure that the first error message of validate_all is the same as the error message of validate

Also, since the ast.unparse function was not officially introduced until python 3.9, I had to bring in a third-party library (astunparse) to ensure compatibility.

@elliotmjackson
Copy link
Contributor

@HaloWorld, CI is unhappy with

ERROR: /go/src/github.com/envoyproxy/protoc-gen-validate/python/protoc_gen_validate/validator.py Imports are incorrectly sorted and/or formatted.

I'm interested in improving the dev experience on this repo, just out of interest did this not get picked up on your local?

Copy link
Contributor Author

@HaloWorld HaloWorld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix isort lint

@HaloWorld
Copy link
Contributor Author

Sorry, I was previously unaware to the point that I could run ci directly locally with the make ci command, so I missed the lint check. Now I have fixed this issue and passed the ci test locally.

I'm relying on the Development section in README, I don't have bazel locally and I don't want to install bazel, so the only option for me is to build a docker image for testing.

docker build -t lyft/protoc-gen-validate .

Then I get to the container shell with the following command:

docker run --rm -ti -v $(pwd):/go/src/github.com/envoyproxy/protoc-gen-validate --entrypoint=/bin/bash lyft/protoc-gen-validate

Then I ran the make build command and got the following result:
build

I followed the prompt and added the environment variable GOFLAGS="-buildvcs=false" with the following command:

docker run --rm -ti -e GOFLAGS="-buildvcs=false" -v $(pwd):/go/src/github.com/envoyproxy/protoc-gen-validate --entrypoint=/bin/bash lyft/protoc-gen-validate

Then ran make build again, after which I ran the python tests with:

bazel test //tests/harness/executor:executor_python_test

image

After the test passed, I committed and pushed the code.


There are two things that I find most troublesome in the whole development process.

  1. finding the entry for python test. I am not very familiar with bazel and go syntax, so this took me some time. Maybe explain how to run different tests for developers of different languages? If I just modified the python code, I don't think running the whole test with bazel is necessary, even if I need to run the whole test, I can run it before committing the PR.

  2. add test cases against validate_all. Honestly, the current test set is not complete, it only guarantees that validate_all behaves the same as validate, but it does not guarantee that validate_all can find all validation exceptions. I know that the go version of PGV implements validateAll, but I can only see the first error message compared in the go test code(harnesss.go):

Signed-off-by: puyj <pyjapple@gmail.com>
Signed-off-by: puyj <pyjapple@gmail.com>
* Refactor _Transformer to make the code structure more intuitive

Split `_Transformer` into 7 sub-Transformers (in execution order).

When the python version is lower than 3.9, use the `unparse` function of the third-party library `astunparse` to replace the `unparse` function of the standard library `ast`
* Add test for validate_all to Python harness

The result of `validate` is guaranteed to be the same as the result of `validate_all`.

When validation fails, the error message of `validate` is guaranteed to be the same as the first error message of `validate_all`.
Copy link
Contributor

@elliotmjackson elliotmjackson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks for your contribution @HaloWorld

@elliotmjackson elliotmjackson merged commit 65fff73 into bufbuild:main Oct 25, 2022
elliotmjackson added a commit that referenced this pull request Oct 26, 2022
Highlighted by our valued contributor @HaloWorld in #606, the
`development` section of our readme needs some changes. This really does
need a lot more work but should do the trick for now.
@dkunitsk
Copy link
Contributor

Wanted to chime in to say thank you. This is a great change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Extend or improve functionality Python Python language support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants