Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor protoparse, fixes multiple issues #316

Merged
merged 2 commits into from Apr 27, 2020
Merged

Conversation

jhump
Copy link
Owner

@jhump jhump commented Apr 26, 2020

  • defers a lot of validation to after the AST is fully constructed
  • in particular, defers tag validation to after message options are parsed in order to know if it has message_set_wire_format option, which impacts allowed tag range
  • fixes issues with very large constant numbers (that overflow uint64 or underflow int64)
  • refactors grammar around option names and fixes issue where extensions on message options aren't parsed correctly
  • fixes issues related to groups in oneofs
  • fixes issues with reserved name validation
  • adds some util methods to shrink the boiler-plate for error creation and error handling (errorWithPos(...), (*errorHandler) handleErrWithPos(...))
  • breaks up monolithic parser.go into three files: parser.go, validation.go, and descriptor_protos.go
  • adds numerous new validation test cases to catch various issues that were fixed

Fixes #275, #307, #308, #312, #313, #314, #315

// MaxTag is the maximum allowed tag number for a field.
MaxTag = 536870911 // 2^29 - 1
// MaxNormalTag is the maximum allowed tag number for a field in a normal message.
MaxNormalTag = 536870911 // 2^29 - 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I might switch this from MaxNormalTag/MaxTag to MaxTag/MaxTagMessageSetWireFormat since the message set max tag is the edge case, but more a matter of preference

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted a name like MaxTag to be the actual maximum value -- the greater of the two -- because I thought that would be more clear.

Maybe MaxNormalTag, MaxMessageSetTag, and MaxTag = MaxMessageSetTag?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea that works! Or keep as is, either way :-)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to adding GetMaxTag(messageSet bool), I also left these other constants, but tweaked them per discussion above.

@jhump jhump force-pushed the jh/fix-messageset-tag-range branch from 9e5baef to a643a41 Compare April 26, 2020 04:25
- defers a lot of validation to after the AST is fully constructed
- in particular, defers tag validation to after message options are parsed
  in order to know if it has message_set_wire_format option, which impacts
  allowed tag range
- fixes issues with very large constant numbers (that overflow uint64 or
  underflow int64)
- refactors grammar around option names and fixes issue where extensions
  on message options aren't parsed correctly
- fixes issues related to groups in oneofs
- fixes issues with reserved name validation
- adds some util methods to shrink the boiler-plate for error creation
  and error handling
- breaks up monolithic parser.go into three files: parser.go, validation.go
  and descriptor_protos.go
- adds numerous new validation test cases to catch various issues that
  were fixed
@jhump jhump force-pushed the jh/fix-messageset-tag-range branch from a643a41 to 31226ba Compare April 26, 2020 04:38
Copy link
Contributor

@bufdev bufdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did a light pass - all nits effectively, nothing important. Figured a second look on this couldn't hurt.

@@ -508,6 +508,11 @@ func (flb *FieldBuilder) buildProto(path []int32, sourceInfo *dpb.SourceCodeInfo
def = proto.String(flb.Default)
}

// normal (non-messageset) messages have lower tag limit, so enforce that here
if !isMessageSet && flb.number > internal.MaxNormalTag {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to enforce the isMessageSet case as well here? I'm sure you've thought of this and there's likely a reason you're not, but just checking.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code already validates that tags are never > internal.MaxTag when fields are created. This is a deferred check just for the tighter range imposed on non-messageset fields, which can't be performed until we know the field's context (e.g. is it part of a messageset or not).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this to just use internal.GetMaxTag so it's simpler to follow along

@@ -729,17 +761,46 @@ type extensionRangeNode struct {
type rangeNode struct {
basicCompositeNode
stNode, enNode node
st, en int32
enMax bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: endMax?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm not sure why I used st and en instead of start and end. I can do a quick rename for this and the two fields above.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed

dpb "github.com/golang/protobuf/protoc-gen-go/descriptor"
)

func basicValidate(res *parseResult, containsErrors bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: validateBasic?

Copy link
Owner Author

@jhump jhump Apr 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure... this was already called this FWIW (just moved from parser.go into its own file).

Copy link
Contributor

@bufdev bufdev Apr 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yea, I was going to add the caveat that this might already have been there - people commenting on something that was just moved around as part of a refactoring PR is a pet peeve of mine, so feel free to ignore this :-)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed

@@ -716,6 +716,10 @@ func (p *Printer) printMessageBody(md *desc.MessageDescriptor, mf *dynamic.Messa
}

skip := map[interface{}]bool{}
maxTag := int32(internal.MaxNormalTag)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super nit: You might want to turn this into a shared helper function GetMaxTag(isMessageSet bool) instead of having the public consts

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good idea.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@jhump
Copy link
Owner Author

jhump commented Apr 27, 2020

Travis has been failing to report build status quite a lot lately :(

Build is good: https://travis-ci.org/github/jhump/protoreflect/builds/679896778

Here goes nothing'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

difference to protoc: tag number validation overly strict
2 participants