Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copy metadata associated with a node in the optimized result #6593

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

ashutosh-narkar
Copy link
Member

🚧 Fixes: #6529

//# scope: document
//# schemas:
//# - input.foo: schema.string
//p { input.foo = "str" }`}},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I commented out this test. Since we now build the annotation set during module parsing, this test fails during the parse phase. I guess the question is should a module with incorrect metadata should pass module parsing when say ProcessAnnotation is true?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've uncommented this test case. I've updated how the annotations are attached to the rule w/o building the annotation set during module parsing.

ast/parser_ext.go Outdated Show resolved Hide resolved
Comment on lines 1191 to 1193

# METADATA
# entrypoint: true
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have been omitted in the optimized module since the optimized package was an entrypoint?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, getting an extra entrypoint at test.p.foo isn't ideal, but I don't think it's a deal-breaker. What's likely worse is that we've dropped the probably expected test.p entrypoint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if it's easily achieved, but maybe we could make it so that we don't shift path components from the rule to the package if there are annotations on the rule. I.e. we should produce the support module:

package test


# METADATA
# entrypoint: true
p.foo contains __local1__1 if {
	__local1__1 = input.v
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the code so that we no longer add an extra entrypoint. We currently drop the existing entrypoint. I'll see if we can change that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are maintaining current behavior here in terms of not annotating a shifted optimized entrypoint. Can we handle that as part of a separate change? We can probably add a note in the docs for now. WDYT @johanfylling?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

Would be good if we could inform the user about this, somehow. We don't have much in ways of logging when building a bundle, right?
Maybe we could leave a comment in the original source file if we've dropped an entrypoint. E.g.:

# NOTE: Entrypoint foo/bar/baz has been dropped due to build optimization.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good if we could inform the user about this, somehow. We don't have much in ways of logging when building a bundle, right?

We already do that if you run opa build with the --debug flag. So for the above policy we log a message like

optimizer: entrypoint: data.test.p: discard due to self-reference

WDYT?

@@ -908,6 +967,8 @@ func (o *optimizer) Do(ctx context.Context) error {
// because otherwise the optimization outputs (e.g., support rules) would have to
// merged somehow. Instead of dealing with that, just run the optimizations in the
// order the user supplied the entrypoints in.

var flattenedAnnotations []*ast.AnnotationsRef
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably check the optimization level maybe and only do this for O=1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be ok to do this w/o any explicit check.


if len(existingAnnotations) == 0 {
for _, an := range rule.Annotations {
w.blankLine()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will sometimes add an additional blank line. From this part in the code it's not possible to figure out what the previous line is unless we read the buffer.

topdown/eval.go Outdated Show resolved Hide resolved
Copy link
Contributor

@johanfylling johanfylling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the chosen approach is sound 👍 .
I had expected attaching annotations to rules (and packages) would simplify annotation/comment processing in places, but maybe that's not fully possible without breaking changes. E.g. we could maybe get rid of the module's annotations, but that would be a breaking change.

rule.Annotations = append(rule.Annotations, a)
j = i
found = true
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if a rule has multiple sets of annotations, e.g. rule- and document-scope? Aren't all but the first dropped, to possibly be attached to the next rule?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should add a test for this function, asserting a couple of edge cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated this logic and added test cases.

ast/annotations.go Outdated Show resolved Hide resolved
ast/compile.go Show resolved Hide resolved
ast/policy.go Outdated
Body Body `json:"body"`
Else *Rule `json:"else,omitempty"`
Location *Location `json:"location,omitempty"`
Annotations []*Annotations `json:"-"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would exposing the annotations in the AST-json (partially) solve #6280?
@anderseknert, any thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, seems like a good opportunity to kill two birds with one stone 👍 We could always gate it behind one of the many JSON options we have available now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're exposing the annotations in the AST-json.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent!

Comment on lines 1191 to 1193

# METADATA
# entrypoint: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, getting an extra entrypoint at test.p.foo isn't ideal, but I don't think it's a deal-breaker. What's likely worse is that we've dropped the probably expected test.p entrypoint.

cmd/build_test.go Outdated Show resolved Hide resolved
for _, rule := range mf.Parsed.Rules {
if p.Equal(rule.Ref()) {
annotations = append(annotations, annotation)
found = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A rule might have been dropped from the bundle through optimization here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand. We run this on the optimized bundle. Can you please elaborate?

comments = w.insertComments(comments, pkg.Location)

w.startLine()

if len(parsedAnnotations) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When parsing annotations, corresponding comments are dropped?
There is no risk of (len(annotations) > 0 && len(parsedAnnotations) > 0) == true?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When parsing annotations, corresponding comments are dropped?

While parsing the annotations the corresponding comments are NOT dropped. In cases where the policies are optimized, we now attach the annotations to the optimized rule and module AST. But in this case there are NO comments on the optimized module (ie. the comments from the original module are not carried over to the optimized module but NOW annotations are). So in my testing I haven't come across this condition (len(annotations) > 0 && len(parsedAnnotations) > 0) == true.

Can you imagine a scenario where this is likely?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, since the support module should never have any comments, this should be fine, I think 👍.

w.writeComments(an.Comments())
}
}

comments = w.insertComments(comments, rule.Location)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe annotation parsing will break if there isn't an empty line separating annotations from trailing comments. w.insertComments() ensures there is a blank line between the comments and any annotations written above?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I guess we can check for trailing comments and add a blank line after writing out the annotations. Updated the code.

topdown/eval.go Outdated Show resolved Hide resolved
@ashutosh-narkar ashutosh-narkar force-pushed the move-metadata-optimize branch 2 times, most recently from 4badf68 to 9f701a7 Compare February 22, 2024 23:28
@ashutosh-narkar ashutosh-narkar changed the title WIP: Copy metadata associated with a node in the optimized result Copy metadata associated with a node in the optimized result Feb 27, 2024
Copy link
Contributor

@johanfylling johanfylling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few thoughts/comments.

var found bool
for i, a := range cpy {
if rule.Loc().Row > a.Location.Row {
if rule.Ref().Equal(a.GetTargetPath()) && (a.Scope == annotationScopeRule || a.Scope == annotationScopeDocument) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we require the annotation's node ref to match the rule's ref, is there actually a need in asserting the annotation's scope, as we already know it belongs to the rule? If we skipped this check, we'd even be a bit more future proof here if we ever were to add new scopes (albeit unlikely).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right we probably don't need that check. Updated code.

if rule.Ref().Equal(a.GetTargetPath()) && (a.Scope == annotationScopeRule || a.Scope == annotationScopeDocument) {
rule.Annotations = append(rule.Annotations, a)

if a.Scope == annotationScopeRule {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I get this right?:

  • if there is a document-scoped annotation block, and it's declared above a rule-scope annotation block, it'll get removed from cpy.
  • if it's instead declared after all rule-scoped annotations, it'll be left in cpy?

This is probably not an actual issue, as we also match the rule ref and annotation target path, so any straggling document-scoped annotations won't be attached to any succeeding rules with unmatched ref/path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anderseknert, is there a regal rule for having multiple rule-scope annotation blocks for a rule? If not, might be a good candidate for a new linting rule, as this is a likely user mistake.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there is a document-scoped annotation block, and it's declared above a rule-scope annotation block, it'll get removed from cpy.
if it's instead declared after all rule-scoped annotations, it'll be left in cpy?
This is probably not an actual issue, as we also match the rule ref and annotation target path, so any straggling document-scoped annotations won't be attached to any succeeding rules with unmatched ref/path.

My thinking here is: We attach a rule-scope annotation block that precedes the rule to that specific rule. That annotation will NOT apply to any other rule. So we remove it from cpy. We keep the document-scope annotations as they could apply to other rules as well. Does this make sense?

}
}

func (a *Annotations) GetRule() *Rule {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there situations where a == nil? attachRuleAnnotations() seems to rely on this fact. Could we simplify that function by simply iterating over all the module's annotations, call GetRule() on them, and attach the annotation to any non-nil return value?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, never mind. I just realized we need to attach document-scoped annotations to all rules with the same path, and n would only represent one of those.

I suppose we have a way to make sure that we don't emit the document-scope annotation block for each moved rule when rewriting 🤔 . Or maybe that makes little difference, as they'll be identical anyways ..

a2 := []*Annotations{
{
Description: "doc",
Scope: "document",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok, this is why we don't remove document-scope annotations from cpy in attachRuleAnnotations() 👍 .

import rego.v1

# METADATA
# scope: document
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be completionists, should we add tests for when the doc-scope annotations are preceded by rule-scope annotations for the same rule?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test case for this.

}
}

if module := o.getSupportForEntrypoint(pq.Queries, e, resultsym, flattenedAnnotations); module != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we did the call to getSupportForEntrypoint() before the above, would we still need to attach annotations inside that function? Or would reordering screw something up?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotations wouldn't be attached to the new support rule, though .. so perhaps that's not an option 🤔.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotations wouldn't be attached to the new support rule, though

Yes.

rule := &ast.Rule{ // TODO(sr): use RefHead instead?
Head: ast.NewHead(name, nil, resultsym),
Body: query,
Annotations: ruleAnnotations,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of homogeneity, should we attach the support rule to the annotations' node pointer here?

"optimized/test.rego": `
package test

foo = __result__ { __result__ = {"bar": {"p": true}} }`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, we wouldn't do this kind of reformatting if the rule has annotations, right? We should perhaps create a ticket for fixing this.

# description: r
r = true { t with q as {3} }`,
},
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For completeness sake, we should add tests for the document and subpackages scopes too.

I wonder if the subpackages scope will cause problems if it's emitted for each new support module, as that might cause a redeclaration error .. 🤔.
That might however only be the case if the annotations differ. I forget.

comments = w.insertComments(comments, pkg.Location)

w.startLine()

if len(parsedAnnotations) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, since the support module should never have any comments, this should be fine, I think 👍.

Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Copy link

stale bot commented Apr 10, 2024

This pull request has been automatically marked as stale because it has not had any activity in the last 30 days.

@stale stale bot added the inactive label Apr 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Building bundle with optimization breaks metadata annotation parsing
3 participants