Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

built-ins: add object.subset() builtin #4751

Merged

Conversation

charlesdaniels
Copy link
Contributor

This implements the new object.subset() builtin.

Fixes #4358

Signed-off-by: Charles Daniels charles@styra.com

@charlesdaniels
Copy link
Contributor Author

I have created a simple benchmark in topdown/topdown_bench_test which compares the new builtin against the code @anderseknert wrote as an example in #4358. It looks like this is roughly a 50% speedup. Testing environment was a 2021 14 inch MacBook pro with an Apple M1 Pro and 32GB RAM, running macOS 12.3.1, with Go version 1.17.6 darwin/arm64. The benchmark output is as below:

$ go test -bench BenchmarkObjectSubset ./topdown/
2022/06/07 17:32:25 http: TLS handshake error from 127.0.0.1:61763: remote error: tls: bad certificate
2022/06/07 17:32:25 http: TLS handshake error from 127.0.0.1:61961: remote error: tls: bad certificate
2022/06/07 17:32:25 http: TLS handshake error from 127.0.0.1:61962: remote error: tls: bad certificate
2022/06/07 17:32:25 http: TLS handshake error from 127.0.0.1:61963: remote error: tls: bad certificate
2022/06/07 17:32:25 http: TLS handshake error from 127.0.0.1:61964: remote error: tls: bad certificate
2022/06/07 17:32:25 http: TLS handshake error from 127.0.0.1:61976: remote error: tls: bad certificate
goos: darwin
goarch: arm64
pkg: github.com/open-policy-agent/opa/topdown
BenchmarkObjectSubset/10-8    	   60330	     19301 ns/op
BenchmarkObjectSubset/100-8   	    9763	    123669 ns/op
BenchmarkObjectSubset/1000-8  	     895	   1330247 ns/op
BenchmarkObjectSubset/10000-8 	      64	  18523701 ns/op
BenchmarkObjectSubsetSlow/10-8         	   28956	     43153 ns/op
BenchmarkObjectSubsetSlow/100-8        	    3853	    307680 ns/op
BenchmarkObjectSubsetSlow/1000-8       	     376	   3243462 ns/op
BenchmarkObjectSubsetSlow/10000-8      	      32	  36435461 ns/op
PASS
ok  	github.com/open-policy-agent/opa/topdown	13.568s

The "slow" tests are done using pure Rego, while the other tests use the new builtin.

@charlesdaniels charlesdaniels marked this pull request as draft June 7, 2022 21:36
Copy link
Contributor

@srenatus srenatus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just nitpicks.

I'm working on getting the metadata change related to the last release fixed here, #4753, so if you could rebase, I'd appreciate that. Running make generate should be enough to have only the needed sections added.

ast/builtins.go Outdated Show resolved Hide resolved
topdown/subset.go Outdated Show resolved Hide resolved
@charlesdaniels
Copy link
Contributor Author

charlesdaniels commented Jun 8, 2022

In bbb127862c8b6a8fb32303a55c2fef5531b43a13, I have spent some time with the profiler and have made some low hanging fruit optimizations. I am reasonably confident that this doesn't leave performance on the floor, except for array subset using an O(n^2) naive algorithm rather than Boyer-Moore or something similar.

The new speedup over pure Rego is 2.77x, compared to 1.97x with the initial version of this PR.

@charlesdaniels charlesdaniels marked this pull request as ready for review June 8, 2022 18:34
srenatus
srenatus previously approved these changes Jun 9, 2022
Copy link
Contributor

@srenatus srenatus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bearing with me. The tiniest nitpicks only 🙃 👇

topdown/subset.go Outdated Show resolved Hide resolved
topdown/subset.go Outdated Show resolved Hide resolved
topdown/subset.go Show resolved Hide resolved
srenatus
srenatus previously approved these changes Jun 9, 2022
Copy link
Contributor

@srenatus srenatus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! LGTM

This implements the new object.subset() builtin.

Based on my benchmarking, this offers a 2.77x speedup compared to
implementing the same thing in pure Rego, and is also easier to read.

Fixes open-policy-agent#4358

Signed-off-by: Charles Daniels <charles@styra.com>
@charlesdaniels
Copy link
Contributor Author

No changes, just need rubber stamp after rebasing.

@ashutosh-narkar ashutosh-narkar merged commit f2fd8f5 into open-policy-agent:main Jun 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide built-in function (or other mean) to check if collection x includes collection y
4 participants