Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

union() is slower than pure Rego, can be up to O(n^3) #4979

Closed
charlesdaniels opened this issue Aug 8, 2022 · 1 comment · Fixed by #4980
Closed

union() is slower than pure Rego, can be up to O(n^3) #4979

charlesdaniels opened this issue Aug 8, 2022 · 1 comment · Fixed by #4980
Assignees
Labels

Comments

@charlesdaniels
Copy link
Contributor

charlesdaniels commented Aug 8, 2022

union() is very slow in Rego because of how it is implemented currently.

Consider the implementation of the union builtin:

// builtinSetUnion returns the union of the given input sets
func builtinSetUnion(a ast.Value) (ast.Value, error) {

	inputSet, err := builtins.SetOperand(a, 1)
	if err != nil {
		return nil, err
	}

	result := ast.NewSet()

	err = inputSet.Iter(func(x *ast.Term) error {
		n, err := builtins.SetOperand(x.Value, 1)
		if err != nil {
			return err
		}
		result = result.Union(n)
		return nil
	})
	return result, err
}

And also ast.set.Union():

// Union returns the set containing all elements of s and other.
func (s *set) Union(other Set) Set {
	r := NewSet()
	s.Foreach(func(x *Term) {
		r.Add(x)
	})
	other.Foreach(func(x *Term) {
		r.Add(x)
	})
	return r
}

If we have $n$ many sets to union, of size $k$ each, then that's going to result in $n$ many calls to ast.set.Union(), except that each call to ast.set.Union() is going to end up creating a whole new set and copying all of the existing elements over to it.

This means that each call to ast.set.Union() takes increasingly more time, specifically $O(k), O(2k), O(3k), \dots$.
This means the running time is O(k * nth triangle number) = $O\left( \frac{(n+1)!}{2!(n-2)!} \cdot k\right) = O\left(\frac{1}{2} k n (n^2-1)\right) \approx O(kn^3)$.

Empirical Benchmarks

With the union() builtin:

$ cat slowunion/slowunion.rego
package slowunion

SIZE := 250
NSETS := 250

nums := numbers.range(0, NSETS)
sizes := numbers.range(0, SIZE)

sets[n] = x {
	nums[n]
	x := {sprintf("%d,%d", [n, i]) | sizes[i]}
}

combined := union({s | s := sets[_]})
$ time opa eval --profile --format pretty --bundle ./slowunion/ --import "data.slowunion" 'count(data.slowunion.combined)'
63001
+------------------------------+------------+
|            METRIC            |   VALUE    |
+------------------------------+------------+
| timer_rego_load_bundles_ns   | 583584     |
| timer_rego_module_compile_ns | 730417     |
| timer_rego_module_parse_ns   | 162458     |
| timer_rego_query_compile_ns  | 52125      |
| timer_rego_query_eval_ns     | 2673904333 |
| timer_rego_query_parse_ns    | 29625      |
+------------------------------+------------+
+--------------+----------+----------+-----------------------------+
|     TIME     | NUM EVAL | NUM REDO |          LOCATION           |
+--------------+----------+----------+-----------------------------+
| 2.543266402s | 4        | 254      | slowunion/slowunion.rego:14 |
| 115.799818ms | 63503    | 126253   | slowunion/slowunion.rego:11 |
| 3.801167ms   | 2        | 2        | data.slowunion.combined     |
| 152.332µs    | 3        | 3        | slowunion/slowunion.rego:6  |
| 119.959µs    | 1        | 251      | slowunion/slowunion.rego:10 |
| 95.999µs     | 3        | 3        | slowunion/slowunion.rego:7  |
| 4.624µs      | 1        | 1        | slowunion/slowunion.rego:4  |
| 2.125µs      | 1        | 1        | slowunion/slowunion.rego:3  |
+--------------+----------+----------+-----------------------------+

real	0m2.699s
user	0m3.573s
sys	0m0.137s

With pure Rego instead:

$ cat slowunion/slowunion.rego
package slowunion

SIZE := 250
NSETS := 250

nums := numbers.range(0, NSETS)
sizes := numbers.range(0, SIZE)

sets[n] = x {
	nums[n]
	x := {sprintf("%d,%d", [n, i]) | sizes[i]}
}

#combined := union({s | s := sets[_]})
combined := {t | s := sets[_]; s[t]}
$ time opa eval --profile --format pretty --bundle ./slowunion/ --import "data.slowunion" 'count(data.slowunion.combined)'
63001
+------------------------------+-----------+
|            METRIC            |   VALUE   |
+------------------------------+-----------+
| timer_rego_load_bundles_ns   | 538083    |
| timer_rego_module_compile_ns | 725375    |
| timer_rego_module_parse_ns   | 176458    |
| timer_rego_query_compile_ns  | 49208     |
| timer_rego_query_eval_ns     | 327751125 |
| timer_rego_query_parse_ns    | 31375     |
+------------------------------+-----------+
+--------------+----------+----------+-----------------------------+
|     TIME     | NUM EVAL | NUM REDO |          LOCATION           |
+--------------+----------+----------+-----------------------------+
| 195.567117ms | 254      | 63254    | slowunion/slowunion.rego:15 |
| 114.836799ms | 63503    | 126253   | slowunion/slowunion.rego:11 |
| 2.148999ms   | 2        | 2        | data.slowunion.combined     |
| 160.375µs    | 3        | 3        | slowunion/slowunion.rego:6  |
| 117.471µs    | 1        | 251      | slowunion/slowunion.rego:10 |
| 110.126µs    | 3        | 3        | slowunion/slowunion.rego:7  |
| 10.416µs     | 1        | 1        | slowunion/slowunion.rego:4  |
| 2.084µs      | 1        | 1        | slowunion/slowunion.rego:3  |
+--------------+----------+----------+-----------------------------+

real	0m0.354s
user	0m0.426s
sys	0m0.029s

System info:

$ uname -a
Darwin styra-macbook.local 21.5.0 Darwin Kernel Version 21.5.0: Tue Apr 26 21:08:37 PDT 2022; root:xnu-8020.121.3~4/RELEASE_ARM64_T6000 arm64
$ system_profiler SPSoftwareDataType SPHardwareDataType

Software:

    System Software Overview:

      System Version: macOS 12.4 (21F79)
      Kernel Version: Darwin 21.5.0
      Boot Volume: Macintosh HD
      Boot Mode: Normal
      Computer Name: styra-macbook
      User Name: Charles Daniels (cad)
      Secure Virtual Memory: Enabled
      System Integrity Protection: Enabled
      Time since boot: 5 days 15:17

Hardware:

    Hardware Overview:

      Model Name: MacBook Pro
      Model Identifier: MacBookPro18,3
      Chip: Apple M1 Pro
      Total Number of Cores: 8 (6 performance and 2 efficiency)
      Memory: 32 GB
      System Firmware Version: 7459.121.3
      OS Loader Version: 7459.121.3
      Serial Number (system): REDACTED
      Hardware UUID: REDACTED
      Provisioning UDID: REDACTED
      Activation Lock Status: Disabled

$ opa version
Version: 0.43.0
Build Commit: d75bbdd
Build Timestamp: 2022-07-29T21:15:42Z
Build Hostname: Mac-1659129125033.local
Go Version: go1.18.4
Platform: darwin/arm64
WebAssembly: unavailable
@philipaconrad philipaconrad self-assigned this Aug 8, 2022
@philipaconrad
Copy link
Contributor

@charlesdaniels Thank you for reporting this, and for the detailed analysis/example code. 😄

I think for now the easiest way to boost performance for this builtin would be to inline a chunk of the set unioning logic into the builtin's definition. That could get rid of a whole level of wasted allocations, since we'd not be throwing away temporary sets along the way.

philipaconrad added a commit that referenced this issue Aug 11, 2022
This commit fixes a performance regression for the Set `union` builtin, discovered in issue #4979.

The original logic for the builtin did pairwise `Set.Union` calls between the input sets, resulting in many wasted temporary Sets that were almost immediately discarded, and a bunch of duplicated work. The improved builtin inlines the logic from `Set.Union`, so that only one pass is made across the incoming sets' members.

Fixes #4979

Signed-off-by: Philip Conrad <philipaconrad@gmail.com>
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Sep 13, 2022
Changes:
## 0.44.0

This release contains a number of fixes, two new builtins, a few new features,
and several performance improvements.

### Security Fixes

This release includes the security fixes present in the recent v0.43.1 release,
which mitigate CVE-2022-36085.

See the Release Notes for v0.43.1 for more details.

### Set Element Addition Optimization

Rego Set element addition operations did not scale linearly in the
past, and like the Object type before v0.43.0, experienced noticeable
reallocation/memory movement overheads once the Set grew past 120k-150k
elements in size.

This release introduces different handling of Set internals during element
addition operations to avoid pathological reallocation behavior, and allows
linear performance scaling up into the 500k key range and beyond.

### Set `union` Built-in Optimization

The Set `union` builtin allows applying the union operation to a set of sets.

However, as discovered in
<open-policy-agent/opa#4979>, its
implementation generated unnecessary intermediate copies, which
resulted in poor performance; in many cases, worse than writing the
equivalent operation in pure Rego.

This release improves the `union` builtin's implementation, such that
only the final result set is ever modified, reducing memory allocations
and GC pressure.  The `union` builtin is now about 15-30% faster than
the equivalent operation in pure Rego.

### New Built-in Functions: `strings.any_prefix_match` and `strings.any_suffix_match`

This release introduces two new builtins, optimized for bulk matching of string
prefixes and suffixes: `strings.any_prefix_match`, and
`strings.any_suffix_match`.
It works with sets and arrays of strings, allowing efficient matching of
collections of prefixes or suffixes against a target string.

See the built-in functions docs for all the details:

 <https://www.openpolicyagent.org/docs/v0.42.0/policy-reference/#builtin-strings-stringsany_prefix_match>


## 0.43.1

This is a security release fixing the following vulnerabilities:

- CVE-2022-36085: Respect unsafeBuiltinMap for 'with' replacements in the compiler

  See <GHSA-f524-rf33-2jjr>
  for all details.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants