Add bulk prefix and suffix string collection matching. #4997
Having written a few "starts with any" functions in Rego in the past, this is a welcome addition 👍 Will sleep on the naming question 😄
@anderseknert Please take another look. I've refactored, added tests, and added benchmarks showing the improvement. We've also done an internal poll to choose a slightly better name which should be more intuitive than the original proposal (though it's still not a strong opinion, happy to change it if you have a better idea for it).
Looks like some good progress has been made here! 😃 I'm a little confused by the current names though -
This would not only solve the woes around naming :P but might be a logical place for this functionality? I don't know for certain yet, but I'd love to hear what others think of that. Only one concern about the implementation: the unit tests would be better if moved into the "YAML suite" of tests found here. These tests have the benefit of testing not just the Go implementation of OPA, but any implementation, like Wasm, IR implementations, and so on. Instructions for how to run individual tests may be found in the top-level comment of this file.
This crossed my mind as well and I like the idea. My only issue with it is that this would probably also require porting this functionality to those other implementations (like Wasm), requiring much more work. That said, if porting to Wasm would indeed be required in such a case, I would propose implementing it there, in the scope of this PR, in the naive way, without Patricia trees. Would that be alright? So overall +1.
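For readers following along, the "naive way" mentioned above can be sketched roughly as below. This is only an illustration in Go, not the code from this PR: the function names `startsWithAny` and `endsWithAny` are hypothetical, and the sketch simply compares every candidate string against every prefix (or suffix) instead of building a Patricia tree.

```go
package main

import (
	"fmt"
	"strings"
)

// startsWithAny (hypothetical name) reports whether any string in search
// starts with any string in prefixes. This is the naive nested-loop
// approach: O(len(search) * len(prefixes)) comparisons, no trie.
func startsWithAny(search, prefixes []string) bool {
	for _, s := range search {
		for _, p := range prefixes {
			if strings.HasPrefix(s, p) {
				return true
			}
		}
	}
	return false
}

// endsWithAny (hypothetical name) is the suffix counterpart.
func endsWithAny(search, suffixes []string) bool {
	for _, s := range search {
		for _, x := range suffixes {
			if strings.HasSuffix(s, x) {
				return true
			}
		}
	}
	return false
}

func main() {
	paths := []string{"/etc/passwd", "/home/alice/notes.txt"}
	fmt.Println(startsWithAny(paths, []string{"/home/", "/srv/"})) // true
	fmt.Println(endsWithAny(paths, []string{".yaml", ".json"}))    // false
}
```

The nested loop is easy to port to another runtime, which is the appeal for a Wasm fallback; the trade-off is that its cost grows with the product of the two collection sizes.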
Looking great, Jakub 👍 @srenatus will be back next week I think, and I'd appreciate his thoughts on Wasm here. Might also be good to have someone else from the OPA team chip in wrt reusing the startswith/endswith builtins vs. adding new ones. All clear from me though :)
This PR should now be ready. Sounds good @anderseknert! I'd appreciate you finding more people to chime in then and I'll be looking forward to @srenatus's review. (Hey @srenatus btw! Long time no see)
Great contribution (Hi Jakub! 👋). I was about to write that this would be fine without the Wasm implementation but it's already there! Awesome. I'll have a look right away.
Ok, read through this. It's a nice contribution, thank you. And of course, if you overload startswith/endswith, you do need to provide a Wasm implementation. I think there's another possibility: keep startswith/endswith as-is, but introduce a new builtin that works with strings, or collections of strings, but has a name that's less confusing. WDYT?
Thank you! Some inline comments 🙃 👇
@srenatus that was actually what the PR originally did, see #4997 (comment) 😬 reverting that should be simple enough though I suppose..
@srenatus Thanks for the review! I'll hold off with the refactoring until we reach a consensus regarding whether this should be a new function or added to the existing one. As @anderseknert mentioned, the original PR was with a separate function. Naming it was quite hard though ( Overall, I think overloading |
I think the "any" semantics of the haystack collection are not understandable from the name of the built-in... I mean, it's clear what is likely to happen with the array of prefixes (matching all of them simultaneously is unlikely), but for the other arg, it's not. So I'd be in favour of a new name.
@srenatus Do you think the name |
Also @srenatus, do you think its signature should be ((array|set|string),(array|set|string)) (like now), or ((array|set), (array|set))? The former partly duplicates the standard startswith and the |
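To make the signature question concrete, here is a rough Go sketch of what accepting `(array|set|string)` for an operand could look like. It is an assumption-laden illustration, not OPA's type system or the PR's code: `toStrings` is a hypothetical helper, a Go slice stands in for a Rego array, and a `map[string]struct{}` stands in for a Rego set. A plain string is treated as a one-element collection, and anything else is rejected with an error, similar in spirit to an eval-time type check failure.

```go
package main

import (
	"fmt"
	"sort"
)

// toStrings (hypothetical helper) normalizes an operand that may be a
// single string, a slice of strings, or a set of strings into a slice.
func toStrings(v interface{}) ([]string, error) {
	switch x := v.(type) {
	case string:
		// A plain string behaves like a collection with one element.
		return []string{x}, nil
	case []string:
		return x, nil
	case map[string]struct{}:
		out := make([]string, 0, len(x))
		for s := range x {
			out = append(out, s)
		}
		sort.Strings(out) // deterministic order for the illustration
		return out, nil
	default:
		return nil, fmt.Errorf("operand must be a string, array, or set of strings, got %T", v)
	}
}

func main() {
	one, _ := toStrings("/home/")
	many, _ := toStrings([]string{"/home/", "/srv/"})
	_, err := toStrings(42)
	fmt.Println(len(one), len(many), err != nil)
}
```

With a normalization step like this, the string/string case falls out of the collection case for free, which is one argument for keeping the wider signature even though it overlaps with the standard startswith.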
startswith_crossproduct? 🤔 But let's get more opinions here before taking steps. With crossproduct, string/string args would still fit, I think. |
@srenatus I don't think that will be readable at all for people without an academic background (it's not a strong opinion though, the name is not too important of a matter to me). But yeah, let's get more opinions. I'd appreciate you finding more people to chime in.
I've been discussing this with @anderseknert today, and we've come up with Also, it's namespaced, as new builtins should be. @ashutosh-narkar, @philipaconrad what do you think about all that? 👀 |
@srenatus That sounds like the best to me so far. Would that still provide overloads for non-collections (so plain strings) in both arguments? I'd prefer that.
@cube2222 yes, let's do that. It's a little silly to have two built-ins provide the same functionality, but silly ain't so bad :)
(I suppose we could deprecate |
Btw when we go back to the "new builtin" code path, let's ensure we don't use the deprecated |
LGTM. Thanks for bearing with me back and forth on this one. It's a great first contribution!
I think icing on the cake would be test cases asserting eval time type check failures. But it's ok without. WDYT, @anderseknert?
@srenatus Adding them is not a problem, I'm still not completely done here.
Ok @srenatus @anderseknert please take another look.
@anderseknert Is there anything else I need to do before this can be merged?
@cube2222 nothing I can think of except for rebasing on top of main, and squashing your commits :)
…nd suffix matching. Signed-off-by: Jakub Martin <kubam@spacelift.io>
Great contribution! Thanks @cube2222 👍
Great, thanks for merging! And thanks a lot for working closely with me on this, providing quick feedback, and getting this merged! Cheers @anderseknert @srenatus
@cube2222 we've noticed that as it stands, there's no mention of this being better (performance-wise) than using the naive |
Will add tests after discussing further in the issue.
Fixes #4994