Add support for Open Policy Agent #19532

vagaerg · 2023-10-25T16:49:08Z

Description

This PR supersedes #17940, which was getting hard to follow due to the large number of comments.

Additional context and related issues

Docs PR by @mosabua is #20246

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(X) Release notes are required, with the following suggested text:

## Security
* Support access control with Open Policy Agent (OPA)

And make it link to the docs

vagaerg · 2023-10-25T17:07:43Z

From the old PR, I believe these are the only items that remain unresolved:

Grants:

#17940 (comment)

I am happy to remove this and implement a default deny if @dain or others think there is no value to supporting this in OPA.

IMHO OPA could provide a simple central point to enforce global policies as to what users can grant permissions on what catalogs, while deferring more granular permissioning to the connector access control logic for each catalog if needed.

Given that the meaning of a GRANT statement varies greatly across different connectors, users would need to write logic that depends on what catalog is being operated upon, but there is no reason they cannot do this with some logic in OPA.

Allowing OPA to answer an authorization question for a grant query would still let users implement their more specific permissioning logic at the connector access control level if needed, whereas denying here would immediately deny any query. It also lets users decide whether they want to globally allow or deny this style of queries.

I don't have lots of experience with this specific part of Trino, but the rest of the plugin is written with the goal of deferring all authorization requests to OPA such that users can plug in any policy they want without changing any Trino logic; and I think this same goal could be beneficial here.

Fields in the request context

#17940 (comment)

I removed enabledRoles and catalogRoles. For now I have kept extraCredentials, pending discussion on whether there is a cleaner way to pass in additional credentials. My original response:

We have played around (just for demos, nothing we have actually deployed) with the idea of using extra credentials to pass in JWTs that grant users extra permissions that are validated via OPA. Do you know what would be a better alternative to do this without using extraCredentials?

Including the Trino version number in the request & documenting fields in the request

From discussion on slack with @dain
I would like to add the Trino version to the OpaQueryContext object.

How would I be able to obtain the Trino version programmatically? The only way I've seen is from the NodeManager, but that is not passed into the Access Control factory.

This item is pending

Cleaning up `OpaQueryInputResource` builders

#17940 (comment)

I would prefer to defer this for now if @dain is OK with that

Follow-up PR

We have 3 additional contributions we would like to make, happy to open PRs after this is merged:

Addressing the OpaQueryInputResource building mentioned by @dain
Implementing row level filtering and column masking
Adding more options for authenticating against the OPA endpoint: OPA can be configured to require clients authenticate against it with basic credentials or mTLS (docs)

dain · 2023-10-31T06:58:31Z

For extraCredentials, my main concern is passing through all extra credentials. My suggestion is that we leave them out for now, but if we want to add them in the future, we add a configuration property for which ones to include.
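
A minimal sketch of the allow-list idea suggested above, assuming a hypothetical set of configured credential names. The class and method names are invented for illustration; this PR defines no such configuration property and the plugin contains no such helper.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not part of the plugin): forwards only the extra
// credentials whose names appear in a configured allow-list.
final class ExtraCredentialsFilter
{
    private ExtraCredentialsFilter() {}

    static Map<String, String> filter(Map<String, String> extraCredentials, Set<String> allowedNames)
    {
        Map<String, String> included = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : extraCredentials.entrySet()) {
            // Anything not explicitly allowed is dropped before the request is built
            if (allowedNames.contains(entry.getKey())) {
                included.put(entry.getKey(), entry.getValue());
            }
        }
        return included;
    }
}
```

With an empty allow-list nothing is forwarded, which matches the "leave them out for now" default.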

For version, I put up a PR to add it to SystemAccessControlContext #19585

vagaerg · 2023-10-31T10:28:09Z

For extraCredentials, my main concern is passing through all extra credentials. My suggestion is that we leave them out for now, but if we want to add them in the future, we add a configuration property for which ones to include.

Sounds good, will remove them for now

For version, I put up a PR to add it to SystemAccessControlContext #19585

That would be great! The PR looks good, thank you!

plugin/trino-opa/src/main/java/io/trino/plugin/opa/OpaHttpClient.java

dain

Some comments

plugin/trino-opa/README.md

plugin/trino-opa/src/main/java/io/trino/plugin/opa/OpaAccessControl.java

plugin/trino-opa/src/test/java/io/trino/plugin/opa/OpaBatchAccessControlFilteringUnitTest.java

plugin/trino-opa/src/test/java/io/trino/plugin/opa/OpaAccessControlFilteringUnitTest.java

plugin/trino-opa/src/test/java/io/trino/plugin/opa/HttpClientUtils.java

plugin/trino-opa/src/test/java/io/trino/plugin/opa/OpaAccessControlFilteringUnitTest.java

dain

Some comments

dain · 2023-11-02T22:56:41Z

plugin/trino-opa/src/test/java/io/trino/plugin/opa/RequestTestUtilities.java

+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.fail;
+
+public class RequestTestUtilities


final for utility classes
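
As a sketch of what this convention usually looks like in Java: the class is declared final and given a private constructor so it can be neither subclassed nor instantiated. The class name and method below are hypothetical and do not reflect the contents of RequestTestUtilities.

```java
// Hypothetical utility class illustrating the convention; not the PR's code.
final class ExampleTestUtilities
{
    // Private constructor: the class only hosts static helpers
    private ExampleTestUtilities() {}

    static String describe(String operation, boolean allowed)
    {
        return operation + (allowed ? " allowed" : " denied");
    }
}
```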

plugin/trino-opa/src/test/java/io/trino/plugin/opa/RequestTestUtilities.java

plugin/trino-opa/src/test/java/io/trino/plugin/opa/ResponseTest.java

dain

Looking good

dain · 2023-11-13T20:29:20Z

plugin/trino-opa/src/test/java/io/trino/plugin/opa/OpaAccessControlFilteringUnitTest.java

+    @ParameterizedTest(name = "{index}: {0}")
+    @MethodSource("io.trino.plugin.opa.FilteringTestHelpers#emptyInputTestCases")
+    public void testEmptyRequests(
+            BiFunction<OpaAccessControl, SystemSecurityContext, Collection> callable)
+    {


We have been removing parameterized tests throughout the code base. They are difficult to understand and debug because the test cases are separated from the test code, and they are often over-abstracted into generic functions like this, so a reader cannot know what is actually being tested. Please just inline the cases.

dain · 2023-11-13T20:30:17Z

plugin/trino-opa/src/test/java/io/trino/plugin/opa/OpaAccessControlFilteringUnitTest.java

+    @ParameterizedTest(name = "{index}: {0}")
+    @MethodSource("io.trino.plugin.opa.FilteringTestHelpers#emptyInputTestCases")
+    public void testEmptyRequests(
+            BiFunction<OpaAccessControl, SystemSecurityContext, Collection> callable)


The Collection type here is a raw type, which disables generic type checking in Java. Add something like <?>, or even better, set the actual type.
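
A small sketch of the difference, using String arguments in place of OpaAccessControl and SystemSecurityContext so that it compiles on its own. All names here are hypothetical.

```java
import java.util.Collection;
import java.util.function.BiFunction;

final class RawTypeExample
{
    private RawTypeExample() {}

    // Raw Collection: compiles only with a rawtypes warning, and elements are plain Object
    @SuppressWarnings("rawtypes")
    static int sizeRaw(BiFunction<String, String, Collection> callable)
    {
        return callable.apply("user", "catalog").size();
    }

    // Wildcard: the element type is unknown, but generic checks stay enabled
    static int sizeWildcard(BiFunction<String, String, Collection<?>> callable)
    {
        return callable.apply("user", "catalog").size();
    }

    // Actual type: elements can be used as String without a cast
    static String firstTyped(BiFunction<String, String, Collection<String>> callable)
    {
        return callable.apply("user", "catalog").iterator().next();
    }
}
```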

plugin/trino-opa/src/test/java/io/trino/plugin/opa/OpaAccessControlSystemTest.java

plugin/trino-opa/src/test/java/io/trino/plugin/opa/OpaAccessControlUnitTest.java

dain · 2023-11-13T20:44:27Z

plugin/trino-opa/src/test/java/io/trino/plugin/opa/OpaAccessControlUnitTest.java

+    @ParameterizedTest(name = "{index}: {0}")
+    @MethodSource("io.trino.plugin.opa.OpaAccessControlUnitTest#noResourceActionTestCases")


as with the other tests, remove use of @ParameterizedTest

dain · 2023-11-13T20:44:43Z

plugin/trino-opa/src/test/java/io/trino/plugin/opa/OpaAccessControlUnitTest.java

+
+    private static Stream<Arguments> tableWithPropertiesTestCases()
+    {
+        Stream<FunctionalHelpers.Consumer4<OpaAccessControl, SystemSecurityContext, CatalogSchemaTableName, Map>> methods = Stream.of(


Raw Map type

vagaerg · 2023-11-30T17:58:33Z

All comments should be addressed, I've removed parameterization on some tests, particularly the following ones in TestOpaAccessControl:

testNoResourceAction
testFunctionResourceActions
testCanExecuteTableProcedure

The new assertAccessControlMethodBehaviour method will assert that the authorizer method under test allows/denies correctly depending on the response returned by OPA, and that it throws if an illegal response is returned. This method also checks the request being sent to OPA.

If you are happy with the changes made, I will change all the other tests accordingly.

Cerebus · 2023-12-14T21:54:49Z

Late to the party, but how is this performant at scale, particularly with the follow-on PR for row-based access control? It seems to me that a REST call per row in the input is going to really stink.

Why not use the Compile API to pre-fill user and context into the policies, then execute the query with the data locally? E.g., https://blog.openpolicyagent.org/write-policy-in-opa-enforce-policy-in-sql-d9d24db93bf4

vagaerg · 2023-12-14T22:12:34Z

That is a good question. From our experimentation it seems to be working well enough, as we're deploying it alongside each worker. The OPA binary is very simple to deploy and run, so we just co-locate it.

That said, I'd love to hear your suggestion on the compile API; I'm not familiar with it. Would this allow us to compile the Rego and execute it externally to Trino?

The row level filtering I wouldn't expect to be much of an issue, as the actual rows are never sent to OPA. OPA just produces filter expressions that Trino applies locally, kind of like adding an extra WHERE clause.

Thanks!

vagaerg · 2023-12-15T11:23:15Z

Late to the party, but how is this performant at scale, particularly with the follow-on PR for row-based access control? It seems to me that a REST call per row in the input is going to really stink.

Why not use the Compile API to pre-fill user and context into the policies, then execute the query with the data locally? E.g., https://blog.openpolicyagent.org/write-policy-in-opa-enforce-policy-in-sql-d9d24db93bf4

Just saw your updated comment with the link, thanks!
That certainly looks interesting, and perhaps something to consider in the future.

IMHO, at the moment we should keep a simple solution where OPA makes a full, final, decision.
This is mostly because Trino has no more context than what is sent to OPA: We are serializing the entirety of the available information that Trino passes down to the SPI, which isn't that large.
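
To make the "full, final decision" model concrete, here is a minimal sketch of building a request for OPA's Data API (an HTTP POST to `/v1/data/<policy path>` with the decision input under an `input` key). The policy path and the JSON field names are assumptions for illustration and are not taken from this PR. The request is only built, never sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative sketch only; not the plugin's OpaHttpClient.
final class OpaRequestSketch
{
    private OpaRequestSketch() {}

    // Field names are assumed. Values are assumed to need no JSON escaping;
    // a real implementation would use a JSON serializer instead.
    static String buildBody(String user, String operation)
    {
        return String.format(
                "{\"input\": {\"context\": {\"identity\": {\"user\": \"%s\"}}, \"action\": {\"operation\": \"%s\"}}}",
                user,
                operation);
    }

    static HttpRequest buildRequest(URI policyUri, String user, String operation)
    {
        return HttpRequest.newBuilder(policyUri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildBody(user, operation)))
                .build();
    }
}
```

OPA replies with a JSON document whose `result` field holds the policy's value, so the caller only has to interpret that value as allow or deny.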

In fact, Trino doesn't really know all that much about the data either: in the example you linked, the application contacting OPA is aware of what pets are owned by whom and what clinicians are allowed to interact with them. This is not the case in Trino - many of these are coming from business logic that may be entirely external to Trino, and as such may as well be pushed into OPA.

E.g.: the information as to who owns a table is only partially available in Trino - tables may well be created by scripts and thus be owned by fake service accounts.

That said, I think it would be nice in the future to use this compile logic to allow OPA to return expressions that Trino would execute on the data, making the decision internally. The reason I think we should defer that is two-fold:

This is a first cut of an OPA authorizer: fully externalizing the decision making is what is most in line with the OPA design principles, and also what we believe would cover a majority of users
The complexity involved with using the compile API: turning a JSON blob into something that we would execute on top of the data internally is rather tricky, and raises all sorts of security & design questions

As for the row level filtering, note that row level filtering is done within Trino. There is never a case where each row is sent to OPA, because the OPA plugin never even sees the rows.

The Trino SPI expects authorizers to return a list of ViewExpression objects (code) that contain expressions that are applied internally to further refine the results - kind of like adding an extra WHERE clause.
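
The "extra WHERE clause" analogy can be sketched as below. This is purely conceptual: Trino applies such expressions during query analysis and planning rather than by rewriting SQL strings, and the class here is hypothetical and does not use the real ViewExpression type.

```java
import java.util.List;
import java.util.stream.Collectors;

// Conceptual illustration only; Trino does not rewrite query text like this.
final class RowFilterSketch
{
    private RowFilterSketch() {}

    static String applyFilters(String tableName, List<String> filterExpressions)
    {
        String query = "SELECT * FROM " + tableName;
        if (filterExpressions.isEmpty()) {
            // No filters returned by the authorizer: every row is visible
            return query;
        }
        // Every filter must hold, so the expressions are combined with AND
        String predicate = filterExpressions.stream()
                .map(expression -> "(" + expression + ")")
                .collect(Collectors.joining(" AND "));
        return query + " WHERE " + predicate;
    }
}
```

The point of the sketch is that only expressions cross the boundary to the authorizer, while the rows themselves stay inside the engine.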

And, in fact, using row level filtering would bring us one step closer to the system you suggested 😄

vagaerg · 2023-12-21T17:53:52Z

I rebased the PR and bumped the version.
There's a draft PR for row level filtering & column masking here: bloomberg#16. I will bring it over once we get this one merged. That PR also adds some extra system tests.

I will also update the documentation branch (I'm around halfway through with that) and we can merge that after this PR too

findepi · 2023-12-29T17:17:38Z

cc @trinodb/maintainers for new plugin

mosabua · 2023-12-30T00:44:55Z

plugin/trino-opa/src/main/java/io/trino/plugin/opa/OpaConfig.java

+    }
+
+    @Config("opa.allow-permissioning-operations")
+    @ConfigDescription("Whether to allow permissioning operations (GRANT, DENY, ...) as well as role management - OPA will not be queried for any such operations, they will be bulk allowed or denied depending on this setting")


Is this term "permissioning" really what we want to use? Is that some sort of weird terminology used in OPA? If not, I would prefer we use something like
allow-security-operations, allow-sql-security-operations, or allow-sql-security-statements. WDYT?

shohamyamin · 2024-01-14T10:37:01Z

@vagaerg @mosabua any progress with the pr?

soenkeliebau · 2024-01-18T11:54:03Z

I just had to look up the link for something else as well and was wondering about the status :)

Is it maybe worth getting together for two hours one evening and hashing out the remaining open points? I'd volunteer to write minutes and post them here so everything is documented and public..
@vagaerg @mosabua @dain ?

dain

This is good. I have one comment about changing the word "permissioning", and there is a trivial merge conflict in the main pom.

Can you update these two items, squash and push, and I'll merge it?

dain · 2024-01-26T23:14:38Z

plugin/trino-opa/src/main/java/io/trino/plugin/opa/OpaConfig.java

+    }
+
+    @Config("opa.allow-permissioning-operations")
+    @ConfigDescription("Whether to allow permissioning operations (GRANT, DENY, ...) as well as role management - OPA will not be queried for any such operations, they will be bulk allowed or denied depending on this setting")


I agree, the word "permissioning" is strange, and we should rename the Java methods and the config property. I suggest allow-permission-management-operations or allow-security-management-operations. @mosabua for the docs, we should mention that OPA does not have a permissions management backend for Trino, so these operations will fail regardless, meaning there is no code that takes a SQL GRANT statement and creates OPA rules.
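
The behaviour described in the config description quoted above (OPA is not queried and the operations are bulk allowed or denied) can be sketched roughly as follows. The class, field, and method names are hypothetical, and SecurityException stands in for whatever exception the plugin actually throws.

```java
// Hypothetical sketch of a bulk allow/deny switch; not the plugin's code.
final class PermissionManagementSketch
{
    private final boolean allowPermissionManagementOperations;

    PermissionManagementSketch(boolean allowPermissionManagementOperations)
    {
        this.allowPermissionManagementOperations = allowPermissionManagementOperations;
    }

    // Stand-in for a GRANT check: the decision comes from the flag alone,
    // and no request is sent to OPA
    void checkCanGrant(String privilege, String tableName)
    {
        if (!allowPermissionManagementOperations) {
            throw new SecurityException("Access Denied: Cannot grant " + privilege + " on " + tableName);
        }
    }
}
```

Passing this check only clears authorization. As the comment above notes, nothing translates the GRANT statement into OPA rules.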

mosabua · 2024-01-27T00:13:51Z

Sounds good @dain .. once this PR is merged and the details are therefore settled I will update the docs PR to the current status and ask for reviews again from @vagaerg and others.

vagaerg · 2024-01-30T11:29:43Z

I just had to look up the link for something else as well and was wondering about the status :)

Is it maybe worth getting together for two hours one evening and hashing out the remaining open points? I'd volunteer to write minutes and post them here so everything is documented and public.. @vagaerg @mosabua @dain ?

Absolutely, sorry I missed this comment - I am happy to meet up whether for this PR (which will hopefully be merged soon!) or for the follow ups for row level filtering / column masking :)

Thanks a lot for your offer @soenkeliebau !

vagaerg · 2024-01-30T15:45:24Z

The rebase and the renaming of the field are now done, @dain.
I will open a small PR shortly after this is merged to remove all parameterization. I would rather get this one merged soon and follow up on a separate PR if that's fine by you.

Thanks!

sbernauer · 2024-02-01T06:54:50Z

Many many thanks @vagaerg for your perseverance and @dain for merging!

cla-bot bot added the cla-signed label Oct 25, 2023

sbernauer mentioned this pull request Oct 27, 2023

[Tracking] OPA integration 2.0 stackabletech/trino-operator#443

Open

vagaerg force-pushed the add-open-policy-agent branch from 1fb3d6c to 3aec31d Compare October 27, 2023 18:09

vagaerg mentioned this pull request Oct 31, 2023

User impersonation does not always load groups #19598

Open

electrum reviewed Nov 2, 2023

plugin/trino-opa/src/main/java/io/trino/plugin/opa/OpaHttpClient.java

dain reviewed Nov 2, 2023

dain reviewed Nov 13, 2023

vagaerg force-pushed the add-open-policy-agent branch from 3aec31d to ecda42a Compare November 30, 2023 17:47

lfrancke mentioned this pull request Dec 18, 2023

Platform Authentication & Authorization stackabletech/issues#438

Open

vagaerg force-pushed the add-open-policy-agent branch from 0365917 to fd6d65c Compare December 21, 2023 15:47

vagaerg mentioned this pull request Dec 21, 2023

Add data masking and filtering to the OPA plugin bloomberg/trino#16

Closed

sbernauer mentioned this pull request Dec 27, 2023

docs: Add missing OPA rules for Trino batched API stackabletech/trino-operator#517

Merged

maltesander mentioned this pull request Dec 27, 2023

Bumped to latest version of Trino. Adjusted overrides in OpaAuthorizer stackabletech/trino-opa-authorizer#32

Closed

mosabua mentioned this pull request Dec 29, 2023

Add docs for OPA access control #20246

Merged

mosabua reviewed Dec 30, 2023

soenkeliebau mentioned this pull request Jan 18, 2024

Implement Authorizer stackabletech/hdfs-operator#400

Closed

fhennig mentioned this pull request Jan 25, 2024

Write a RegoRule set for Trino (Sebastian already wrote something) stackabletech/issues#500

Closed

dain reviewed Jan 27, 2024

Implement OPA and address code review comments

54873d2

vagaerg force-pushed the add-open-policy-agent branch from fd6d65c to 54873d2 Compare January 30, 2024 15:44

dain merged commit a9253bb into trinodb:master Jan 31, 2024
91 of 92 checks passed

github-actions bot added this to the 438 milestone Jan 31, 2024

mosabua mentioned this pull request Feb 1, 2024

Add Trino 438 release notes #20518

Merged

vagaerg mentioned this pull request Feb 2, 2024

Remove parameterization on trino-opa tests #20556

Merged

Praveen2112 mentioned this pull request Feb 23, 2024

Support automatic type coercion in Delta table creation #20814

Merged

vagaerg mentioned this pull request Mar 4, 2024

OPA: Implement row level filtering and column masking #20921

Merged

ebyhr mentioned this pull request Apr 12, 2024

trino-opa-plugin #9787

Closed
