Test unprivileged mode on Openshift #272

Open
adwk67 opened this issue Apr 25, 2023 · 4 comments

Comments

@adwk67
Member

adwk67 commented Apr 25, 2023

No description provided.

@shalberd

shalberd commented Aug 9, 2023

https://docs.stackable.tech/home/stable/secret-operator/

@adwk67 A brief question in general: Why is the secrets operator, i.e. to provision and inject secrets into pods, even a part of the Openshift OLM bundle or a requirement for individual Stackable components such as Airflow?

https://github.com/stackabletech/airflow-operator/blob/main/deploy/olm/23.4.0/metadata/dependencies.yaml

I mean, both under Kubernetes as well as Openshift, Secrets can be used in Pods without any issue, so what is the added value of secret operator really?

https://kubernetes.io/docs/concepts/configuration/secret/#using-a-secret

@adwk67
Member Author

adwk67 commented Aug 9, 2023

Thanks for the question @shalberd! The SecretClasses are indeed a bit tricky to understand, but they are
also very powerful and do more than simply load secrets (our docs may need to make this a little clearer!).

It is probably easiest to understand with an example: the Spark job in our Datalake demo that accesses S3.

The definition of the job is here (I have linked the line where the S3 is referenced). And the referenced S3 is then defined here

The flow for the whole thing is roughly like this:

The SecretClass tells the secret operator where it should get credentials from. A SecretClass can have different
backends (https://docs.stackable.tech/home/stable/secret-operator/secretclass.html#backend). At the moment there are k8sSearch, autoTls and kerberosKeytab.

autoTls issues TLS certificates, kerberosKeytab sets up Kerberos principals in AD and fetches keytabs, and k8sSearch looks up Secrets in Kubernetes and fetches data from them (e.g. a username/password, or certificates you generated in advance).

Technically the secret operator is a CSI driver, which means the containers get volumes mounted which are filled by the secret operator - and what they are filled with depends on the backend (keytab, certificate, text files with username, ...).

That means what the snippets above do is:

  • store username and password in a secret
  • define a secretclass that points to this secret
  • start a Spark job whose container has a volume mounted on it that references this SecretClass

When the job starts, the secret operator looks at the SecretClass, sees which Secret it should use, reads the contents of
that Secret and writes them into the volume mounted into the container. From there the Spark job can use them.
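
The three steps above can be sketched as a small toy model in Python. This is only an illustration of the flow, not the secret operator's actual code: the `Secret` and `SecretClass` classes, the `populate_volume` function, the label key `example/secret-class` and the demo credentials are all hypothetical, and matching a Secret to a SecretClass by label is an assumption made for the sketch.

```python
from dataclasses import dataclass

# Hypothetical label key this toy model uses to tie a Secret to a SecretClass.
CLASS_LABEL = "example/secret-class"


@dataclass
class Secret:
    name: str
    labels: dict
    data: dict


@dataclass
class SecretClass:
    name: str
    backend: str  # this sketch only models a "k8sSearch"-style lookup


def populate_volume(secret_class, secrets):
    """Return the files (name -> content) that would be written to the volume."""
    if secret_class.backend != "k8sSearch":
        raise NotImplementedError(f"backend {secret_class.backend!r} is not modelled")
    for secret in secrets:
        if secret.labels.get(CLASS_LABEL) == secret_class.name:
            return dict(secret.data)
    raise LookupError(f"no Secret found for SecretClass {secret_class.name!r}")


# Step 1: store username and password in a Secret.
s3_secret = Secret(
    name="s3-credentials",
    labels={CLASS_LABEL: "s3-credentials-class"},
    data={"accessKey": "demo-user", "secretKey": "demo-password"},
)
# Step 2: define a SecretClass that points to this Secret.
s3_class = SecretClass(name="s3-credentials-class", backend="k8sSearch")
# Step 3: the job's volume references the SecretClass and is filled at start-up.
volume_files = populate_volume(s3_class, [s3_secret])
print(sorted(volume_files))
```

The point of the indirection is that the job only names a SecretClass; which Secret is used, and which backend produces the files, is decided when the volume is filled.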

@shalberd

shalberd commented Aug 9, 2023

I see, thank you. Yes, that is indeed more than just mounting Kubernetes or Openshift Secrets into containers :-)

Is the secret operator used by the Airflow operator for anything besides LDAP integration?

https://github.com/stackabletech/airflow-operator/tree/main/examples

It is good you are looking at CSI and non-root mode for Openshift. As mentioned elsewhere, great work overall, there are a bunch of Red Hat Open Data Hub (basically Kubeflow on Openshift) folks that like your work a lot, too. Red Hat has delegated Airflow and Spark as non-core efforts in their Open Data Hub project. My current main context is Open Data Hub Operator on Openshift, but I am also into Airflow for workload orchestration. Red Hat recently took over Elyra from IBM, which can use Airflow as a runtime, or Kubeflow / Tekton Pipelines.

Getting Airflow 2.x running on Openshift is not an easy feat via community helm charts, so what you have accomplished deserves recognition.

@nightkr
Member

nightkr commented Aug 10, 2023

To expand on what @adwk67 said, one big reason for the secret operator is that we often need to select the secrets dynamically based on different conditions, like using the correct TLS certificate for the IP address of the node that the Pod is running on.

Vanilla K8s can't do that, since the volumes have to be identical across replicas (with a few exceptions) and become immutable well before the Pod has been scheduled.
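
A rough way to picture this, as a toy Python sketch rather than anything the operator really does: every replica carries the same volume definition, but the content is only decided at mount time, once the node is known. The `Pod` class, the `schedule` and `mount_tls_volume` functions, the file name `tls.crt`, the class name `tls` and the IP addresses are all hypothetical, and a placeholder string stands in for a real certificate.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    volume_class: str   # identical for every replica, fixed in the template
    node_ip: str = ""   # only known once the Pod has been scheduled


def schedule(pod, node_ip):
    """Toy scheduler: record which node the Pod landed on."""
    pod.node_ip = node_ip


def mount_tls_volume(pod):
    """Toy stand-in for issuing a certificate when the volume is mounted."""
    if not pod.node_ip:
        raise RuntimeError("Pod is not scheduled yet, node IP unknown")
    # Placeholder text instead of a real certificate.
    return {"tls.crt": f"certificate for {pod.name}, valid for IP {pod.node_ip}"}


# Two replicas created from the same template: same volume definition.
replica_a = Pod(name="web-0", volume_class="tls")
replica_b = Pod(name="web-1", volume_class="tls")

schedule(replica_a, "10.0.0.1")
schedule(replica_b, "10.0.0.2")

files_a = mount_tls_volume(replica_a)
files_b = mount_tls_volume(replica_b)
print(files_a["tls.crt"])
print(files_b["tls.crt"])
```

Both replicas declare the same thing, yet each ends up with content that matches the node it runs on, which a statically defined Secret volume cannot express.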
