
[WIP] S3 auth type kubernetes secret use aws envs in elyra cos jupyterlab #3299


Draft
wants to merge 8 commits into
base: main

Conversation

shalberd
Contributor

@shalberd shalberd commented Mar 19, 2025

fixes #3298

What changes were proposed in this pull request?

The Elyra runtime config currently stores cos-username and cos-password verbatim and uses them for communicating with S3-compatible storage.
There is a Kubernetes secret option in the runtime GUI,
auth_type KUBERNETES_SECRET,
to specify a K8s secret to use in the target runtime (KFP or Airflow), but even with that setting, username and password are stored and used verbatim from within the workbench / Jupyter environment itself, i.e. when the Elyra extension communicates with S3.
I have changed cos-username and cos-password from being stored in the config and passed as arguments to, in all places, always coming from the standard env vars
AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
I have also started modifying the runtime bootstrapper.py and all the tests.
We will see ... I would definitely appreciate input on this.
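
A rough sketch of the intent (hypothetical helper name, not the exact code in this PR): the JupyterLab-side COS client reads the standard AWS env vars instead of the cos-username / cos-password config fields.

```python
import os


def resolve_cos_credentials():
    """Hypothetical sketch: resolve object-storage credentials from the
    standard AWS environment variables instead of the runtime-config
    cos-username / cos-password fields."""
    access_key = os.environ.get("AWS_ACCESS_KEY_ID")
    secret_key = os.environ.get("AWS_SECRET_ACCESS_KEY")
    if not access_key or not secret_key:
        raise RuntimeError(
            "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set in the "
            "workbench environment"
        )
    return access_key, secret_key
```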

How was this pull request tested?

not tested yet
using the existing pytest tests

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

…ata config, using env vars directly instead in Elyra

Signed-off-by: shalberd <[email protected]>
Signed-off-by: shalberd <[email protected]>
…cos-user and cos-password arguments

Signed-off-by: shalberd <[email protected]>
@shalberd shalberd marked this pull request as draft March 19, 2025 19:37

codecov bot commented Mar 19, 2025

Codecov Report

Attention: Patch coverage is 40.00000% with 6 lines in your changes missing coverage. Please review.

Please upload report for BASE (main@21bfcd7). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| elyra/util/cos.py | 0.00% | 5 Missing ⚠️ |
| elyra/pipeline/airflow/airflow_metadata.py | 50.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3299   +/-   ##
=======================================
  Coverage        ?   83.74%           
=======================================
  Files           ?      158           
  Lines           ?    19939           
  Branches        ?      505           
=======================================
  Hits            ?    16698           
  Misses          ?     3057           
  Partials        ?      184           


@shalberd
Contributor Author

shalberd commented Mar 20, 2025

I am now using the AWS S3 env vars in Elyra itself if, and only if, the KUBERNETES_SECRET auth_type is used.
I also no longer require the cos username and cos password fields before saving the runtime config when auth_type is KUBERNETES_SECRET; instead, in that case I require that the AWS S3 env vars be present.
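
Roughly, the runtime-config save validation now behaves like this sketch (hypothetical function and parameter names, not the exact implementation in this PR):

```python
import os


def validate_cos_auth(auth_type, cos_username=None, cos_password=None):
    """Hypothetical sketch of the save-time validation described above."""
    if auth_type == "KUBERNETES_SECRET":
        # The cos username / password fields may be left empty; the AWS env
        # vars must be present in the workbench environment instead.
        missing = [
            name
            for name in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
            if not os.environ.get(name)
        ]
        if missing:
            raise ValueError(f"Missing required environment variables: {', '.join(missing)}")
    elif not cos_username or not cos_password:
        # The other auth types still require the credential fields, as before.
        raise ValueError("cos username and cos password are required for this auth type")
```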

@lresende
Member

Nice progress @shalberd. Make sure you consider and validate the scenario where users run Jupyter from their laptops and submit pipelines to hosted environments on Kubernetes-based runtimes such as Kubeflow; this is an important use case / scenario.

@shalberd
Contributor Author

shalberd commented Mar 28, 2025

@lresende does Kubeflow Pipelines also work with Git integration, or with direct API calls? Elyra-speaking, I mean.
I assume Kubeflow running on K8s uses S3 for the pipeline .tar.gz exchange between Elyra and the runtime as well, correct?

@shalberd
Contributor Author

shalberd commented Mar 28, 2025

@harshad16 @lresende @caponetto @jiridanek Ok, so I built an Elyra wheel file from my fork branch and tested it with JupyterLab 4.x and an Airflow runtime config.
Before I started the workbench / JupyterLab, I supplied the env vars.
You can supply the two env vars with docker run or podman run as well, or set them on your local system if running non-dockerized,
but here is what that looks like in the podSpec for the lab container: either via an env section or, in this case, via an envFrom section that references the Elyra COS K8s secret as it exists in the JupyterLab workbench namespace.
patched kind: Notebook envFrom:
image001a
As a result of that kind: Notebook podSpec container env reference, I now see the env vars from the K8S secret as env vars of my workbench:
image001b
Container envFrom section in Openshift / K8S:
image003
JupyterLab 4.2.7
image002
Checking that the env vars are present in the system, in the JupyterLab terminal:

[1001350000@jupyter-2025-sven-test-0 ~]$ env | grep AWS
AWS_SECRET_ACCESS_KEY=5HCCCBBBYCP0Fbu65VAxxxxxZZZZZTTTTWVtmRT1
AWS_ACCESS_KEY_ID=6rXXxxxxxZZZtttTZZZ3eZ

Runtime config section; note, with COS auth type KUBERNETES_SECRET, only the K8s COS secret name is entered. Saving does not yield any errors when cos username and cos password are empty, as intended.
image004
Running a generic pipeline on Airflow works, too.
image005
e.g. I submit it to my Airflow (Elyra uploads the pipeline .tar.gz to S3 and writes the DAG code to GitLab / GitHub):
image006
Success: pipeline submitted; Elyra used the env vars for communicating with S3:
image007
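
For reference, a minimal sketch of what the submission-side upload looks like when the credentials come from the env vars (assuming the minio Python client; the endpoint and bucket names here are made up, and this is not the exact code in elyra/util/cos.py):

```python
import os

from minio import Minio

# Hypothetical endpoint and bucket; the real values come from the runtime config.
client = Minio(
    "minio.example.com:9000",
    access_key=os.environ["AWS_ACCESS_KEY_ID"],
    secret_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    secure=False,
)
# Upload the generated pipeline archive, as happens on pipeline submission.
client.fput_object("elyra-bucket", "my-pipeline/archive.tar.gz", "archive.tar.gz")
```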

@shalberd
Contributor Author

shalberd commented Mar 28, 2025

Testing the case where no env vars are present in JupyterLab while using auth type KUBERNETES_SECRET:
To remove the env vars, you have to restart JupyterLab / the workbench and simply no longer provide them. This is how I do that in the ODH Dashboard, but the same idea applies when running locally or dockerized: with Docker, you just would not pass the env vars; locally, you would unset them on your system.
image008
image009
No more AWS env vars in the JupyterLab workbench container.
image010
Because the podSpec section changed, the notebook restarts.
Now, for auth type KUBERNETES_SECRET, Elyra complains on save of the runtime config that the AWS env vars are not present.
image011
The terminal in JupyterLab tells us the same:

[1001350000@jupyter-2025-sven-test-0 ~]$ env | grep AWS
[1001350000@jupyter-2025-sven-test-0 ~]$

The two other cloud object storage authentication types require the same fields as always; nothing changes there when clicking the runtime config save button:
image012
image013

Development

Successfully merging this pull request may close these issues.

S3 auth_type KUBERNETES_SECRET should lead to Elyra communicating via ENV Vars, not config field values