Create a Cloud Storage Operator that could return a list of objects in a folder #39290

Open
1 of 2 tasks
lopezvit opened this issue Apr 27, 2024 · 2 comments
Labels
kind:documentation provider:google Google (including GCP) related issues type:doc-only Changelog: Doc Only

Comments

@lopezvit

Description

Create a new operator in airflow.providers.google.cloud.operators.gcs that, given a pattern/prefix, returns a list
of the files in that folder, similar to the client method:

from google.cloud import storage
storage_client = storage.Client()
bucket = storage_client.get_bucket(BUCKET_NAME)
blobs = bucket.list_blobs(prefix=filename)
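
To make the expected behaviour concrete: Cloud Storage has a flat namespace, so a "folder" is only a shared name prefix. The sketch below imitates prefix listing on an in-memory list of names. The function name `list_with_prefix` and the sample object names are made up for illustration; nothing here calls the real client.

```python
from typing import List

# Hypothetical object names standing in for the contents of a bucket.
NAMES = [
    "daily/a.csv",
    "daily/b.csv",
    "daily_backup/c.csv",
    "other/d.csv",
]


def list_with_prefix(names: List[str], prefix: str) -> List[str]:
    """Return the names that start with `prefix`, in their original order."""
    return [name for name in names if name.startswith(prefix)]


# A trailing slash restricts the match to the "folder" itself.
in_folder = list_with_prefix(NAMES, "daily/")

# Without the slash, sibling prefixes such as "daily_backup/" match too.
loose = list_with_prefix(NAMES, "daily")
```

The trailing slash matters: "daily" also matches "daily_backup/c.csv", while "daily/" does not.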

Use case/motivation

No response

Related issues

I have a process that runs once a day, reads some *.csv files from storage, and processes them.
It would be nice to have an operator that does exactly that, without needing to write custom code for it.
Once I have written the custom code, probably using the storage hook, I can try to paste it here, but I don't have time to create a PR.

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@lopezvit lopezvit added kind:feature Feature Requests needs-triage label for new issues that we didn't triage yet labels Apr 27, 2024
@lopezvit
Author

OK, after a bit of googling I found that it already exists!
GCSListObjectsOperator
So what I believe is wrong is the documentation, because I didn't find the operator here:
https://airflow.apache.org/docs/apache-airflow-providers-google/stable/operators/cloud/gcs.html
Should I create a new issue for fixing the documentation, or can it continue here?
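
For anyone landing here with the same daily-CSV use case, this is a rough sketch of how the operator might be used, not a tested DAG. It assumes the Google provider is installed and that the task returns the matching object names as a list (pushed to XCom). The helper names `make_list_task` and `only_csv` and the task id are made up for illustration.

```python
from typing import List


def make_list_task(bucket: str, prefix: str):
    """Build the listing task; meant to be called inside a DAG definition."""
    # Imported lazily so that only_csv() below works without Airflow installed.
    from airflow.providers.google.cloud.operators.gcs import GCSListObjectsOperator

    return GCSListObjectsOperator(
        task_id="list_csv_files",  # hypothetical task id
        bucket=bucket,
        prefix=prefix,  # e.g. "daily/" to list one "folder"
    )


def only_csv(object_names: List[str]) -> List[str]:
    """Keep only names ending in .csv (case-sensitive), e.g. in a downstream task."""
    return [name for name in object_names if name.endswith(".csv")]
```

A downstream task could pull the list from XCom and pass it through only_csv() before processing the files.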

@RNHTTR
Collaborator

RNHTTR commented Apr 27, 2024

You can just edit this Issue to request that this Operator be added to the list of operators in the docs, I think.

@RNHTTR RNHTTR added provider:google Google (including GCP) related issues kind:documentation type:doc-only Changelog: Doc Only and removed kind:feature Feature Requests needs-triage label for new issues that we didn't triage yet labels Apr 27, 2024