Support GCS files without credentials #537
Comments
Thanks! That makes sense, I wasn't aware of that use case. Can you open a PR with a fix? |
I do not suggest we make any code changes. Using an anonymous client is a rare use case and I think forcing someone to be explicit about it is OK. To accomplish what you want you just need to create the anonymous client and pass it into smart_open.open via transport_params:
path = "gs://tensorflow-nightly/prod/tensorflow/release/ubuntu_16/gpu_py37_full/nightly_release/18/20190813-010608/github/tensorflow/pip_pkg/tf_nightly_gpu-1.15.0.dev20190813-cp37-cp37m-linux_x86_64.whl"
import smart_open
import google.cloud.storage
client = google.cloud.storage.Client.create_anonymous_client()
f = smart_open.open(path, transport_params=dict(client=client))
EDIT: I just tested this and I can confirm this works. |
May I work on this, if there are no objections and the issue is still open? |
I am in favor of documenting this use case explicitly in the README though |
Hm, maybe we can move these recipes for the various storages (S3, GC, HTTPS…) into separate Wiki pages? And link to them from the README. Because I'm worried the README is becoming unwieldy CC @mpenkov . Or is a single comprehensive page better? Needs a TOC though. Btw README is showing a red "build failing" badge at the moment. |
What's the downside of always trying the anonymous client when no credential is found? |
Different behavior than the |
@piskvorky We already have a how-to guide explicitly for capturing edge cases like this. https://github.com/RaRe-Technologies/smart_open/blob/develop/howto.md @petedannemann I agree, let's deal with this in documentation for now. @ppwwyyxx Please feel free to add to that guide using a PR. |
That's reasonable. However I thought the exact goal of this project is to provide simpler and more unified (in other words, less backend-specific) APIs. So this argument doesn't seem very compelling to me. But I'll leave that to maintainers who know more about what's best for the project. |
My understanding is that the goal of this project was to provide a unified API for file-like objects. I thought handling authentication to the "file systems" to access these file-like objects was expected to be so different from system to system that smart_open defers to the underlying Python package for each file system for authentication. That is why our |
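For readers who still want the fallback asked about earlier in the thread, it can be done on the caller's side without any change to smart_open. The sketch below is only an illustration: the helper name `gcs_client_with_fallback` and its two parameters are hypothetical, and it assumes that constructing `google.cloud.storage.Client()` without usable credentials raises `google.auth.exceptions.DefaultCredentialsError`.

```python
def gcs_client_with_fallback(client_cls=None, no_credentials_exc=None):
    """Return a default GCS client, or an anonymous one if no credentials exist.

    Hypothetical helper, not part of smart_open or google-cloud-storage.
    Both parameters exist only so the defaults can be swapped out in tests.
    """
    if client_cls is None:
        # Imported lazily so the helper can be defined without the package installed.
        import google.cloud.storage
        client_cls = google.cloud.storage.Client
    if no_credentials_exc is None:
        # Assumption: this is the error raised when default credentials are missing.
        import google.auth.exceptions
        no_credentials_exc = google.auth.exceptions.DefaultCredentialsError

    try:
        # Normal path: use whatever credentials the environment provides.
        return client_cls()
    except no_credentials_exc:
        # Fallback path: an anonymous client can only read public objects.
        return client_cls.create_anonymous_client()
```

The result can then be passed in the same way as in the recipe above, for example `smart_open.open(path, "rb", transport_params=dict(client=gcs_client_with_fallback()))`. Keeping the fallback in user code means the choice to read anonymously stays explicit, which matches the maintainers' preference in this thread.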
Problem description
Be able to read public GCS files without providing credentials.
Steps/code to reproduce the problem
Running the above code, smart_open failed, while tf.io is able to successfully download the public file, although with a warning. Since it's possible to download the file, it's best to not require a credential so that public files can be easily downloaded by anyone.
Versions
Linux-5.4.63-1-lts-x86_64-with-glibc2.2.5
Python 3.8.5 (default, Sep 17 2020, 00:56:56)
smart_open 2.1.1