generate_signed_url does not work for virtual_hosted_style=True #1031

Open
whs opened this issue May 4, 2023 · 10 comments
Assignees
Labels
api: storage Issues related to the googleapis/python-storage API. priority: p3 Desirable enhancement or fix. May not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments


whs commented May 4, 2023

Environment details

  • OS type and version: Arch Linux
  • Python version: Python 3.9.13
  • pip version: pip 21.3.1
  • google-cloud-storage version: 2.3.0

Steps to reproduce

  1. Call blob.generate_signed_url(..., virtual_hosted_style=True)
  2. Use the generated URL

Code example

blob = bucket.blob("example")
blob.generate_signed_url(virtual_hosted_style=True)

Stack trace

<Error>
<Code>SignatureDoesNotMatch</Code>
<Message>
The request signature we calculated does not match the signature you provided. Check your Google secret key and signing method.
</Message>
<StringToSign>
GET 1683172800 /bucket/object
</StringToSign>
</Error>

I believe this is because the generated URL is https://bucket.storage.googleapis.com/object, so the library signs /object instead of /bucket/object.
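
To make the suspected mismatch concrete, here is a minimal sketch of the V2 string-to-sign layout (verb, Content-MD5, Content-Type, expiration, and canonical resource, joined by newlines). The helper name `v2_string_to_sign` is hypothetical and is not part of google-cloud-storage; it only illustrates why a signature computed over /object cannot match one computed over /bucket/object.

```python
def v2_string_to_sign(method, expiration, resource, content_md5="", content_type=""):
    """Build a V2-style string to sign (hypothetical helper, for illustration only)."""
    return "\n".join([method, content_md5, content_type, str(expiration), resource])


# What the server reports in the error above: the resource includes the bucket.
expected = v2_string_to_sign("GET", 1683172800, "/bucket/object")

# What would be signed if the resource were taken from the virtual-hosted URL path.
signed = v2_string_to_sign("GET", 1683172800, "/object")

print(expected == signed)  # False, so the signatures cannot match
```

This would also explain why the workaround works: the signature is computed for the path-style resource, and only the host part of the URL is rewritten afterwards.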

Workaround

out = blob.generate_signed_url()
out = out.replace(
    "https://storage.googleapis.com/{}/".format(bucket_name),
    "https://{}.storage.googleapis.com/".format(bucket_name),
)
@product-auto-label product-auto-label bot added the api: storage Issues related to the googleapis/python-storage API. label May 4, 2023
@cojenco cojenco added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. priority: p3 Desirable enhancement or fix. May not be included in next release. labels Aug 9, 2023
@frankyn frankyn added priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. and removed priority: p3 Desirable enhancement or fix. May not be included in next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. labels Aug 11, 2023
Member

frankyn commented Aug 11, 2023

Hi @whs,

To clarify, the sample is using V2 and virtual_hosted_style with generate_signed_url() which was primarily added for V4 signed URLs. This is a bug in V2 so reclassifying.

Cloud Storage recommends using V4 over V2 as well.

Here's a sample:

import datetime

from google.cloud import storage

# Name the client "client" so it does not shadow the imported storage module.
client = storage.Client()
bucket = client.bucket("bucket-name")
blob = bucket.get_blob("object-name")

print(blob.generate_signed_url(
    version="v4",
    # This URL is valid for 15 minutes.
    expiration=datetime.timedelta(minutes=15),
    # Allow GET requests using this URL.
    method="GET",
    virtual_hosted_style=True,
))

Author

whs commented Aug 11, 2023

The v4 signature doesn't work for my use case. I serve cacheable content to end users, and presigned URLs are used as an anti-hotlinking mechanism.

In the v4 URL scheme, X-Goog-Date is a required field, and generate_signed_url always uses the current time as its value, so the URL changes on every call and the cache never hits. In the v2 signature, the only time field is a customizable expiration date, which I can round up from the current time to generate the value deterministically.

Member

frankyn commented Aug 11, 2023

@whs could you share an example of what you're doing? I've never come across this use case and it sounds interesting.

Author

whs commented Aug 11, 2023

Sure, here's the website: https://tipme.in.th/fumihausu . Both the background and the top banner are user-customizable, and use presigned URLs in the way I mentioned.

As for why virtual-hosted style is required: the site has a content security policy (CSP). I'd prefer not to allow domains that mix in UGC from other customers, so I can't add storage.googleapis.com to the CSP directly.

There's another use case that is not on the unauthenticated side: customers can also upload images and audio files to the file manager for use in a live streaming application (e.g. OBS), which embeds a webpage hosted by us that links to user-supplied input. For example, we provide a "donation alert" webview that users can customize with images or audio. We want the uploaded content to be cached so that it shows up on the live stream without buffering, which requires a deterministic URL. If I made the UGC public, users could use the file manager as an image upload host, which we want to prevent.

Member

frankyn commented Aug 11, 2023

Thanks for the context, I'm still confused by:

In the v4 URL scheme, X-Goog-Date is a required field, and generate_signed_url always uses the current time as its value, so the URL changes on every call and the cache never hits. In the v2 signature, the only time field is a customizable expiration date, which I can round up from the current time to generate the value deterministically.

Oh, do you mean that v2 takes an absolute expiration (seconds since the epoch), so you can set that time once and your caching will handle incoming requests, whereas v4 generates a new URL each time because it uses the current time for X-Goog-Date? How often do you refresh the expiration?

Author

whs commented Aug 12, 2023

In v2 the only time field is the expiration field. If you inspect our URL you'll see a value like 1691820000: the current time plus an hour, rounded down to the top of the hour. There's additional logic to handle the edge case close to the top of the hour. Every time you refresh the page within most of the same hour, you'll get the same URL.
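
The rounding described above could be sketched roughly as follows. The function name `rounded_expiration` and the one-hour window are assumptions for illustration, and the extra handling for requests close to the top of the hour is deliberately left out. The result is an integer number of seconds since the epoch, which is one of the forms generate_signed_url accepts for its expiration argument.

```python
import datetime


def rounded_expiration(now, window_seconds=3600):
    """Return (now + window) rounded down to a window boundary, as epoch seconds.

    Hypothetical helper: every call within the same window yields the same
    value, so a v2 signed URL built from it stays stable for that window.
    """
    target = int(now.timestamp()) + window_seconds
    return target - (target % window_seconds)


utc = datetime.timezone.utc
first = rounded_expiration(datetime.datetime.fromtimestamp(1691817000, tz=utc))
later = rounded_expiration(datetime.datetime.fromtimestamp(1691819400, tz=utc))
print(first, later)  # 1691820000 1691820000
```

Note that a URL generated late in the window expires soon after it is issued, which is presumably what the additional edge-case logic is for.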

As GCS supplies ETag and Last-Modified to web browsers, the browser will request the same URL with If-Modified-Since and If-None-Match, which match, and GCS returns 304. This removes the download time of the content.

In v4, I'm required to add X-Goog-Date. I could do the same thing here by rounding down X-Goog-Date, but there's no API to set it. (I read the source, and it seems that explicitly setting X-Goog-Date, even internally, is only intended for automated testing.)
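
For illustration only, this is roughly the value one would want to supply if the timestamp were settable: the current time rounded down to a window boundary and formatted in the basic ISO 8601 form that X-Goog-Date uses. The helper name `rounded_goog_date` is hypothetical, and nothing here calls the library, since (as noted above) there is no public way to pass this value in.

```python
import datetime


def rounded_goog_date(now, window_seconds=3600):
    """Round `now` down to a window boundary and format it like X-Goog-Date.

    Hypothetical helper; it only computes the string and does not sign anything.
    """
    ts = int(now.timestamp())
    floored = ts - (ts % window_seconds)
    floored_dt = datetime.datetime.fromtimestamp(floored, tz=datetime.timezone.utc)
    return floored_dt.strftime("%Y%m%dT%H%M%SZ")


now = datetime.datetime.fromtimestamp(1691817000, tz=datetime.timezone.utc)
print(rounded_goog_date(now))  # 20230812T050000Z
```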


colsil commented Sep 18, 2023

I believe I also ran into this today but I was using bucket_bound_hostname instead of virtual_hosted_style. Likely the same issue, and I was able to work around it by using v4 instead of v2.

Member

frankyn commented Nov 1, 2023

@cojenco or @andrewsg could one of you look at addressing virtual_hosted_style=True in V2 signed URLs?

@tritone tritone self-assigned this Nov 1, 2023
Contributor

tritone commented Nov 1, 2023

I believe virtual_hosted_style and bucket_bound_hostname were only intended to work for v4 signing. Other language libraries indicate this restriction, e.g. https://pkg.go.dev/cloud.google.com/go/storage#SignedURLOptions.Style

I'll update the documentation for these options.

Contributor

cojenco commented Nov 9, 2023

Identified where the limitations lie within v2 signing and have a proposed fix. Discussed offline with the team: as we have signed URL work planned for the quarter, we're moving this to be part of that project.

@cojenco cojenco added priority: p3 Desirable enhancement or fix. May not be included in next release. and removed priority: p2 Moderately-important priority. Fix may not be included in next release. labels Nov 9, 2023