Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Range requests in R2 are limited to 32 bit offset/length #560

Closed
1 task done
himikof opened this issue May 1, 2024 · 1 comment · Fixed by #564
Closed
1 task done

[BUG] Range requests in R2 are limited to 32 bit offset/length #560

himikof opened this issue May 1, 2024 · 1 comment · Fixed by #564

Comments

@himikof
Copy link

himikof commented May 1, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What version of workers-rs are you using?

0.2.0

What version of wrangler are you using?

3.53.0

Describe the bug

The worker::r2::builder::Range enum is defined like this:

pub enum Range {
    OffsetWithLength { offset: u32, length: u32 },
    OffsetWithOptionalLength { offset: u32, length: Option<u32> },
    OptionalOffsetWithLength { offset: Option<u32>, length: u32 },
    Suffix { suffix: u32 },
}

Using u32 for all the quantities makes the interface completely unable to handle files more than 4 GiB in size. I've done a bit of research, and it seems the code casts 32-bit numbers to f64 for JS (essentially supporting 52 bits), and those are converted into u64 in the workerd C++ code.

I can see multiple ways to fix this to varying degrees:

  1. Simply changing Rust high-level definition to u64, low-level R2Range definition to f64, and erroring out during conversion if the number does not fit into f64 without precision loss. Will fix the practical problem with the least effort, but this is certainly a hack. Values read from the response could be silently wrong if they exceed 52 bits, though (should not be a practical issue).
  2. There seems to be a (somehow undocumented) way to pass a Headers instance with pre-made Range header. The Rust code could format the header manually from u64 values, bypassing the JS f64 dance entirely.
  3. Support BigInt in the JS interface to pass through u64 losslessly?

As a side note, the current Range Rust definition seems very weird to me: there is way too much overlap between the first three variants, and some semantics are not clear to me. What does OptionalOffsetWithLength{None, L} even mean? Is it the same as OffsetWithLength{0, L}/OffsetWithOptionalLength{0, Some(L)}? Or Suffix{L}? I think that the Range semantics could be made simpler, and it could be done simultaneously with the required breaking change to fix the data types. Perhaps this could work?

pub enum Range {
    OffsetWithLength { offset: u64, length: u64 },
    FromOffset { offset: u64 },
    Suffix { length: u64 },
}

Steps To Reproduce

No response

@kflansburg
Copy link
Member

Thanks for the detailed description. I think option 1 is most viable, but I will investigate 2 I believe it may just support the standard Range header.

Regarding the API, I think this may have just been copied from the JavaScript API which appears to support the same options. I suspect that the Rust types are the way they are to allow for these 3 options, but not allow for omitting both. I will ask around about what omitting the offset does and see if your proposal is viable.

kflansburg added a commit that referenced this issue May 2, 2024
kflansburg added a commit that referenced this issue May 3, 2024
kflansburg added a commit that referenced this issue May 19, 2024
* Use u64 for R2 range requests

Closes #560

* address comment

* Clarify  options
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants