Support get_multi_ranges in ObjectStore #2293

Closed
Ted-Jiang opened this issue Aug 3, 2022 · 3 comments · Fixed by #2336
Labels
enhancement Any new improvement worthy of a entry in the changelog object-store Object Store Interface

Comments

@Ted-Jiang
Member

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently we have:

/// Return the bytes that are stored at the specified location
/// in the given byte range
async fn get_range(&self, location: &Path, range: Range<usize>) -> Result<Bytes>;

I found that #2110 already supports multiple ranges:

/// Retrieve multiple byte ranges. The default implementation will call `get_bytes` sequentially
fn get_byte_ranges(
    &mut self,
    ranges: Vec<Range<usize>>,
) -> BoxFuture<'_, Result<Vec<Bytes>>>
where
    Self: Send,
{
    async move {
        let mut result = Vec::with_capacity(ranges.len());

We should also support reading multiple ranges from the same location path in object_store, for example:

async fn get_ranges(&self, location: &Path, range: &[Range<usize>]) -> Result<Bytes>;

This would reduce overhead in stores such as HDFS, where each read has to fetch the file metadata for the location from the metadata server (NameNode). If we can fetch multiple ranges for one location in a single call, we save some of those RPC calls.
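To make the saving concrete, here is a minimal sketch of the idea. `CountingStore`, `lookup_len`, and the in-memory buffer are all hypothetical and are not part of object_store or any HDFS client; the counter only stands in for a NameNode round trip. The code is synchronous and uses `Vec<u8>` in place of `Bytes` so that it compiles without extra crates.

```rust
use std::cell::Cell;
use std::ops::Range;

/// Hypothetical store in which every read must first resolve file metadata.
/// The counter stands in for a round trip to a metadata server.
struct CountingStore {
    data: Vec<u8>,
    metadata_lookups: Cell<usize>,
}

impl CountingStore {
    fn new(data: Vec<u8>) -> Self {
        Self { data, metadata_lookups: Cell::new(0) }
    }

    /// Simulated metadata RPC: counts the call and returns the file length.
    fn lookup_len(&self, _location: &str) -> usize {
        self.metadata_lookups.set(self.metadata_lookups.get() + 1);
        self.data.len()
    }

    /// Copy one range out of the buffer, or `None` if it is out of bounds.
    fn slice(&self, range: &Range<usize>, len: usize) -> Option<Vec<u8>> {
        if range.start > range.end || range.end > len {
            return None;
        }
        Some(self.data[range.clone()].to_vec())
    }

    /// One metadata lookup per call, so N ranges cost N lookups.
    fn get_range(&self, location: &str, range: Range<usize>) -> Option<Vec<u8>> {
        let len = self.lookup_len(location);
        self.slice(&range, len)
    }

    /// One metadata lookup shared by all ranges of the same location.
    fn get_ranges(&self, location: &str, ranges: &[Range<usize>]) -> Option<Vec<Vec<u8>>> {
        let len = self.lookup_len(location);
        ranges.iter().map(|r| self.slice(r, len)).collect()
    }
}
```

In this sketch, reading two ranges through `get_range` increments the counter twice, while a single `get_ranges` call with the same two ranges increments it once.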

@Ted-Jiang Ted-Jiang added the enhancement Any new improvement worthy of a entry in the changelog label Aug 3, 2022
@Ted-Jiang
Member Author

Ted-Jiang commented Aug 3, 2022

@alamb PTAL 😊

@tustvold
Contributor

tustvold commented Aug 3, 2022

Perhaps

async fn get_ranges(&self, location: &Path, ranges: &[usize]) -> Result<Vec<Bytes>>; 

With a default serial implementation?
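A default serial implementation could look roughly like the sketch below. This is an illustration under stated assumptions, not the code that was merged: `RangeStore` and `InMemory` are hypothetical names, the trait is synchronous, locations are plain `&str`, and `Vec<u8>` stands in for `Bytes` so the example needs no external crates. The ranges are taken as `&[Range<usize>]`, mirroring the `Range<usize>` parameter of `get_range`.

```rust
use std::ops::Range;

/// Hypothetical, synchronous stand-in for the store trait discussed here.
trait RangeStore {
    type Error;

    /// Return the bytes stored at `location` within `range`.
    fn get_range(&self, location: &str, range: Range<usize>) -> Result<Vec<u8>, Self::Error>;

    /// Default implementation: fetch each range in order via `get_range`
    /// and stop at the first error. Stores that can do better (for example,
    /// by sharing one metadata lookup) can override this method.
    fn get_ranges(
        &self,
        location: &str,
        ranges: &[Range<usize>],
    ) -> Result<Vec<Vec<u8>>, Self::Error> {
        ranges
            .iter()
            .map(|r| self.get_range(location, r.clone()))
            .collect()
    }
}

/// Toy store holding a single object; it only implements `get_range`
/// and inherits the serial `get_ranges`.
struct InMemory {
    data: Vec<u8>,
}

impl RangeStore for InMemory {
    type Error = String;

    fn get_range(&self, _location: &str, range: Range<usize>) -> Result<Vec<u8>, String> {
        self.data
            .get(range.clone())
            .map(|s| s.to_vec())
            .ok_or_else(|| format!("range {:?} out of bounds", range))
    }
}
```

With this shape, existing implementations keep working unchanged, and only backends that benefit from batching need to provide their own `get_ranges`.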

@Ted-Jiang
Member Author

Perhaps

async fn get_ranges(&self, location: &Path, ranges: &[usize]) -> Result<Vec<Bytes>>; 

With a default serial implementation?

Sure!

tustvold added a commit to tustvold/arrow-rs that referenced this issue Aug 5, 2022
tustvold added a commit that referenced this issue Aug 8, 2022
* Add ObjectStore::get_ranges (#2293)

* Review feedback
@alamb alamb added the object-store Object Store Interface label Aug 17, 2022
@alamb alamb changed the title Support get_multi_ranges in ObjectStore Support get_multi_ranges in ObjectStore Aug 17, 2022