Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Entry drop function can take a long time #21

Open
wavenator opened this issue Nov 23, 2021 · 6 comments
Open

Entry drop function can take a long time #21

wavenator opened this issue Nov 23, 2021 · 6 comments

Comments

@wavenator
Copy link

The current drop implementation for Entry is as follows:

impl<'a, R: 'a + Read> Drop for Entry<'a, R> {
    fn drop(&mut self) {
        if self.position < self.length {
            // Consume the rest of the data in this entry.
            let mut remaining = self.reader.take(self.length - self.position);
            let _ = io::copy(&mut remaining, &mut io::sink());
        }
    }
}

When dealing with large entries, this function can take a very long time. Could there be a reason for this implementation? Do we really need to consume the reader?

@bjorn3
Copy link
Contributor

bjorn3 commented Nov 23, 2021

This is necessary to seek the reader to the header of the next entry. There is no Seek bound, so reader.seek() can't be used.

@wavenator
Copy link
Author

I'm confused. I'm not familiar with the code but why don't we use Reader::seek here?
Eventually, this drop implementation takes a lot of time when dealing with a lot of entries in a big AR file. Can't it be solved?

@bjorn3
Copy link
Contributor

bjorn3 commented Nov 23, 2021

reader.seek() requires a Seek bound. It could be added, but because it currently isn't, .seek() can't be used without a breaking change. A breaking change may be acceptable. At least it would be fine with me speaking as user of rust-ar.

@wavenator
Copy link
Author

@mdsteele WDYT about adding Seek bound to Reader?

@wavenator
Copy link
Author

BTW what about calling consume?

@mdsteele
Copy link
Owner

Yes, it was written that way to avoid needing a Seek bound. At the time I was trying to stay similar to the API of the tar crate, which also does not require a Seek bound. Once nice feature of not requiring Seek is that you can easily extract a compressed archive by using a reader adapter that provides streaming decompression (for example, GzipReader, which doesn't implement Seek). It'd be nice not to lose that capability.

I poked around just now, though, and it looks like tar recently landed a (not yet published) change to provide an alternate Archive::entries_with_seek method to allow opting in to a more efficient implementation when the underlying reader implements Seek. So that might be a good approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants