Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add contiguous mode to the Builder #234

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
44 changes: 44 additions & 0 deletions examples/stream.rs
@@ -0,0 +1,44 @@
use std::fs::{read_dir, File};
use std::io::Write;
use std::iter::Peekable;
use std::path::PathBuf;

struct TarStream<I: Iterator<Item = PathBuf>> {
files: Peekable<I>,
}

impl<I: Iterator<Item = PathBuf>> TarStream<I> {
pub fn new(files: I) -> Self {
Self {
files: files.peekable(),
}
}

pub fn read(&mut self, buf: &mut Vec<u8>) -> Option<()> {
self.files.next().map(|path| {
let new_path = PathBuf::from("archived").join(&path);
let mut builder = tar::Builder::new(buf);

builder.contiguous(true);
builder.append_path_with_name(path, &new_path).unwrap();

if self.files.peek().is_none() {
builder.finish().unwrap();
}
})
}
}

fn main() {
let files = read_dir("examples")
.unwrap()
.map(|p| p.unwrap().path())
.filter(|p| p.is_file());
let mut buf = Vec::with_capacity(1024 * 1024 * 4);
let mut tar_stream = TarStream::new(files);
let mut output = File::create("examples.tar").unwrap();

while tar_stream.read(&mut buf).is_some() {
output.write(&buf).unwrap();
}
}
35 changes: 25 additions & 10 deletions src/builder.rs
Expand Up @@ -15,6 +15,7 @@ pub struct Builder<W: Write> {
mode: HeaderMode,
follow: bool,
finished: bool,
contiguous: bool,
obj: Option<W>,
}

Expand All @@ -27,6 +28,7 @@ impl<W: Write> Builder<W> {
mode: HeaderMode::Complete,
follow: true,
finished: false,
contiguous: false,
obj: Some(obj),
}
}
Expand All @@ -44,6 +46,13 @@ impl<W: Write> Builder<W> {
self.follow = follow;
}

/// Determines wheter the archive will be automatically finished
/// Contiguous Builder will not finish the archive unless `finish`
/// is explicitly called. Defaults to false.
pub fn contiguous(&mut self, contiguous: bool) {
self.contiguous = contiguous;
}

/// Gets shared reference to the underlying object.
pub fn get_ref(&self) -> &W {
self.obj.as_ref().unwrap()
Expand All @@ -61,13 +70,11 @@ impl<W: Write> Builder<W> {

/// Unwrap this archive, returning the underlying object.
///
/// This function will finish writing the archive if the `finish` function
/// hasn't yet been called, returning any I/O error which happens during
/// that operation.
/// This function will finish writing the archive if it is not contiguous
/// and the `finish` function hasn't yet been called, returning any
/// I/O error which happens during that operation.
pub fn into_inner(mut self) -> io::Result<W> {
if !self.finished {
self.finish()?;
}
self.finalize()?;
Ok(self.obj.take().unwrap())
}

Expand Down Expand Up @@ -204,7 +211,7 @@ impl<W: Write> Builder<W> {
/// operation then this may corrupt the archive.
///
/// Note if the `path` is a directory. This will just add an entry to the archive,
/// rather than contents of the directory.
/// rather than contents of the directory.
///
/// Also note that after all files have been written to an archive the
/// `finish` function needs to be called to finish writing the archive.
Expand Down Expand Up @@ -342,11 +349,11 @@ impl<W: Write> Builder<W> {
)
}

/// Finish writing this archive, emitting the termination sections.
/// Finish writing this archive, emitting the termination sections
///
/// This function should only be called when the archive has been written
/// entirely and if an I/O error happens the underlying object still needs
/// to be acquired.
/// to be acquired or contiguous archive needs to be finished.
///
/// In most situations the `into_inner` method should be preferred.
pub fn finish(&mut self) -> io::Result<()> {
Expand All @@ -356,6 +363,14 @@ impl<W: Write> Builder<W> {
self.finished = true;
self.get_mut().write_all(&[0; 1024])
}

fn finalize(&mut self) -> io::Result<()> {
if !self.contiguous {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since finish() is a public method, I think that it might be best if this check is folded into there? That can just keep the documentation, though, that it is not required to be called.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've introduced finalize since I wanted to give the user the possibility to write the 'footer' section, as you can see in the example.

I could move the check to finish and then create public method write_footer_section. Then this:

let mut builder = tar::Builder::new(...);
builder.skip_footer_section(true);
builder.finish();

would have no effect and would need to be transformed into:

let mut builder = tar::Builder::new(...);
builder.skip_footer_section(true);
builder.write_footer_section();

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think though that this implementation, as-is, doesn't work. If you call finish it ignores the skip_footer_section flag introduced here, which I think is a bug. I think it'd be fine to basically say that finish is optional if skip_footer_section is true.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does work, please take a look at the example I've added to examples/ or even tests. But I see your point, I will rename contiguous to skip_footer_section and move

self.get_mut().write_all(&[0; 1024]) 

To new public method write_footer_section. Would that work in your opinion?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes while it works I think it's too confusing. I think adding a separate method makes sense. That way there's a flag to disable auto-inclusion of the footer and in that scenario you can either create a tar::Builder at the end with the flag enabled or you can manually call the footer write.

self.finish()
} else {
Ok(())
}
}
}

fn append(mut dst: &mut dyn Write, header: &Header, mut data: &mut dyn Read) -> io::Result<()> {
Expand Down Expand Up @@ -544,6 +559,6 @@ fn append_dir_all(

impl<W: Write> Drop for Builder<W> {
fn drop(&mut self) {
let _ = self.finish();
let _ = self.finalize();
}
}
53 changes: 53 additions & 0 deletions tests/all.rs
Expand Up @@ -1032,3 +1032,56 @@ fn unpack_path_larger_than_windows_max_path() {
// should unpack path greater than windows MAX_PATH length of 260 characters
assert!(ar.unpack(td.path()).is_ok());
}

#[test]
fn contiguous_archive() {
fn append_contiguous(
path: impl AsRef<Path>,
data: &[u8],
buf: &mut Vec<u8>,
finish: bool,
) -> io::Result<()> {
let mut ar = Builder::new(buf);
ar.contiguous(true);
ar.append_data(&mut Header::new_gnu(), path, data)?;
if finish {
ar.finish()
} else {
Ok(())
}
}

const DATA_A: &[u8] = &[1, 2, 3];
const DATA_B: &[u8] = &[4, 5, 6];

let mut ar = Builder::new(Vec::new());

assert!(ar.append_data(&mut Header::new_gnu(), "a", DATA_A).is_ok());
assert!(ar.append_data(&mut Header::new_gnu(), "b", DATA_B).is_ok());

let expected = t!(ar.into_inner());

let mut actual = Vec::new();

assert!(append_contiguous("a", DATA_A, &mut actual, false).is_ok());
assert!(append_contiguous("b", DATA_B, &mut actual, true).is_ok());

assert_eq!(expected, actual);
}

#[test]
fn contiguous_into_inner() {
const DATA: &[u8] = &[1, 2, 3];

let mut ar = Builder::new(Vec::new());
assert!(ar.append_data(&mut Header::new_gnu(), "a", DATA).is_ok());
let expected = t!(ar.into_inner());
let expected = &expected[0..expected.len() - 1024];

let mut ar = Builder::new(Vec::new());
ar.contiguous(true);
assert!(ar.append_data(&mut Header::new_gnu(), "a", DATA).is_ok());
let actual = t!(ar.into_inner());

assert_eq!(expected, &*actual);
}