Tar file is empty with a size of zero bytes for small tar entry sizes #836

HofmeisterAn · 2023-06-16T13:09:46Z

Describe the bug

The attached reproducer code shows an issue that occurs when the size of a tar entry is very small, just a few bytes. When attempting to create a tarball for a collection of small files, the tar file turns out to be empty with a size of zero bytes (calling flush etc. does not help). However, if the size of the tar entry is increased, the problem does not occur and the tar file is created correctly.

Reproduction Code

https://dotnetfiddle.net/QLhzBV

Steps to reproduce

Run the .NET Fiddle reproducer

Expected behavior

The tar file should have a size greater than 0 bytes.

Operating System

Windows, macOS, Linux

Framework Version

.NET 7, .NET 6

Additional context

If you change the multiplier in the linked reproducer, for example to 10 (at line 12), the test will execute successfully. Furthermore, if the TarOutputStream does not own the underlying stream, it also works properly:

IsStreamOwner = false;
...
Close();
Assert.True(_stream.Length > 0, "The tar file has a size of zero bytes."); // Runs successfully.


piksel · 2023-06-16T14:46:33Z

The tar data is not fully written until the TarOutputStream is closed (or until its internal buffer is flushed, which happens when the entry content is large enough). The .Length is the number of bytes that have reached the underlying stream so far, which doesn't necessarily match the number of bytes written to the TarOutputStream.

> Furthermore, if the TarOutputStream does not own the underlying stream, it also works properly.

Yes, that is how you should use this with a memory stream.
The underlying stream is normally closed to avoid leaking stream handles, but if you want to opt out of that, you set IsStreamOwner = false and take responsibility for disposing of the stream yourself.
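
A minimal sketch of the ownership behaviour described above, assuming the SharpZipLib NuGet package is referenced. The class name TarOwnership, the method Build, and the entry name are hypothetical and chosen only for illustration. It writes one tiny entry, disposes the TarOutputStream, and reports whether the MemoryStream is still usable afterwards.

```csharp
using System;
using System.IO;
using System.Text;
using ICSharpCode.SharpZipLib.Tar;

public static class TarOwnership
{
    // Hypothetical helper: writes a single small entry and returns the
    // MemoryStream after the TarOutputStream has been disposed.
    public static MemoryStream Build(bool tarOwnsStream)
    {
        var memory = new MemoryStream();
        var content = Encoding.UTF8.GetBytes("hi");

        using (var tar = new TarOutputStream(memory, Encoding.UTF8))
        {
            // When false, disposing the tar stream leaves 'memory' open.
            tar.IsStreamOwner = tarOwnsStream;

            var entry = TarEntry.CreateTarEntry("tiny.txt");
            entry.Size = content.Length;
            tar.PutNextEntry(entry);
            tar.Write(content, 0, content.Length);
            tar.CloseEntry();
        } // Dispose finishes the archive and pushes buffered data to 'memory'.

        return memory;
    }

    public static void Main()
    {
        // CanRead turns false once a MemoryStream has been closed.
        Console.WriteLine($"owner=true  -> still open: {Build(true).CanRead}");

        using var kept = Build(false);
        Console.WriteLine($"owner=false -> still open: {kept.CanRead}, length > 0: {kept.Length > 0}");
    }
}
```

With IsStreamOwner = false, the caller is the one who must dispose the MemoryStream, which is why the second call is wrapped in a using declaration.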

HofmeisterAn · 2023-06-16T14:58:31Z

> The tar data is not fully written until the TarOutputStream is closed (or when the buffer is flushed, which happens when the entry content is large enough).

I cannot use the stream once it is closed (and while it is still open, the data is not flushed). This becomes very difficult and inconvenient when dealing with small tar entries. Supplying my own stream (where the TarOutputStream is not the stream owner) works as a workaround, but it is inconvenient for developers, and they may not even be aware of this requirement in the first place.

piksel · 2023-06-16T15:06:57Z

> I can not use a closed stream anymore (and for the open stream the data is not flushed).

I don't know what you are trying to do, but you cannot create a valid tar file without closing it, since it needs to add the EOF blocks to the end. If you are extending the TarOutputStream like you are doing in the reproduction code example, then why not use a TarOutputStream directly instead?

I also don't understand how adding IsStreamOwner = false and then using the MemoryStream would be any less convenient. It doesn't matter how large the entries are: after the EOF blocks, no new entries can be added, which is why the stream is closed.
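
To make the EOF blocks concrete, here is a rough sketch that inspects a finished archive. It relies on the tar format's 512-byte block size and its end-of-archive marker of two all-zero blocks. The names TarLayout, BuildArchive and EndsWithEofBlocks are hypothetical, and the SharpZipLib package is assumed to be referenced.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using ICSharpCode.SharpZipLib.Tar;

public static class TarLayout
{
    private const int BlockSize = 512; // tar archives are made of 512-byte blocks

    // Hypothetical helper: writes one tiny entry and returns the finished archive.
    public static byte[] BuildArchive()
    {
        var memory = new MemoryStream();
        var content = Encoding.UTF8.GetBytes("hi");

        using (var tar = new TarOutputStream(memory, Encoding.UTF8))
        {
            tar.IsStreamOwner = false;
            var entry = TarEntry.CreateTarEntry("tiny.txt");
            entry.Size = content.Length;
            tar.PutNextEntry(entry);
            tar.Write(content, 0, content.Length);
            tar.CloseEntry();
        } // Closing writes the EOF blocks and the remaining buffered data.

        return memory.ToArray();
    }

    // True when the archive ends with at least two all-zero blocks.
    public static bool EndsWithEofBlocks(byte[] archive)
    {
        if (archive.Length < 2 * BlockSize) return false;
        return archive.Skip(archive.Length - 2 * BlockSize).All(b => b == 0);
    }

    public static void Main()
    {
        var archive = BuildArchive();
        Console.WriteLine($"length: {archive.Length}");
        Console.WriteLine($"whole blocks: {archive.Length % BlockSize == 0}");
        Console.WriteLine($"ends with EOF blocks: {EndsWithEofBlocks(archive)}");
    }
}
```

The same check applied before the TarOutputStream is closed would see an empty or incomplete buffer, which matches the zero-length result reported in the issue.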

Updated .NET Fiddle with suggested usage:
https://dotnetfiddle.net/cAH5F2

HofmeisterAn · 2023-06-17T14:27:46Z

> since it needs to add the EOF blocks to the end.

Yes, you are correct. There is something I overlooked. For some reason, I thought that closing the stream would only add the EOF marker. However, since I need to send the stream to an HTTP endpoint, I cannot simply close it and be finished like I would usually do with a file stream.

> I also don't understand how adding IsStreamOwner = false and then using the MemoryStream would be any less convenient.

If I could finalize the tar stream (similar to CloseEntry) while keeping it open, I could save some lines of code. I may have been distracted by noticing that a few bytes were neither written nor flushed (without closing it). By the way, setting IsStreamOwner = false is exactly what I currently do. Thank you for your response and clarification.
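
For the HTTP case mentioned above, one possible shape is sketched below: finish the archive into a MemoryStream that the tar stream does not own, rewind it, and wrap it in a StreamContent. This is only an illustration, not the code from the thread. The names TarUpload, CreateTarContent and UploadAsync are hypothetical, the endpoint is supplied by the caller, and the application/x-tar media type is an assumption about what the receiving server expects.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;
using ICSharpCode.SharpZipLib.Tar;

public static class TarUpload
{
    // Hypothetical helper: builds a one-entry archive and wraps it as HTTP content.
    public static HttpContent CreateTarContent(string entryName, byte[] payload)
    {
        var memory = new MemoryStream();

        using (var tar = new TarOutputStream(memory, Encoding.UTF8))
        {
            tar.IsStreamOwner = false; // keep 'memory' open for the upload
            var entry = TarEntry.CreateTarEntry(entryName);
            entry.Size = payload.Length;
            tar.PutNextEntry(entry);
            tar.Write(payload, 0, payload.Length);
            tar.CloseEntry();
        } // The archive is complete only after this point.

        memory.Position = 0; // rewind so the content is read from the start

        // StreamContent disposes 'memory' when the content itself is disposed.
        var content = new StreamContent(memory);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/x-tar"); // assumed media type
        return content;
    }

    // Hypothetical helper: posts the archive to an endpoint chosen by the caller.
    public static async Task<HttpResponseMessage> UploadAsync(HttpClient client, Uri endpoint, HttpContent tarContent)
    {
        return await client.PostAsync(endpoint, tarContent);
    }

    public static void Main()
    {
        // No request is sent here; this only shows that the content is non-empty.
        using var content = CreateTarContent("tiny.txt", Encoding.UTF8.GetBytes("hi"));
        Console.WriteLine($"content length > 0: {content.Headers.ContentLength > 0}");
    }
}
```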

HofmeisterAn added the bug label Jun 16, 2023

github-actions bot added the tar Related to TAR file format label Jun 16, 2023

icsharpcode deleted a comment from SourceproStudio Jun 16, 2023

piksel removed the bug label Jun 16, 2023

HofmeisterAn closed this as completed Jun 17, 2023
