-
Notifications
You must be signed in to change notification settings - Fork 574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(archive): An implementation of Tar as streams #4548
base: main
Are you sure you want to change the base?
Conversation
From a quick skim, this PR shows promise, but there are opportunities to simplify and make the implementation more straightforward to understand. @crowlKats, can you see this as a viable alternative to #1985? If so, I'll go ahead and help polish this up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why this PR is wroking and mine is not is because this one does not use byte streams. Not sure if this should be a blocking action or not, but its definitively a downgrade when comparing to the non-stream API and doesnt line up with other streams we do in CLI that do file or io handling.
That aside, as I discussed with @kt3k, we are a bit concerned that since this PR implements tar from scratch, this will require a lot more exhaustive & thorough tests
With regards to this comment, I'd have assumed there was an intention to remove all Deno references, making it compatible with other runtimes and therefore not offer file or io handling. Although that was 2y ago now so things may have changed. |
that has nothing to do with what I have said. I am talking about streams behaviour being consistent across similar purpouses, in this case being byte streams instead of normal streams |
@BlackAsLight, we're still determining whether to pursue this PR or #1985. We'll come back to you once a decision has been made. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4548 +/- ##
==========================================
- Coverage 91.44% 91.13% -0.31%
==========================================
Files 480 482 +2
Lines 37324 37741 +417
Branches 5320 5391 +71
==========================================
+ Hits 34132 34397 +265
- Misses 3136 3287 +151
- Partials 56 57 +1 ☔ View full report in Codecov by Sentry. |
Looking into what bytes streams are and how they work, I do think I could implement them into both streams. At least for when they're serving content out. I don't think I'd run into the same problem you're facing in implementing byte streams with it closing early or something. Although I am yet to actually test it. On a side note the byte streams examples on MDN make use of both the byteRequest and enqueue for pushing, but my implementation only works on a pulling method so it would only ever be doing one or the other based off what the user provided in the options. |
I don't really understand why, but for some reason when I added byte support for TarStream it consumed the Adding byte support for the UnTarStream does seem doable from my understanding of it, but will be more complex. Even though you haven't decided if you want to pursue my implementation or not. I'm still going to try and make the improvements in the mean time as I have the time. |
After some playing around with this code, it seems to be occasionally and inconsistently creating bad tarballs with the same information provided. The UnTarStream class can correctly decode the bad tarball, but the tar implementation on my computer seems to sometimes fail to expand it or extract rubbish from it. |
Seems it was a problem with Github Action's Ubuntu environment. |
Added parsePathname function abstracting the logic out of TarStream allowing the developer to validate pathnames before providing them to TarStream hoping it doesn't throw an error and require the archive creation to start all over again.
When the appending file was exactly divisible by 512 bytes, an extra 512 bytes was being appending instead of zero to fill in the gap, causing the next header to be read at the wrong place.
Instead of using enqueue, the leftover bytes are saved for later for the next buffer provided.
Looking at the original issue, #1658, they suggested the names |
I think |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4548 +/- ##
==========================================
- Coverage 92.15% 91.86% -0.29%
==========================================
Files 488 490 +2
Lines 38873 39339 +466
Branches 5405 5486 +81
==========================================
+ Hits 35823 36140 +317
- Misses 2994 3133 +139
- Partials 56 66 +10 ☔ View full report in Codecov by Sentry. |
- To make sure, if TarStreamOptions are being provided, that they are in the correct format so as to not create bad tarballs.
Following on from this pull request, #4538, I didn't know how to revert the changes on that branch so just made a new one since the idea is that I'm creating a new API and not replacing one.
Usage Examples