Avoid repetitive open/write/close on mlar extract #143

Merged: 5 commits into master from extract-keep-fd on Sep 29, 2023

@commial (Contributor) commented Feb 1, 2023

When no other arguments are provided, mlar extract uses linear_extract with a FileWriter to extract files.
To avoid consuming too many file descriptors, each file block is written with an:

  1. open
  2. write
  3. close

pattern.
While this is fine for some use cases, it can be syscall-intensive for others. For instance, if an archive is made of files split into many chunks, most of the extraction time can be lost in user<->kernel transitions.
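
For illustration, the per-block pattern described above can be sketched as follows. This is a minimal sketch, not the actual mlar code: `write_block_reopen` is a hypothetical helper, and it assumes blocks are appended to the output file in arrival order.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper (not the mlar implementation): append one block to
/// `path`, opening and closing the file around every single write.
fn write_block_reopen(path: &Path, block: &[u8]) -> io::Result<()> {
    // 1. open: create the file on the first block, append on later ones
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    // 2. write
    file.write_all(block)?;
    // 3. close: `file` is dropped at the end of the scope, which closes
    //    the underlying descriptor
    Ok(())
}
```

With this shape, a file split into N blocks costs N opens and N closes on top of the N writes, which is the overhead discussed above.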

This PR tries to improve that behavior by adding an LRU cache (with an arbitrary size of 1000) to the file descriptor pool.
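
As a rough sketch of the idea (not the code in this PR), a bounded pool of open files with least-recently-used eviction could look like the following. `FdCache`, `write_block`, `is_open` and `open_count` are hypothetical names, and the sketch assumes append-mode writes so that a file whose handle was evicted can be reopened and continued.

```rust
use std::collections::HashMap;
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical bounded pool of open files with LRU eviction.
struct FdCache {
    capacity: usize,
    tick: u64, // logical clock, bumped on every write
    open: HashMap<PathBuf, (File, u64)>, // path -> (handle, last-use tick)
}

impl FdCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be at least 1");
        FdCache { capacity, tick: 0, open: HashMap::new() }
    }

    /// Append `block` to `path`, reusing a cached handle when there is one.
    fn write_block(&mut self, path: &Path, block: &[u8]) -> io::Result<()> {
        self.tick += 1;
        if !self.open.contains_key(path) {
            // Make room first so at most `capacity` files are open at once.
            if self.open.len() >= self.capacity {
                self.evict_lru();
            }
            let file = OpenOptions::new().create(true).append(true).open(path)?;
            self.open.insert(path.to_path_buf(), (file, self.tick));
        }
        let tick = self.tick;
        let (file, last_used) = self.open.get_mut(path).expect("entry was just ensured");
        *last_used = tick;
        file.write_all(block)
    }

    /// Close the handle that was used least recently (linear scan for brevity).
    fn evict_lru(&mut self) {
        let oldest = self
            .open
            .iter()
            .min_by_key(|entry| (entry.1).1)
            .map(|(path, _)| path.clone());
        if let Some(path) = oldest {
            // Dropping the `File` closes its descriptor.
            self.open.remove(&path);
        }
    }

    fn is_open(&self, path: &Path) -> bool {
        self.open.contains_key(path)
    }

    fn open_count(&self) -> usize {
        self.open.len()
    }
}
```

With `FdCache::new(1000)`, consecutive blocks of the same file reuse one descriptor, and a file is only reopened if more than 1000 other files were written in between. The linear scan in `evict_lru` keeps the sketch short; a real LRU structure would evict in constant time.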

@commial added the enhancement and mlar labels on Feb 1, 2023
@github-actions

Benchmark for 0acaa8b

Test Base PR %
chunk_size_decompress_mutilfiles_random/Layers(0x0)/1024 666.8±53.42ns 653.7±44.28ns -1.96%
chunk_size_decompress_mutilfiles_random/Layers(0x0)/1048576 85.3±7.12µs 86.0±4.12µs +0.82%
chunk_size_decompress_mutilfiles_random/Layers(0x0)/16777216 1432.3±24.89µs 1220.7±26.83µs -14.77%
chunk_size_decompress_mutilfiles_random/Layers(0x0)/65536 5.7±0.79µs 5.5±0.84µs -3.51%
chunk_size_decompress_mutilfiles_random/Layers(COMPRESS)/1024 821.4±453.59µs 821.7±455.04µs +0.04%
chunk_size_decompress_mutilfiles_random/Layers(COMPRESS)/1048576 35.2±0.08ms 34.8±0.15ms -1.14%
chunk_size_decompress_mutilfiles_random/Layers(COMPRESS)/16777216 187.8±0.43ms 187.0±0.43ms -0.43%
chunk_size_decompress_mutilfiles_random/Layers(COMPRESS)/65536 7.6±4.29ms 7.7±4.34ms +1.32%
chunk_size_decompress_mutilfiles_random/Layers(ENCRYPT | COMPRESS)/1024 1088.9±602.93µs 1086.8±603.70µs -0.19%
chunk_size_decompress_mutilfiles_random/Layers(ENCRYPT | COMPRESS)/1048576 37.5±0.13ms 37.2±0.20ms -0.80%
chunk_size_decompress_mutilfiles_random/Layers(ENCRYPT | COMPRESS)/16777216 301.3±0.45ms 303.4±0.53ms +0.70%
chunk_size_decompress_mutilfiles_random/Layers(ENCRYPT | COMPRESS)/65536 7.9±3.70ms 7.8±3.68ms -1.27%
chunk_size_decompress_mutilfiles_random/Layers(ENCRYPT)/1024 950.9±240.95µs 951.4±241.26µs +0.05%
chunk_size_decompress_mutilfiles_random/Layers(ENCRYPT)/1048576 9.9±0.25ms 9.9±0.26ms 0.00%
chunk_size_decompress_mutilfiles_random/Layers(ENCRYPT)/16777216 144.0±0.04ms 144.0±0.06ms 0.00%
chunk_size_decompress_mutilfiles_random/Layers(ENCRYPT)/65536 1642.4±53.17µs 1644.5±54.25µs +0.13%
failsafe_multiple_layers_repair/Layers(0x0)/4194304 99.0±0.39ms 98.8±0.11ms -0.20%
failsafe_multiple_layers_repair/Layers(COMPRESS)/4194304 149.2±0.38ms 147.4±0.32ms -1.21%
failsafe_multiple_layers_repair/Layers(ENCRYPT | COMPRESS)/4194304 158.5±0.32ms 156.8±0.33ms -1.07%
failsafe_multiple_layers_repair/Layers(ENCRYPT)/4194304 111.7±0.50ms 111.3±0.46ms -0.36%
reader_multiple_layers_multiple_block_size/Layers(0x0)/1024 125.4±9.47ns 116.0±8.75ns -7.50%
reader_multiple_layers_multiple_block_size/Layers(0x0)/1048576 84.7±7.20µs 78.9±10.93µs -6.85%
reader_multiple_layers_multiple_block_size/Layers(0x0)/16777216 1418.5±33.45µs 1242.0±43.83µs -12.44%
reader_multiple_layers_multiple_block_size/Layers(0x0)/65536 5.1±0.73µs 4.4±0.63µs -13.73%
reader_multiple_layers_multiple_block_size/Layers(COMPRESS)/1024 2.1±2.23µs 2.2±2.30µs +4.76%
reader_multiple_layers_multiple_block_size/Layers(COMPRESS)/1048576 2.4±0.02ms 2.4±0.02ms 0.00%
reader_multiple_layers_multiple_block_size/Layers(COMPRESS)/16777216 138.4±0.39ms 138.5±0.48ms +0.07%
reader_multiple_layers_multiple_block_size/Layers(COMPRESS)/65536 146.0±152.95µs 147.1±153.82µs +0.75%
reader_multiple_layers_multiple_block_size/Layers(ENCRYPT | COMPRESS)/1024 17.1±2.18µs 17.2±2.13µs +0.58%
reader_multiple_layers_multiple_block_size/Layers(ENCRYPT | COMPRESS)/1048576 17.8±0.05ms 17.6±0.04ms -1.12%
reader_multiple_layers_multiple_block_size/Layers(ENCRYPT | COMPRESS)/16777216 298.7±2.12ms 293.4±0.58ms -1.77%
reader_multiple_layers_multiple_block_size/Layers(ENCRYPT | COMPRESS)/65536 1050.7±114.25µs 1068.5±114.21µs +1.69%
reader_multiple_layers_multiple_block_size/Layers(ENCRYPT)/1024 8.2±0.42µs 8.2±0.41µs 0.00%
reader_multiple_layers_multiple_block_size/Layers(ENCRYPT)/1048576 8.5±0.25ms 8.5±0.25ms 0.00%
reader_multiple_layers_multiple_block_size/Layers(ENCRYPT)/16777216 138.4±0.09ms 138.0±0.09ms -0.29%
reader_multiple_layers_multiple_block_size/Layers(ENCRYPT)/65536 525.8±24.83µs 523.3±24.74µs -0.48%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(0x0)/1024 599.6±36.67ns 637.0±64.80ns +6.24%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(0x0)/1048576 85.5±9.09µs 81.2±6.78µs -5.03%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(0x0)/16777216 1432.4±26.43µs 1203.6±28.26µs -15.97%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(0x0)/65536 5.8±0.66µs 5.4±0.61µs -6.90%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(COMPRESS)/1024 13.6±0.12µs 13.9±0.22µs +2.21%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(COMPRESS)/1048576 11.7±0.03ms 11.8±0.05ms +0.85%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(COMPRESS)/16777216 185.7±0.39ms 186.1±0.68ms +0.22%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(COMPRESS)/65536 727.9±4.74µs 752.2±7.63µs +3.34%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(ENCRYPT | COMPRESS)/1024 21.0±0.30µs 21.1±0.19µs +0.48%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(ENCRYPT | COMPRESS)/1048576 18.4±0.03ms 18.6±0.05ms +1.09%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(ENCRYPT | COMPRESS)/16777216 292.6±0.41ms 295.3±0.75ms +0.92%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(ENCRYPT | COMPRESS)/65536 1145.3±9.99µs 1151.3±12.82µs +0.52%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(ENCRYPT)/1024 9.8±0.05µs 10.0±0.11µs +2.04%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(ENCRYPT)/1048576 8.5±0.01ms 8.5±0.01ms 0.00%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(ENCRYPT)/16777216 135.5±0.09ms 135.5±0.08ms 0.00%
reader_multiple_layers_multiple_block_size_multifiles_linear/Layers(ENCRYPT)/65536 544.0±2.92µs 542.7±2.47µs -0.24%
writer_multiple_layers_multiple_block_size/Layers(0x0)/1024 12.4±0.08µs 12.4±0.07µs 0.00%
writer_multiple_layers_multiple_block_size/Layers(0x0)/1048576 12.5±0.12ms 12.4±0.12ms -0.80%
writer_multiple_layers_multiple_block_size/Layers(0x0)/16777216 199.8±0.81ms 198.9±0.83ms -0.45%
writer_multiple_layers_multiple_block_size/Layers(0x0)/65536 777.6±9.26µs 774.5±9.16µs -0.40%
writer_multiple_layers_multiple_block_size/Layers(COMPRESS)/1024 17.6±0.29µs 17.4±0.37µs -1.14%
writer_multiple_layers_multiple_block_size/Layers(COMPRESS)/1048576 22.0±0.33ms 21.7±0.40ms -1.36%
writer_multiple_layers_multiple_block_size/Layers(COMPRESS)/16777216 542.2±1.31ms 531.0±1.89ms -2.07%
writer_multiple_layers_multiple_block_size/Layers(COMPRESS)/65536 1109.7±33.55µs 1088.0±16.78µs -1.96%
writer_multiple_layers_multiple_block_size/Layers(ENCRYPT | COMPRESS)/1024 18.1±0.58µs 17.4±0.22µs -3.87%
writer_multiple_layers_multiple_block_size/Layers(ENCRYPT | COMPRESS)/1048576 23.7±0.57ms 23.5±0.57ms -0.84%
writer_multiple_layers_multiple_block_size/Layers(ENCRYPT | COMPRESS)/16777216 650.8±1.75ms 639.2±9.64ms -1.78%
writer_multiple_layers_multiple_block_size/Layers(ENCRYPT | COMPRESS)/65536 1103.1±20.70µs 1099.8±18.49µs -0.30%
writer_multiple_layers_multiple_block_size/Layers(ENCRYPT)/1024 21.8±0.07µs 22.1±0.65µs +1.38%
writer_multiple_layers_multiple_block_size/Layers(ENCRYPT)/1048576 21.3±0.11ms 21.2±0.10ms -0.47%
writer_multiple_layers_multiple_block_size/Layers(ENCRYPT)/16777216 340.7±0.73ms 339.7±0.73ms -0.29%
writer_multiple_layers_multiple_block_size/Layers(ENCRYPT)/65536 1328.2±8.73µs 1326.0±9.35µs -0.17%

@commial commial merged commit e469af9 into master Sep 29, 2023
23 checks passed
@commial commial deleted the extract-keep-fd branch September 29, 2023 21:03
Labels
enhancement New feature or request mlar Concerns the mlar utility