Slowdown due to high SYS time during vips image compression when using LD_PRELOAD to force zlib-ng usage #947
Comments
zlib-ng is optimized for larger files. Anything that takes less than 1 second can't be reliably timed with |
Well, I get reliable, averaged values for batch sets of filters and compression just fine when optimizing filters for an average time of around 10 ms, but OK: time ./do.sh You can say it is inefficient on small files, but the time it takes is real. |
These are the benchmarks which we have done. We don't use LD_PRELOAD when benchmarking. There is also a deflatebench repository which we use specifically for benchmarking. Have you tried building minigzip in both zlib and zlib-ng and comparing the results? |
@nmoinvaz - no I have not tried to involve minigzip. |
I would be interested in seeing how the output of cmake/configure looks when you compiled this, just to rule out an error in configuration detection. It is also hard for us to reproduce this since you use a 3rd party program, so if you could try to reproduce it with minigzip that would be great (This also helps narrow down the problem) |
@Dead2 it's nice that you want to look into it - That I can help with. I am also attaching my test-dataset: In case you find that zlib-ng can increase the performance of my Rpi4, please tell me how. |
@AndKe Unfortunately you only posted the output of
I did some testing myself; this is on an RPi 3, not overclocked, running Raspbian 64-bit.
zlib git-develop aarch64 1.2.11 cacf7f1
zlib-ng git-develop aarch64 81f1c8a
As you can see, zlib-ng is a lot faster than zlib in these benchmarks, so I am not sure what is happening on your end. Something I notice though is that when you run the zlib-ng tests, most of the extra time is spent in
Also, zlib-ng in compat mode is not really ABI compatible, only API compatible. That means the application should be recompiled against zlib-ng, so LD_PRELOAD might not work correctly. Whether this is a symptom of that or not is unknown. |
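Since the open question here is whether the LD_PRELOAD'ed library is what the process actually ends up using, a minimal sketch for checking that on Linux follows. It parses the text of `/proc/<pid>/maps` and lists mapped files whose name looks like a zlib library. The helper name `loaded_zlib_paths` and the name prefixes it matches (`libz.` and `libz-ng.`) are assumptions for illustration; the actual file name depends on how zlib-ng was built and installed.

```python
def loaded_zlib_paths(maps_text):
    """Return distinct mapped files whose name looks like libz or libz-ng."""
    paths = []
    for line in maps_text.splitlines():
        # /proc/<pid>/maps columns: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        name = path.rsplit("/", 1)[-1]
        # Assumed naming: "libz.so*" for zlib or compat builds, "libz-ng.so*" for native builds.
        if name.startswith(("libz.", "libz-ng.")) and path not in paths:
            paths.append(path)
    return paths


if __name__ == "__main__":
    try:
        with open("/proc/self/maps") as maps:  # Linux only; inspects this Python process
            print("mapped:", loaded_zlib_paths(maps.read()))
    except OSError:
        print("no /proc/self/maps on this system")
```

To inspect a running vips process rather than the Python interpreter, the same function can be fed the contents of `/proc/<pid>/maps` for that process. If both a system libz and the preloaded one show up, that would at least confirm two copies are mapped; it does not by itself explain the SYS time.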
@Dead2 my reasoning was that the sys time increased per picture processed, while I assumed that the preload was done only once before running do.sh.
|
@AndKe No, not at all. Do NOT run
It is impossible to run
|
@Dead2 Good morning. Yes, I did run ./configure before (but did not post the results); here I run it again:
make test reveals some problems:
The vulnerabilities are not an issue for this application, because the data I process is 100% self-generated and not user-provided. How should I proceed? |
It is hard to help when we don't know anything about the application you use and you don't perform tests with known tools (minigzip and/or deflatebench).
Unlikely to make any difference at all, but you could test compiling with cmake instead of configure, there are slight differences in detection and configuration between the two. |
@Dead2 The application I use is a python script that uses vips/pyvips to convert the svg it generates (previously attached) to png |
These two lines mean zlib-ng can't do run-time detection of optimizations to speed up the code. There is an issue with the libc in use. |
@mtl1979 can you please suggest what to do with the "libc"?
I've added the line with "local" because that's where vips compiles/installs to. |
@AndKe Obviously I want to see what is defined in |
@mtl1979 This is the output of the only sys/auxv.h on the system:
|
@AndKe In my Linux,
|
@mtl1979 ok, that file looks the same,
|
@mtl1979 |
@AndKe The configure errors are gone, but that doesn't mean run-time detection of NEON works, or if NEON optimizations are faster than non-optimized version of the equivalent functions. |
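As a rough cross-check of what the kernel itself reports, independent of zlib-ng's own detection, the feature list in `/proc/cpuinfo` can be inspected. This is only a sketch: the helper names `cpu_features` and `has_simd` are made up for illustration, and a positive result says nothing about whether zlib-ng's run-time detection works or whether the NEON code paths are faster.

```python
def cpu_features(cpuinfo_text):
    """Collect flags from 'Features' (ARM) or 'flags' (x86) lines of /proc/cpuinfo."""
    feats = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("features", "flags"):
            feats.update(value.split())
    return feats


def has_simd(feats):
    # 32-bit ARM kernels list "neon"; 64-bit (aarch64) kernels list "asimd" instead.
    return "neon" in feats or "asimd" in feats


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:  # Linux only
            feats = cpu_features(f.read())
        print("NEON/ASIMD reported by kernel:", has_simd(feats))
    except OSError:
        print("no /proc/cpuinfo on this system")
```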
I don't think this problem is really related to NEON detection, since most of the extra CPU time is in SYS. So this is probably related to LD_PRELOAD. I know Python has performance problems related to LD_PRELOAD if it was not compiled with
You could perhaps try installing PyPy and using that to run vips (if possible) instead of Python; that might play nicer with LD_PRELOAD. PyPy is a JIT implementation of Python. |
@Dead2 please note that while my main app is Python and uses pyvips, all these performance tests are done with just vips, with no Python involved. I too was thinking about LD_PRELOAD. What does not make sense to me is that the cost of LD_PRELOAD should be paid once, not for every image in the shell script. Finally, my application will still need pyvips (I assume some performance loss if I stop working with buffers in Python and need to involve the file system and vips). |
@Dead2 I only noticed that both the ACLE and NEON tests in configure failed without the patch. The whole issue might be a combination of several factors. As we fix or rule them out one by one, zlib-ng should eventually be faster than zlib. |
Did anyone try running my SVG file conversion script on an ARM board and compare zlib vs zlib-ng? |
Filetype/contents would not explain the increase in SYS time; that would have shown up as USER time. |
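To keep USER and SYS time apart when comparing runs, and to average over several repetitions because individual runs are short, something like the following sketch could be used. It relies on `resource.getrusage(RUSAGE_CHILDREN)`, which accumulates CPU time of child processes that have been waited for. The function name `time_command` and its parameters are hypothetical, and the demo command is just a no-op Python invocation, not the vips workload from this thread.

```python
import resource
import subprocess
import sys


def time_command(argv, repeats=5, env=None):
    """Run argv `repeats` times; return (user_seconds, sys_seconds) summed over the runs."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    for _ in range(repeats):
        subprocess.run(
            argv,
            check=True,
            env=env,  # None means: inherit the current environment
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_utime is CPU time spent in user space, ru_stime in the kernel (SYS).
    return after.ru_utime - before.ru_utime, after.ru_stime - before.ru_stime


if __name__ == "__main__":
    user, system = time_command([sys.executable, "-c", "pass"], repeats=3)
    print(f"user {user:.3f}s  sys {system:.3f}s")
```

For the case discussed here, `argv` would be the script being timed (for example `["./do.sh"]`), run once with the default environment and once with `env` set to a copy of `os.environ` that adds `LD_PRELOAD` pointing at the zlib-ng library; the library path depends on the local install.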
@Dead2 where does minigzip come into the picture in this case? - is that an alternative that vips can use or can minigzip replace pyvips? |
@AndKe minigzip is an application we use extensively to test and benchmark zlib-ng. We are not able to install, learn and test every application that uses zlib/zlib-ng in order to test/debug them, and most of the time it is irrelevant because a bug in zlib-ng will usually also be triggered by other programs, like minigzip. |
@AndKe Can you attach your configure.log? |
@nmoinvaz please see here: |
This is the difference between standard zlib on an RPi4 running at 2 GHz, and current zlib-ng master:
This is the test image:
test12.zip