Weasyprint consumes a lot of memory for long documents #671
Comments
Also, is there a way to make WeasyPrint more verbose programmatically? |
WeasyPrint uses Python's standard logging module; you'll find more information in the Logging section of the documentation.
Everything can be customized with the logging module. WeasyPrint is not really verbose, but it should be enough for your needs.
Please check the logs and tell me if it helps! |
I am just looking for emulating the
|
The |
Would this be enough? (as in code above)
Because this doesn't work :/ |
No, it wouldn't. You have to use these lines (from the documentation):
import logging
logger = logging.getLogger('weasyprint')
logger.addHandler(logging.FileHandler('/path/to/weasyprint.log'))
And if you want to get debug messages:
logger.setLevel(logging.DEBUG)
|
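For the Docker and gunicorn setup described in this thread, writing to standard error may be more convenient than a log file, since Docker normally collects the container's standard streams. The following is a minimal sketch using only the standard `logging` module. The helper name `configure_weasyprint_logging` is hypothetical; the only thing taken from WeasyPrint is the logger name `'weasyprint'` quoted above.

```python
import logging


def configure_weasyprint_logging(level=logging.DEBUG, stream=None):
    """Attach a stream handler to the 'weasyprint' logger (hypothetical helper).

    When `stream` is None, logging.StreamHandler writes to sys.stderr.
    """
    logger = logging.getLogger('weasyprint')
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter('%(asctime)s %(name)s %(levelname)s %(message)s'))
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger
```

Calling `configure_weasyprint_logging()` once at application start-up, before any document is rendered, should be enough. Calling it repeatedly would add duplicate handlers.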
Perfect, this is exactly what I was looking for. Sorry, I am not a python person, I am Java guy, so these basics really help. |
Okay, so I might need a bit more support, because in the logs I am getting this:
I am trying to compile quite a big PDF (200 pages) and I guess there might be some issues with resources. But the gunicorn worker dies without saying anything, and one can see it being restarted immediately afterwards. It is hard to debug the problem with no errors :/ |
Okay, I think I finally see the issue. To generate 1500 pages, WeasyPrint consumes nearly 3 GB of memory. (testing file attached) Couldn't it be more efficient and maybe write to disk instead of to memory? Maybe with a config option? |
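A worker that disappears without a traceback may have been killed by the operating system for using too much memory; that is an assumption, not something the logs in this thread confirm. One way to check it is to record the peak resident memory of the process after rendering. The sketch below uses the standard `resource` module (Unix only); `peak_rss_mib` and `run_and_report` are hypothetical helper names.

```python
import resource
import sys


def peak_rss_mib():
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == 'darwin' else 1024
    return peak / divisor


def run_and_report(render):
    """Call render() and return (result, peak RSS in MiB afterwards)."""
    result = render()
    return result, peak_rss_mib()
```

With WeasyPrint, `render` could be something like `lambda: HTML(filename='big.html').write_pdf('big.pdf')`, where `big.html` and `big.pdf` are placeholder paths. Comparing the reported peak with the container's memory limit shows whether the limit is being reached.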
WeasyPrint is known to be slow and to consume a lot of memory with big documents. Many things have been done to improve performance (see #70 for example). There's no config option to improve this, we need time to improve the current code, increase speed and decrease memory consumption. |
Speed is much better but memory use is still bad…
|
@liZe I know it's not the most satisfying solution, but couldn't you split your HTML in several parts, generate distinct PDFs, and then merge them back together? With templates, you could do this in a clean way, and even parallelize the generation of the distinct bits. I'm assuming that WeasyPrint keeps all the document in memory until the end of the operation, which would explain the increasing consumption of RAM. |
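The splitting idea can be sketched as follows, assuming the document is already available as a list of independent HTML fragments (for example, one per chapter). The names `split_into_chunks` and `render_chunks` are hypothetical, and the rendering step is passed in as a callable so the sketch does not depend on any particular library.

```python
def split_into_chunks(sections, chunk_size):
    """Group a list of HTML fragments into consecutive batches."""
    if chunk_size < 1:
        raise ValueError('chunk_size must be at least 1')
    return [sections[i:i + chunk_size]
            for i in range(0, len(sections), chunk_size)]


def render_chunks(sections, chunk_size, render):
    """Render each batch separately.

    `render` is any callable that maps an HTML string to PDF bytes.
    """
    pdfs = []
    for chunk in split_into_chunks(sections, chunk_size):
        html = '<html><body>' + ''.join(chunk) + '</body></html>'
        pdfs.append(render(html))
    return pdfs
```

With WeasyPrint, `render` could be `lambda html: HTML(string=html).write_pdf()`, which returns the PDF as bytes when no target is given. Merging the resulting parts needs a separate PDF library and is not shown here. Page counters and cross-references restart in each part, so this only suits documents whose parts are truly independent.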
I've created a small website with memory and speed graphs, it gives a good idea about improvements already made, but we have a lot of work left.
It sounds appealing, but some details prevent us from doing this in an easy way. To render page 2, you need to know where page 1 ends, and thus need to render the whole first page first. Another example: you can include the total number of pages in your first page (like "page 1 / 10"), but to get the total number of pages, you first need to render the whole document. The only way to know where pages end is to go through the whole layout step 😒.
It could be fun to write memory to the disk and play with |
Closing, but the discussion can continue in #578. |
This is probably something in combination with gunicorn, most probably not having to do anything with weasyprint, but I am stuck and have no idea which way to go.
I run this script in a Docker image, which runs mostly fine, but I suspect all the WeasyPrint logs are thrown away, as I never see them logged. Especially in case of an error, I just get an empty result with no error.
What kind of logging tool does WeasyPrint use? Is it somehow compatible with the way gunicorn logs stuff?