getting the text is very slow #254

Open
alekssamos opened this issue Nov 18, 2023 · 0 comments


Hello.
First of all, I want to thank the author of this wonderful library. It is very good and I like it a lot.
It works much faster than beautifulsoup + lxml, and the difference is palpable!
In general, PyQuery has been faster than every other library I have tried.

But there is one small problem.
$("selector").text()
This call takes 10-25 ms on average.
I have a huge volume of pages to process, and the code spends most of its time in the .text() function.
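
For anyone who wants to reproduce the measurement, here is a minimal sketch of how the per-call time of .text() could be sampled. The helper names (time_calls, summarize, measure_text) are made up for illustration and are not part of PyQuery; the HTML and selector are whatever you pass in.

```python
import statistics
import time


def time_calls(func, repeat=50):
    """Call func() `repeat` times; return each call's duration in milliseconds."""
    durations = []
    for _ in range(repeat):
        start = time.perf_counter()
        func()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def summarize(durations):
    """Reduce a list of durations to min / median / max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def measure_text(html, selector, repeat=50):
    """Hypothetical helper: time doc(selector).text() on one parsed page."""
    # Imported lazily so the timing helpers above work without pyquery installed.
    from pyquery import PyQuery

    doc = PyQuery(html)  # parse once; only selection + .text() is timed
    return summarize(time_calls(lambda: doc(selector).text(), repeat))
```

Calling measure_text(page_html, "selector") on a representative page returns the timings in milliseconds, which makes it easier to compare pages or Python versions.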

Is there any way to speed this up?
Using a large number of parallel threads did not solve the problem. Multithreaded mode is faster than single-threaded mode, but not by much.
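
One possible explanation, which is an assumption and not something I have confirmed, is that much of the work in .text() runs as Python code and is therefore serialized by the GIL when using threads. A rough sketch of how the same job could be tried with processes instead is below. extract_text and extract_all are hypothetical names, and the executor class is a parameter so that threads and processes can be compared with the same code.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def extract_text(job):
    """Worker: parse one page and return the text of the selected nodes."""
    html, selector = job
    # Imported inside the worker so each process loads pyquery on its own.
    from pyquery import PyQuery

    return PyQuery(html)(selector).text()


def extract_all(pages, selector, worker=extract_text,
                executor_cls=ProcessPoolExecutor, max_workers=None, chunksize=16):
    """Map `worker` over all pages; results come back in input order."""
    jobs = [(html, selector) for html in pages]
    with executor_cls(max_workers=max_workers) as pool:
        # chunksize batches jobs per process; thread pools ignore it.
        return list(pool.map(worker, jobs, chunksize=chunksize))
```

With the default ProcessPoolExecutor, call extract_all from under an `if __name__ == "__main__":` guard. Passing executor_cls=ThreadPoolExecutor gives the threaded variant for comparison. Whether processes actually help depends on page size, since each page has to be pickled and sent to a worker.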

I also noticed that Python 3.12 is about 1% faster than 3.11.
