Fix and improve timezone cache concurrency #1105
Conversation
Thanks for this! But can you make some changes for me, please:
Thanks for the quick feedback!
I had to set "check_same_thread" to False to allow different threads to read/write. According to the documentation, you then have to take care of the serialization yourself:
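For illustration, a minimal sketch of that serialization, assuming a module-level sqlite3 connection opened with `check_same_thread=False` and a `threading.Lock` guarding every statement (the names `_conn`, `_lock` and `_execute` are hypothetical, not the actual PR code):

```python
import sqlite3
import threading

# Hypothetical module-level connection: check_same_thread=False lets any
# thread use it, so every statement must be serialized with a lock by us.
_conn = sqlite3.connect("tkr_tz.db", check_same_thread=False)
_lock = threading.Lock()

def _execute(sql, params=()):
    # Serialize all reads/writes through one lock, as the sqlite3 docs
    # require when check_same_thread is disabled.
    with _lock:
        cursor = _conn.execute(sql, params)
        _conn.commit()
        return cursor.fetchall()
```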
Sure thing, will fix that. One other thing: should it be changed to lazy creation of the database, deferring it until it is needed, or is it OK to create it on module import as it is now? Thank you for working on yfinance, I've been using it a lot lately.
Lazy. Probably some users never use price data.
I have updated the PR with lazy init and migration of the old tz cache (see the sketch below).
Also, let me know if you think the cache code should be broken out into a separate module instead of living in utils.
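A rough sketch of what lazy initialization plus a one-time migration of the old CSV cache could look like; the file names, table schema and the `_migrate_old_csv` helper are assumptions for illustration, not the actual PR code:

```python
import csv
import os
import sqlite3

_db = None  # not created at import time; only opened on first use

def _get_db(cache_dir):
    # Lazily open (and if needed create) the SQLite tz cache, then migrate
    # any rows from the legacy CSV cache exactly once.
    global _db
    if _db is None:
        db_path = os.path.join(cache_dir, "tkr_tz.db")
        _db = sqlite3.connect(db_path, check_same_thread=False)
        _db.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)")
        _db.commit()
        _migrate_old_csv(cache_dir)
    return _db

def _migrate_old_csv(cache_dir):
    # If the old CSV cache exists, import its rows and then remove the file.
    csv_path = os.path.join(cache_dir, "tkr_tz.csv")
    if not os.path.exists(csv_path):
        return
    with open(csv_path, newline="") as f:
        rows = [(r[0], r[1]) for r in csv.reader(f) if len(r) >= 2]
    _db.executemany("INSERT OR IGNORE INTO kv (key, value) VALUES (?, ?)", rows)
    _db.commit()
    os.remove(csv_path)
```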
So it looks ready for a merge. Let me know if there's anything else on your mind; otherwise I'll merge it in.
Ok, thanks
Instead of doing homegrown caching with CSV files and having to handle tricky concurrency issues, use the built-in Python module sqlite3 (a simple SQL database) for the ticker/timezone cache.
Hopefully this should lead to fewer issues and better performance.
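For illustration, a minimal sketch of such a SQLite-backed ticker → timezone cache; the class, table and method names here are assumptions, not necessarily what the PR uses:

```python
import sqlite3

class TzCache:
    """Minimal ticker -> timezone cache backed by a SQLite key/value table."""

    def __init__(self, path="tkr_tz.db"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS tz_kv (ticker TEXT PRIMARY KEY, tz TEXT)")
        self._conn.commit()

    def lookup(self, ticker):
        # Return the cached timezone, or None on a cache miss.
        row = self._conn.execute(
            "SELECT tz FROM tz_kv WHERE ticker = ?", (ticker,)).fetchone()
        return row[0] if row else None

    def store(self, ticker, tz):
        # Insert or overwrite the cached timezone for this ticker.
        self._conn.execute(
            "INSERT OR REPLACE INTO tz_kv (ticker, tz) VALUES (?, ?)", (ticker, tz))
        self._conn.commit()
```

Usage would then look something like `cache = TzCache()`, `cache.store("MSFT", "America/New_York")`, `cache.lookup("MSFT")`.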
I did some performance tests: on my machine (*), adding 1000 tickers and reading them back took 1 s. On a second run, when they did not need to be added to the cache, it took 100 ms.
I also did some tests running concurrent Python scripts reading/writing the same DB, and it seems to work well.
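A hedged sketch of what such a concurrency check could look like, reusing the hypothetical `TzCache` class from the sketch above and using threads in one process to stand in for separate scripts; this is only a test harness, not part of the PR:

```python
import threading

def worker(worker_id, n=200):
    # Mimic a separate script: each worker opens its own connection
    # to the same DB file, then writes and reads back its own keys.
    cache = TzCache("tkr_tz.db")  # TzCache from the sketch above (assumed)
    for i in range(n):
        ticker = f"T{worker_id}-{i}"
        cache.store(ticker, "UTC")
        assert cache.lookup(ticker) == "UTC"

threads = [threading.Thread(target=worker, args=(w,)) for w in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("concurrent read/write finished without errors")
```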
First run, with no existing DB (or an empty one):
Second run, when all keys were found: