Performance optimization with Rust or C extensions #746

Open
juhoinkinen opened this issue Nov 29, 2023 · 1 comment


We could investigate the potential and feasibility of using Rust or C for better performance in the computationally intensive parts of the Annif codebase.

Rust

In some cases, using Rust instead of Python can give a speedup of over 10x (or even near 100x) with modest coding effort; see this blog post. Tools exist for semiautomatic Python-to-Rust transpilation (pyrs; see a case study using pyrs).

C via mypyc

Another option could be to use mypyc to compile typed Python code to C. According to the blog post Compiling Black with mypyc:

Existing code with type annotations is often 1.5x to 5x faster when compiled. Code tuned for mypyc can be 5x to 10x faster.
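To make the mypyc option concrete, here is a minimal sketch of the kind of fully annotated code it targets. The function `dot` is a hypothetical example, not taken from Annif. It runs unchanged under the regular interpreter, and a module containing it could be passed to the `mypyc` command for compilation; no speedup has been measured here.

```python
def dot(a: list[float], b: list[float]) -> float:
    """Dot product written as a plain typed loop.

    Hypothetical example: precise annotations on arguments, locals and
    the return value are what a compiler like mypyc relies on.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    total: float = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total
```

Keeping such code valid Python means the compiled build can stay optional, with the interpreted version as the default.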


Whatever the choice, the slowest parts of the current Python code would first need to be identified and benchmarked in the train and suggest operations. The blog post about porting to Rust uses the py-spy profiler, which should have only a little overhead (compared to cProfile), can also profile the Rust parts, and has a top-like live view.
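As a rough sketch of the "identify the slowest parts" step, the snippet below profiles a call and lists the functions with the highest cumulative time. It uses the standard-library cProfile as a stand-in for py-spy (which is run as an external tool), and `tokenize` and `suggest` are hypothetical placeholders, not Annif functions.

```python
import cProfile
import io
import pstats


def tokenize(text: str) -> list[str]:
    """Hypothetical stand-in for a computationally intensive step."""
    return [tok.lower() for tok in text.split() if tok.isalpha()]


def suggest(docs: list[str]) -> int:
    """Hypothetical stand-in for a suggest operation."""
    return sum(len(tokenize(doc)) for doc in docs)


def profile_top(func, *args, limit: int = 5) -> str:
    """Run func under cProfile and return a report of the top entries
    sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


if __name__ == "__main__":
    docs = ["some sample text about libraries and subjects"] * 10_000
    print(profile_top(suggest, docs))
```

Functions that dominate such a report in both train and suggest runs would be the natural first candidates for porting.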

If some performance optimization is implemented using a Rust or C extension, it could be an optional dependency that replaces some Python code, and could be installed (on some platforms) e.g. with pip install annif[perf].

If these optimizations are going to be implemented, it would be nice to find someone already familiar with Rust (or mypyc) who could help.


nwagner84 commented Feb 22, 2024

Hi!

This is a very good idea! Let me know if you need help in setting up a first experiment/project; this could be a good opportunity for me to get familiar with the Annif codebase. I already have some experience with PyO3 and maturin.

Update

#629 sounds like a good candidate to test a Rust integration. strsim-rs does a great job in fuzzy string matching.
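For such an experiment, a pure-Python baseline would give something to benchmark a Rust version against. The sketch below assumes the fuzzy matching is based on Levenshtein edit distance, one of the metrics strsim-rs provides; what #629 actually needs may differ, and the function names are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance using two rows of the dynamic-programming table."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

The nested loop does work proportional to the product of the string lengths, which is the kind of tight inner loop where a compiled extension tends to help most.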
