Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Refactor raft-ann-bench: single CLI entrypoint, importable Python module, config validation, and testing #2182

Open
wants to merge 3 commits into
base: branch-24.04
Choose a base branch
from

Conversation

fy
Copy link

@fy fy commented Feb 13, 2024

No breaking changes were made to the existing module raft-ann-bench.
The proposal is for migrating functionalities of raft-ann-bench to the refactored module.

Key features of the refactoring:

  • Refactor raft-ann-bench.run and make it easier to write tests against. Currently the main() and run_build_and_search() functions in python/raft-ann-bench/src/raft-ann-bench/run/__main__.py are rather difficult to test, as they include quite a bit of parameter parsing and config data processing logic. The refactor would break up the functions into smaller functions.
  • Use Pydantic to model the 4 types of config files. This makes config data parsing more robust, as Pydantic performs validation, and it helps remove quite a bit of data parsing code. The 4 types of config files modeled: datasets.yaml as DatasetConfig, algos.yaml as AlgoLibConfig, algo benchmark config (e.g. raft_cagra.yaml) as AlgoConfig, and lastly the benchmark executable's config (e.g. the deprecated deep-image-96-inner.json) as BenchConfig.
  • Single CLI entrypoint. Use Typer to organize the CLI commands in a single entrypoint. This would be helpful for incrementally migrating other functionalities, such as raft-ann-bench.plot. Furthermore, documentation for the CLI can be auto-generated with typer raft_ann_bench.cli utils docs --name raft-ann-bench --output raft_ann_benchmarks_cli.md.
  • Decouple the ANN executable's parameters from the Python module. Preserve the naming of the ANN executable's parameters (i.e. cpp/bench/ann/src/common/benchmark.hpp), and design the CLI to merely passthrough parameters to the ANN executable (e.g. raft-ann-bench run build).
  • Unit tests for the new module.
  • Migration tests for ensuring that the new CLI and the old CLI produce the same results.
  • Make the Python module importable. The current module name raft-ann-bench cannot be easily imported in Python; replacing it with raft_ann_bench will work.

Closes #2153
Closes #2151
Closes #2150
Closes #2149

fy-nv and others added 3 commits February 11, 2024 02:24
* Rename package so that it can be imported in Python.
* Use Pydantic for config file validation.
* Passthrough ANN executable parameters instead of renaming a subset of
  them in the CLI.
* Use Typer for improving CLI interface.
* Move default config files to the `data` directory.
* Add unit tests.
* Add migration test to ensure that the new CLI produces the same
  results as the old one.
* Update cupy and libpyraft dependencies.
@fy fy requested review from a team as code owners February 13, 2024 21:02
Copy link

copy-pr-bot bot commented Feb 13, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@fy fy added the enhancement New feature or request label Feb 13, 2024
@cjnolet cjnolet added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change and removed enhancement New feature or request labels Feb 14, 2024
@cjnolet
Copy link
Member

cjnolet commented Feb 14, 2024

/ok to test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change python
Projects
Status: No status
3 participants