[BUG] raft-ann-bench.run stuck after sweep in search mode #2257

Open
mikepcw opened this issue Apr 9, 2024 · 0 comments
Labels
bug Something isn't working

mikepcw commented Apr 9, 2024

Describe the bug
The benchmark appears to hang after the sweep, with very low CPU utilization and zero GPU utilization. The output from the bench script is shown in the attached image.
I killed the process after more than 24 hours. Running the .data_export stage afterwards fails, presumably because the results are incomplete.

Steps/Code to reproduce bug
python -m raft-ann-bench.run --dataset wiki_all_1M --dataset-path /ai-dataset/wiki-all/ --algorithms raft_cagra --batch-size 10000 -k 10
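
To make a hang like this easier to detect than by waiting 24 hours, the same command can be launched under a wall-clock limit. The following is only a minimal sketch: `run_with_timeout` is a hypothetical helper, not part of raft-ann-bench, and the two-hour limit in the usage comment is an arbitrary assumption, since the expected runtime is exactly what this issue asks about. It uses `sys.executable` in place of `python` so that the same interpreter runs the benchmark.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (exit_code, timed_out).

    Hypothetical helper for illustration. exit_code is None if the
    time limit was hit before the command finished.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
        return completed.returncode, False
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before re-raising.
        # Processes started by that child may still need manual cleanup.
        return None, True


# The command from this report, with unchanged arguments.
BENCH_CMD = [
    sys.executable, "-m", "raft-ann-bench.run",
    "--dataset", "wiki_all_1M",
    "--dataset-path", "/ai-dataset/wiki-all/",
    "--algorithms", "raft_cagra",
    "--batch-size", "10000",
    "-k", "10",
]

# Example use (the 2-hour limit is an assumption, not a known bound):
#   code, timed_out = run_with_timeout(BENCH_CMD, timeout_s=2 * 60 * 60)
#   print("timed out" if timed_out else f"exit code {code}")
```

A timeout only reports that the run exceeded the limit; it does not explain why the process stalled.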

Expected behavior
The benchmark should finish and return, so that the .data_export step can then complete.

Environment details

  • Environment location: bare metal H100 SXM4, Debian 6.1.76-1 (2024-02-01) x86_64
  • Method of RAFT install: conda, Python 3.10

Additional context
How long is the raft-ann-bench.run script expected to run with the base search set on wiki_all_1M?
[Image: Screenshot from 2024-04-09 17-50-45]

@mikepcw mikepcw added the bug Something isn't working label Apr 9, 2024