Add FAISS with RAFT enabled Benchmarking to raft-ann-bench #2026
base: branch-24.06
Conversation
/ok to test
Thanks Tarang for the updates, please find a few smaller comments below.
Regarding multi-threaded benchmarks, I am not sure how far we want to go to fix that:
- The current PR works in single-thread mode.
- The multi-threaded version is serialized on the same GPU stream because the `GpuResource` object is shared.
- A proper solution might need a custom `GpuResource` object which shares all the resources that we want to share, but allows a different stream for each resource. This is specific to our benchmark, so we would need to add such an object to the benchmark codebase.
- Having separate `GpuResource` objects for each thread could also work, but we need to prevent excessive TempMem allocations.

Our multi-threaded benchmark mode is not the best fit for FAISS indices, and I am fine with either of these options.
@@ -208,7 +234,7 @@ void FaissGpu<T>::build(const T* dataset, size_t nrow, cudaStream_t stream)
       nlist_,
       index_ivf->cp.min_points_per_centroid);
   }
-  index_ivf->cp.max_points_per_centroid = max_ppc;
+  index_ivf->cp.max_points_per_centroid = 300;
Could you change this back to `max_ppc`?
python/raft-ann-bench/src/raft-ann-bench/run/conf/algos/faiss_cpu_ivf_flat.yaml
useFloat16: [False]
useRaft: [True]
search:
  nprobe: [2048]
Suggested change:
- nprobe: [2048]
+ nprobe: [50]
Thanks for the comments. The `StandardGpuResources` object is indeed not thread safe according to the docs here.
The FAISS multi-threading docs say that a separate GPU resource must be created for each thread. As such, sharing the same resource among separate threads, as we do in raft-ann-bench, would not work. And creating a separate resource for each thread has the memory constraints you mentioned. I have come to the conclusion that at this time we cannot support multi-threaded benchmarks for FAISS indices. I'll revert my comments referencing the FAISS issues and add a note explaining why we cannot support multi-threaded search.
I have confirmed that single-threaded benchmarks work in both latency and throughput mode.
Thanks @tarang-jain and @tfeher for investigating this capability. Given that this limitation is in FAISS and not RAFT/cuVS, and I suspect it might not be unique to FAISS, I wonder if it would be worth adding an accessor on the individual raft-ann-bench index types to specify whether or not the index type can be used in throughput mode (specifically with threads > 1). This way, instead of encountering unexpected behavior, a user would at least get an error stating that the index type is not supported.
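A minimal sketch of what such an accessor could look like. All names here (`AnnBase`, `FaissGpuIvfFlat`, `supports_concurrent_search`, `check_mode`) are hypothetical illustrations, not the actual raft-ann-bench interface:

```cpp
#include <cassert>
#include <string>

// Hypothetical benchmark index interface with a capability accessor.
struct AnnBase {
  virtual ~AnnBase() = default;
  // Default: the index supports multi-threaded throughput mode.
  virtual bool supports_concurrent_search() const { return true; }
};

// FAISS GPU indices share one StandardGpuResources object, so
// concurrent search from multiple threads is unsafe; opt out here.
struct FaissGpuIvfFlat : AnnBase {
  bool supports_concurrent_search() const override { return false; }
};

// The harness can then fail fast instead of silently serializing.
std::string check_mode(const AnnBase& index, int threads) {
  if (threads > 1 && !index.supports_concurrent_search()) {
    return "error: index type does not support throughput mode with threads>1";
  }
  return "ok";
}
```

With this in place, launching the throughput benchmark with `threads > 1` on a FAISS GPU index would produce a clear error up front rather than unexpectedly serialized results.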
rmm::mr::cuda_memory_resource cuda_mr;
// Construct a resource that uses a coalescing best-fit pool allocator
rmm::mr::pool_memory_resource<rmm::mr::cuda_memory_resource> pool_mr{&cuda_mr};
rmm::mr::set_current_device_resource(&pool_mr);
Yes, `setDefaultStream` will need to update the RAFT handle's default stream. But you would need to make sure that `gpu_resource_` is not a shared object anymore, because otherwise every thread is trying to update the stream of the same object.
In practice we use the `GpuResource` object as a wrapper around a stream. But `GpuResource` contains other resources like the temporary memory allocator, which might grab 1.5 GiB per `GpuResource` object. That can be circumvented by setting the TempMem size to 0.
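A sketch of the per-thread setup described above, using the public FAISS GPU API (`StandardGpuResources::setTempMemory` and `setDefaultStream`). This is an illustrative outline of the idea under discussion, not code from the PR; it assumes FAISS GPU headers and a CUDA device are available, so it is not self-contained:

```cpp
#include <faiss/gpu/StandardGpuResources.h>
#include <cuda_runtime.h>

// Hypothetical helper: give each benchmark thread its own
// StandardGpuResources so setDefaultStream() does not race on a
// shared object, and cap temporary memory at 0 so each resource
// does not grab its own ~1.5 GiB TempMem pool.
void configure_thread_resources(faiss::gpu::StandardGpuResources& res,
                                int device,
                                cudaStream_t thread_stream) {
  res.setTempMemory(0);                       // disable per-resource TempMem
  res.setDefaultStream(device, thread_stream); // per-thread stream
}
```

The trade-off is that with TempMem disabled, FAISS falls back to on-demand allocations for scratch space, so a pooled device allocator (such as the RMM pool shown earlier in this thread) matters more for performance.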
…cpu_ivf_flat.yaml Co-authored-by: Tamas Bela Feher <tfeher@nvidia.com>
`faiss_gpu_ivf_flat`, `faiss_gpu_ivf_pq`
`get_faiss.cmake`
Notes: The `StandardGpuResources` object is part of FAISS' index classes. As a result, there is no way of creating a separate RAFT handle for each thread without creating a new instance of the whole index object for each thread. As such, multi-threaded benchmarking for GPU indices will not work. For CPU indices, @divyegala had seen some other issues, though this PR is mainly about RAFT-enabled FAISS GPU indices.
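To illustrate why the resources object cannot be swapped per thread: FAISS GPU indices take their resources provider at construction time. The snippet below is a hedged sketch assuming FAISS GPU headers and a CUDA device (parameter values are arbitrary examples), not code from this PR:

```cpp
#include <faiss/gpu/StandardGpuResources.h>
#include <faiss/gpu/GpuIndexIVFFlat.h>

int main() {
  faiss::gpu::StandardGpuResources res;

  // The resources object (and its stream) is bound to the index at
  // construction; example parameters: 128-dim vectors, 1024 IVF lists.
  faiss::gpu::GpuIndexIVFFlat index(&res, /*dims=*/128, /*nlist=*/1024,
                                    faiss::METRIC_L2);

  // There is no public API to replace `res` with a different (e.g.
  // thread-local) resources object after this point, which is why a
  // per-thread RAFT handle would require constructing a new index.
  return 0;
}
```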