Add helper ops to support cache conflict misses #2571

sryap · 2024-05-08T18:36:25Z

Summary:
This diff adds helper operators that enable cache conflict miss
support in SSD TBE. Changes include:

- Extend `get_unique_indices_cuda` to compute and return the inverse
  linear indices (a tensor holding the original positions of the
  linear indices before sorting).
- Extend `lru_cache_find_uncached_cuda` to compute and return the
  inverse cache sets (a tensor holding the original positions of the
  cache sets of the unique indices before sorting).
- Update the SSD backend to handle cache conflict misses instead of
  failing. Rows that experience conflict misses are stored in a
  scratch pad for the TBE kernels to consume, and are evicted to SSD
  once the backward+optimizer step of TBE completes.
- Add `ssd_generate_row_addrs` to generate row addresses for data
  fetched from SSD (the data can reside in either the scratch pad or
  the LXU cache).
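
For readers unfamiliar with the two ideas above, here is a minimal pure-Python sketch. It is not the FBGEMM implementation: the function names `sort_with_inverse` and `generate_row_addrs`, the `-1` sentinel for a conflict miss, the byte-offset address arithmetic, and the rule that scratch pad slot `i` holds row `i` are all assumptions made for illustration.

```python
from typing import List, Tuple


def sort_with_inverse(values: List[int]) -> Tuple[List[int], List[int]]:
    """Sort `values` and also return the original position of each sorted entry.

    The second list plays the role of the "inverse" tensor described above:
    inverse[k] is where the k-th sorted value sat before sorting.
    """
    # Python's sort is stable, so equal values keep their original relative order.
    inverse = sorted(range(len(values)), key=lambda i: values[i])
    sorted_values = [values[i] for i in inverse]
    return sorted_values, inverse


def generate_row_addrs(
    cache_slots: List[int],
    lxu_base: int,
    scratch_base: int,
    row_bytes: int,
) -> List[int]:
    """Return one address per fetched row (hypothetical layout).

    cache_slots[i] >= 0  -> row i was inserted into that LXU cache slot.
    cache_slots[i] == -1 -> row i hit a conflict miss and stays in the
                            scratch pad, assumed here to be at slot i.
    """
    addrs = []
    for i, slot in enumerate(cache_slots):
        if slot == -1:
            addrs.append(scratch_base + i * row_bytes)
        else:
            addrs.append(lxu_base + slot * row_bytes)
    return addrs


linear_indices = [7, 3, 7, 1]
sorted_indices, inverse = sort_with_inverse(linear_indices)
# sorted_indices == [1, 3, 7, 7], inverse == [3, 1, 0, 2]

row_addrs = generate_row_addrs([2, -1, 0], lxu_base=1000, scratch_base=5000, row_bytes=16)
# row_addrs == [1032, 5016, 1000]
```

Keeping the inverse positions lets a later step map results computed on the sorted data back to the caller's original order, and a single address list lets a consumer read each row without caring whether it came from the cache or the scratch pad.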

Differential Revision: D55926421

facebook-github-bot · 2024-05-08T18:36:32Z

This pull request was exported from Phabricator. Differential Revision: D55926421

netlify · 2024-05-08T18:36:42Z

❌ Deploy Preview for pytorch-fbgemm-docs failed.

| Name | Link |
|---|---|
| 🔨 Latest commit | `fb11951` |
| 🔍 Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/663d8368f460b500084fbda5 |

facebook-github-bot · 2024-05-08T18:38:34Z

This pull request was exported from Phabricator. Differential Revision: D55926421

Reviewed By: q10

facebook-github-bot · 2024-05-15T18:40:00Z

This pull request has been merged in 56d21a0.

facebook-github-bot added the cla signed label May 8, 2024

facebook-github-bot added the fb-exported label May 8, 2024

sryap force-pushed the export-D55926421 branch from 0c94a51 to c8c316d on May 8, 2024 18:38

sryap force-pushed the export-D55926421 branch from c8c316d to 5dea02b on May 9, 2024 21:35

sryap force-pushed the export-D55926421 branch from 5dea02b to df5b8fc on May 9, 2024 21:35

sryap force-pushed the export-D55926421 branch from df5b8fc to fb11951 on May 10, 2024 02:16

facebook-github-bot closed this in 56d21a0 May 15, 2024

facebook-github-bot added the Merged label May 15, 2024
