
General MPS op coverage tracking issue #77764

Open
70 of 99 tasks
albanD opened this issue May 18, 2022 · 1,242 comments
Labels
feature A request for a proper, new feature. module: mps Related to Apple Metal Performance Shaders framework triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@albanD
Collaborator

albanD commented May 18, 2022

This issue is a centralized place to list and track work on adding support for new ops to the MPS backend.

MPS operators coverage matrix - The matrix covers most of the supported operators but is not exhaustive. Before you comment below, please check the matrix to make sure the operator you're requesting has not already been implemented in nightly. More details can be found in the README.

PyTorch has a very large number of operators, and not all of them are implemented for the MPS backend yet, as it is still in the prototype phase. We will prioritize adding new operators based on user feedback. If possible, please also provide a link to the network or use case where the op is used.
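A quick standalone probe can confirm the op is still missing in your build (a sketch; substitute the op you're requesting):

import torch

assert torch.backends.mps.is_available()
x = torch.arange(6, dtype=torch.float32, device="mps")
try:
    torch.cumsum(x, dim=0)  # substitute the op you want to check
    print("op ran on MPS")
except NotImplementedError as exc:
    print("still missing on MPS:", exc)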

If you want to work on adding support for such an op, feel free to comment below to get assigned one. Please avoid picking up an op that is already being worked on or that already has a PR associated with it.

Link to the wiki for details on how to add these ops and example PRs.

Good First Issue:
Below is a list of ops that are good starting points for adding operations to the MPS backend. Please consider picking one up.

  • nn.Conv3d
  • aten::_weight_norm_interface
  • aten::max_unpool2d
  • aten::cummin.out, aten::cummax.out
  • aten::upsample_linear1d.out
  • aten::lerp.Scalar_out
  • aten::renorm

Not categorized:
These are ops that have not yet been picked up and need an MPS implementation.

  • aten::slow_conv3d_forward
  • aten::_ctc_loss
  • aten::avg_pool3d.out
  • aten::linalg_qr.out
  • aten::multilabel_margin_loss_forward
  • aten::unique_dim
  • aten::_sample_dirichlet
  • aten::_fft_r2c
  • aten::upsample_bicubic2d.out
  • aten::linalg_inv_out_helper
  • aten::bucketize
  • aten::_embedding_bag
  • aten::_standard_gamma
  • aten::_upsample_bicubic2d_aa.out
  • aten::_symeig_helper
  • aten::linalg_matrix_exp
  • aten::_nested_tensor_from_mask
  • aten::randperm.generator_out
  • aten::_fused_sdp_choice
  • aten::linalg_cholesky_ex
  • aten::scatter_reduce.two_out
  • aten::kthvalue.values
  • aten::_linalg_solve_ex.result
  • aten::grid_sampler_2d_backward
  • max_pool3d (unfinished attempt: Add mps support for maxpool3d #102148)

WIP:

  • aten::kl_div_backward (not needed)

Implemented Ops:
Ops that have MPS backend implementations.

See MPS operators coverage matrix and the readme for more details.

deprecated list

Ops not supported by MPS:
Ops that will require either the CPU fallback system or a custom Metal kernel.

  • aten::lgamma.out
  • aten::linalg_householder_product
@albanD albanD added feature A request for a proper, new feature. triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module module: mps Related to Apple Metal Performance Shaders framework labels May 18, 2022
@albanD albanD changed the title General MPS op coverage issue General MPS op coverage tracking issue May 18, 2022
@philipturner

Are there any linear algebra ops not implemented in MPS that you have made custom shaders for? Any shaders I could "borrow" from your project (with full credit) and use in my own? Specifically, it would be helpful to have SVD and reverse-mode Cholesky operators.

@albanD
Collaborator Author

albanD commented May 18, 2022

Hey,

There are no custom shaders at the moment, as everything we needed for the basic networks we looked at was already provided by MPS (or by a combination of MPS ops). Also, required functions that are not in the hot path simply fall back to CPU for now.

Custom shaders are mentioned here because they can easily be added within the integration, but none are used today.

@pzelasko

I was testing a bunch of speech synthesis and vocoder models, and found the following operators missing so far:

  • aten::flip
  • aten::equal
  • aten::upsample_nearest1d.out

@Linux-cpp-lisp

One vote for a CPU fallback for torch.bincount.

Is there any reason, given the unified memory architecture, that every op not implemented on Metal cannot simply fall back to the CPU implementation without memory copies? (Based, of course, on my 10,000 ft view of the architecture, which I'm sure is wildly oversimplified.)
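In the meantime, an explicit round-trip through the CPU works as a stopgap (a sketch, not the eventual fallback mechanism):

import torch

x = torch.randint(0, 10, (1000,), device="mps")
counts = x.cpu().bincount().to("mps")  # run the missing op on CPU, move the result back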

@richardburleigh

richardburleigh commented May 19, 2022

Tip for everyone:

Run your script with PYTORCH_ENABLE_MPS_FALLBACK=1, which will fall back to the CPU for unsupported ops.

I'm using a custom build that merges pull request #77791, so I'm not sure if this is included in the current build (Edit: it's not. You need to build PyTorch yourself with the pull request, or trust an online build that includes it).
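For anyone unsure where the variable goes: it needs to be in the environment before PyTorch initializes, so set it at launch or at the very top of the script (a sketch; train.py is a placeholder name):

# equivalent to launching with: PYTORCH_ENABLE_MPS_FALLBACK=1 python train.py
import os
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"  # set this before importing torch
import torch  # unsupported MPS ops now fall back to CPU (with a warning)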

@gautierdag

Testing with some huggingface transformers code: +1 vote for aten::cumsum.out.
I tried the fallback env var, but it doesn't seem to work for me.

@lhoenig
Contributor

lhoenig commented May 20, 2022

One missing op I ran into that hasn't been mentioned yet is aten::_unique2.
Edit: This error goes away when passing PYTORCH_ENABLE_MPS_FALLBACK=1 with the current main branch build. However, I instead get warnings

The operator 'aten::nonzero' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at  /Users/lukas/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)

then

The dst MTL buffer in copy_to_mps is non-contiguous (Triggered internally at  /Users/lukas/pytorch/aten/src/ATen/native/mps/operations/Copy.mm:323.)

and finally the forward pass through my model crashes with

RuntimeError: Placeholder buffer size (7493632) is not large enough to contain the Tensor storage of size 14986944

On CPU it works fine. Could be #77886, I suppose.

@Willian-Zhang

Testing with some huggingface transformers code: +1 vote for aten::cumsum.out.
I tried the fallback env var, but it doesn't seem to work for me.

+1
setting PYTORCH_ENABLE_MPS_FALLBACK=1 still results in:

NotImplementedError: Could not run 'aten::cumsum.out' with arguments from the 'MPS' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::cumsum.out' is only available for these backends: [Dense, Conjugate, UNKNOWN_TENSOR_TYPE_ID, QuantizedXPU, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, SparseCPU, SparseCUDA, SparseHIP, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, SparseXPU, UNKNOWN_TENSOR_TYPE_ID, SparseVE, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, NestedTensorCUDA, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID].

CPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterCPU.cpp:37386 [kernel]
Meta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterMeta.cpp:31637 [kernel]
BackendSelect: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:133 [backend fallback]
Named: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/NamedRegistrations.cpp:11 [kernel]
Conjugate: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ConjugateFallback.cpp:18 [backend fallback]
Negative: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/NegateFallback.cpp:18 [backend fallback]
ZeroTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/ADInplaceOrViewType_1.cpp:3288 [kernel]
AutogradOther: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
AutogradCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
AutogradCUDA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
UNKNOWN_TENSOR_TYPE_ID: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
AutogradXLA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
AutogradMPS: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
AutogradIPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
AutogradXPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
AutogradHPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
UNKNOWN_TENSOR_TYPE_ID: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
AutogradLazy: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
AutogradPrivateUse1: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
AutogradPrivateUse2: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
AutogradPrivateUse3: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:13238 [autograd kernel]
Tracer: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/TraceType_0.cpp:12585 [kernel]
AutocastCPU: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:481 [backend fallback]
Autocast: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:324 [backend fallback]
Batched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/BatchingRegistrations.cpp:1064 [backend fallback]
VmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
Functionalize: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterFunctionalization_3.cpp:12118 [kernel]
PythonTLSSnapshot: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:137 [backend fallback]

@albanD
Collaborator Author

albanD commented May 20, 2022

@lhoenig could you open a new, separate issue for the CPU fallback failing for you?
The error seems to hint that you're moving a non-contiguous Tensor across devices. Making sure your Tensors are contiguous before the move might help as a workaround.
We can continue this discussion in the new issue you create.

@Willian-Zhang the fallback is ONLY available if you build from source right now. It will be in the nightly build tomorrow (May 21st).
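For others hitting the same copy error, the workaround looks roughly like this (a sketch):

import torch

t = torch.randn(4, 6).t()  # a transposed view is non-contiguous
t = t.contiguous()         # materialize it before changing devices
t_mps = t.to("mps")        # the copy now sees a contiguous buffer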

@weiji14
Contributor

weiji14 commented May 20, 2022

Would like to add aten::_local_scalar_dense to the list. Also, would it be possible to link to some examples in the top post showing how to implement these in PyTorch? I'd love to give it a shot if it's not too hard.

@lhoenig
Contributor

lhoenig commented May 20, 2022

@albanD Yep, making the Tensors contiguous worked. But yet another issue revealed itself. I created #77977 and #78001.

@psobolewskiPhD

psobolewskiPhD commented May 20, 2022

I've got an unsupported op: aten::grid_sampler_2d

envs/pytorch-env/lib/python3.9/site-packages/torch/nn/functional.py:4172: UserWarning: The operator 'aten::grid_sampler_2d' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at  /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
  return torch.grid_sampler(input, grid, mode_enum, padding_mode_enum, align_corners)
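A minimal repro that routes through this op looks roughly like this (a sketch, based on my reading of the warning):

import torch
import torch.nn.functional as F

inp = torch.randn(1, 1, 4, 4, device="mps")
grid = torch.zeros(1, 2, 2, 2, device="mps")  # (N, H_out, W_out, 2) sampling grid
out = F.grid_sample(inp, grid, mode="bilinear", align_corners=False)  # falls back to CPU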

@thipokKub

Not supported

  • aten::l1_loss_backward.grad_input
  • aten::kl_div_backward

Code

import torch
import torch.nn as nn

X, y = torch.rand(16, 10).to("mps"), torch.rand(16, 1).to("mps")
model = nn.Linear(10, 1).to("mps")
criterion = nn.L1Loss()  # nn.KLDivLoss() hits aten::kl_div_backward instead
loss = criterion(model(X), y)
loss.backward()  # raises: aten::l1_loss_backward.grad_input not implemented for MPS

Output

NotImplementedError: Could not run 'aten::l1_loss_backward.grad_input' with arguments from the 'MPS' backend

@tw-ilson

Trying to use affine crop from torchvision, I found that the operator aten::linspace.out does not seem to be implemented for the MPS backend.
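One-line repro (a sketch; on builds without the op this raises NotImplementedError):

import torch

torch.linspace(0, 1, steps=8, device="mps")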

@nicolasbeglinger

nicolasbeglinger commented May 22, 2022

Trying to use the MPS backend with PyTorch Geometric, I found the operator aten::index.Tensor is not yet implemented.

@feesta

feesta commented May 22, 2022

Found that the operator 'aten::grid_sampler_2d' is not currently implemented for the MPS device.

@mooey5775

Would be great to add aten::adaptive_max_pool2d to the list; it seems fairly common, and for me it's useful in some point cloud architectures.

@RohanM
Contributor

RohanM commented May 23, 2022

I ran into this error with aten::count_nonzero.dim_IntList (via torch.count_nonzero()). I'll take a look at implementing this op with MPS.

@succichang

Voting for aten::upsample_bicubic2d.out

@danadascalescu00

🆙 aten::isin.Tensor_Tensor_out. Thank you for your hard work!

@wuhongsheng

Voting for aten::angle

@FrederikWR

Voting for aten::isin.Tensor_Tensor_out - thanks!

@eifuentes

eifuentes commented May 14, 2024

Voting for aten::_embedding_bag, which is heavily used in recommendation systems, e.g. torchrec's own implementation.

@SimonvBaal

SimonvBaal commented May 15, 2024

Also voting for aten::isin.Tensor_Tensor_out. Thank you!

@vobecant

+1 for aten::upsample_bicubic2d.out

@25is

25is commented May 16, 2024

NotImplementedError: The operator 'aten::index_copy.out' is not currently implemented for the MPS device.

Version: 2.4.0.dev20240515

@louisfabrice13

Voting for all Conv3d-related and 3D upsampling-related operations.

@CCranney

Voting for:
NotImplementedError: The operator 'aten::_nested_tensor_from_mask_left_aligned' is not currently implemented for the MPS device

Comes up with various masking attempts in self attention modules.
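For context, the kind of code that hits it looks roughly like this (a sketch; the fast path needs eval mode and no_grad):

import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=16, nhead=2, batch_first=True).to("mps")
layer.eval()
x = torch.randn(2, 5, 16, device="mps")
pad_mask = torch.zeros(2, 5, dtype=torch.bool, device="mps")  # True marks padded positions
with torch.no_grad():
    out = layer(x, src_key_padding_mask=pad_mask)  # may route to aten::_nested_tensor_from_mask_left_aligned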

@giamic

giamic commented May 16, 2024

+1 for aten::_fft_r2c

@masc-it

masc-it commented May 16, 2024

Voting for: aten::grid_sampler_2d_backward
(RT-DETR model)

@danieldanciu

Voting for aten::isin.Tensor_Tensor_out

1 similar comment
@garethcthomasdev

Voting for aten::isin.Tensor_Tensor_out

@YannickDamoiseaux

Also voting for aten::isin.Tensor_Tensor_out

@johnnynunez

nms:
WARNING ⚠️ NMS time limit 2.100s exceeded

@X901

X901 commented May 19, 2024

I tried it multiple times on an M1 Ultra.

I always get
WARNING ⚠️ NMS time limit [ ]s exceeded
It's still not stable; you can't depend on it.

I hope it becomes better in the future.

@BudgieBird

BudgieBird commented May 21, 2024

Voting for 'aten::isin.Tensor_Tensor_out' as well, appreciate it!

@johnnynunez

I find it unbelievable that, with the money Apple has, they don't invest in supporting PyTorch natively with all its operations.

@pranavchaturved

pranavchaturved commented May 22, 2024 via email

They would rather invest in something of their own, which is what they are doing: ml-explore/mlx: MLX: An array framework for Apple silicon (https://github.com/ml-explore/mlx)

@johnnynunez

Until they have easy conversion with PyTorch/JAX or similar, it will stay difficult, because people usually test locally and then send jobs to the servers to train.

@JustinGuese

Voting for 'aten::isin.Tensor_Tensor_out' as well, appreciate it!

@s0l4r

s0l4r commented May 22, 2024

Voting for: aten::upsample_bicubic2d.out. Thanks!

@Club-d

Club-d commented May 23, 2024

Voting for aten::scatter_reduce.two_out. Thanks!

NotImplementedError: The operator 'aten::scatter_reduce.two_out' is not currently implemented for the MPS device

@vision0array

Voting for aten::upsample_bicubic2d.out, which is not currently implemented for the MPS device.

@Raman-Kumar
Contributor

@albanD hey, I’m interested in working on aten::max_unpool2d

@janboeye

voting for aten::amp_foreach_non_finite_check_and_unscale
