
Implement _VariableFunctionsClass.empty of torch #331

Closed
athitten opened this issue May 1, 2024 · 6 comments · Fixed by #353
Labels: enhancement (New feature or request), good first issue (Good for newcomers), nemo (Issues needed to support NVIDIA NeMo models), operators

Comments


athitten commented May 1, 2024

🚀 Feature

Implement torch.empty

Motivation

NeMo Vision Transformer

cc @apaz-cli @tfogal

athitten added the enhancement label on May 1, 2024
tfogal added the nemo label on May 1, 2024

k223kim commented May 3, 2024

Once this is implemented, I guess we can add torch.Tensor as well?


k223kim commented May 3, 2024

Hi Team! I am trying to figure out how I can allocate memory without initializing the values in the tensor. I am assuming I could do something like torch.full. However, I am not sure how torch.full returns a TensorProxy filled with a certain value, since it technically just returns a TensorProxy from _full_meta in prims.py. How does that fill_value come into play? I think I can do something similar but without the fill value. Or am I approaching this problem incorrectly? I would appreciate it if anyone could point me to resources or suggest a different way to implement this!

mruberry added the good first issue and operators labels and removed the triage review label on May 6, 2024

mruberry commented May 6, 2024

@k223kim Excellent question!

So you can implement this by adding a new EMPTY primitive, similar to the RANDN primitive (see _randn_meta), then implementing the EMPTY primitive using the torch executor (see _randn_prims_transform), and finally defining torch.empty to call clang.empty, which in turn calls prims.empty. For completeness, also add a direct implementation of torch.empty to the PyTorch executor.

The difference between torch.empty and clang.empty is that torch.empty handles PyTorch objects and translates them to thunder objects before calling clang.empty. The implementation of torch.full is an example of this.
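Roughly, the layering could look like the sketch below. This is a self-contained illustration rather than Thunder's actual code: TensorProxy, prims_empty, clang_empty, torch_empty, and executor_empty are stand-in names, and the real signatures and registration helpers in the repository differ.

```python
from dataclasses import dataclass
import torch

@dataclass
class TensorProxy:                 # trace-time stand-in: metadata only, no storage
    shape: tuple
    device: str
    dtype: torch.dtype

# prims layer: the meta describes the output of EMPTY without allocating anything
def prims_empty(shape, *, device, dtype):
    return TensorProxy(tuple(shape), device, dtype)

# clang layer: a thin, reusable wrapper over the primitive
def clang_empty(shape, *, device, dtype):
    return prims_empty(shape, device=device, dtype=dtype)

# torch layer: accepts PyTorch-style arguments, converts them to thunder-style
# objects, then defers to clang (mirroring how torch.full is structured)
def torch_empty(*shape, device=None, dtype=None):
    device = "cpu" if device is None else str(device)              # placeholder conversion
    dtype = torch.get_default_dtype() if dtype is None else dtype
    return clang_empty(shape, device=device, dtype=dtype)

# torch-executor implementation: at execution time the recorded EMPTY symbol is
# replaced by a real torch.empty call that actually allocates memory
def executor_empty(shape, *, device, dtype):
    return torch.empty(shape, device=device, dtype=dtype)

print(torch_empty(2, 3, dtype=torch.float32))   # a TensorProxy; nothing was allocated
```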

Let me know if you have any additional questions!

k223kim mentioned this issue May 7, 2024

k223kim commented May 7, 2024

Hey @mruberry! Thank you so much for the detailed guidance. I have submitted a draft PR with (hopefully) everything you mentioned. However, I do want to understand how this implementation works in more detail. I would appreciate it if you could answer these questions; that would be extremely helpful for future work on Thunder.

  • Currently, torch.empty calls clang.empty, which calls prims.empty. At the very end, it calls _empty_meta, which returns a TensorProxy. What I don't get is that this logic seems identical to the implementation of torch.full. How does one fill the tensor with a desired value while the other presumably just allocates memory without initializing it? What makes the difference?
  • I am trying to understand why we need clang.empty and prims.empty in the first place. Why are some methods only implemented in torch/__init__.py and others implemented in clang and/or prims?

Also, I have a question regarding the test case for torch.empty. For torch.randn, I can see that opinfos.py only checks shape, device, and dtype consistency. I think something similar should be done for torch.empty, since two empty tensors can be allocated in different memory, which results in different uninitialized data. Does that make sense to you?

Thanks again for your time reviewing! ⚡️


mruberry commented May 7, 2024

Hey @mruberry! Thank you so much for the detailed guidance. I have submitted a draft PR with (hopefully) everything you mentioned. However, I do want to understand how this implementation works in more detail. I would appreciate it if you could answer these questions; that would be extremely helpful for future work on Thunder.

Will do!

  • Currently, torch.empty calls clang.empty, which calls prims.empty. At the very end, it calls _empty_meta, which returns a TensorProxy. What I don't get is that this logic seems identical to the implementation of torch.full. How does one fill the tensor with a desired value while the other presumably just allocates memory without initializing it? What makes the difference?

Functions like thunder.torch.empty and clang.empty and prims.empty are called when thunder constructs its Python program. At this time the meta functions are called to understand what the output of the operations will be, but no computation on the actual tensor data occurs.

After the program is constructed and compiled, it is executed. As part of the compilation process, the symbols like thunder.torch.empty that are recorded when the program is being constructed are translated into calls like torch.empty() that actually manipulate tensor data. It is those calls that are then executed.

So, program construction --> compilation --> execution. thunder.torch.empty and its meta are called at program construction time and don't create any values. torch.empty is called at execution time to create a tensor.
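Here is a toy timeline in plain Python (not Thunder's machinery, and every name in it is illustrative) that mirrors those three stages:

```python
import torch

recorded = []                                    # the "program" built at construction time

def construct():
    # construction time: the symbol is recorded and a metadata-only proxy is returned
    recorded.append(("empty", (2, 3), torch.float32))
    return {"shape": (2, 3), "dtype": torch.float32}   # proxy-like: no tensor data exists yet

def compile_program(symbols):
    # compilation: each recorded symbol is translated into a real torch call
    def run():
        return [torch.empty(shape, dtype=dtype) for (_, shape, dtype) in symbols]
    return run

proxy = construct()                     # construction: no allocation has happened yet
executable = compile_program(recorded)  # compilation
tensors = executable()                  # execution: torch.empty finally allocates memory
```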

  • I am trying to understand why we need clang.empty and prims.empty in the first place. Why are some methods only implemented in torch/__init__.py and others implemented in clang and/or prims?

thunder is interested in understanding properties of operations, like how input metadata maps to output metadata, or how to create a grad formula. A lot of these properties can be implicitly defined by decomposing more complicated operators (like the ones in thunder.torch) into simpler operators like the prims. Without the implicit definition of these properties, each torch operator would have to define its own meta function and its own grad function, which would be challenging to maintain.

Additionally, some executors, like nvFuser, are interested in breaking down operations into a series of simpler operations that can be fused. Without the prims, each torch operation would effectively be a primitive, so executors like nvFuser would need special execution logic for each torch operation.

The core language (the "clang") is intended to provide common operations that are more usable than the primitives and that facilitate the creation of language definitions, like the torch language definition or the numpy language definition.
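As a toy illustration of why this pays off (again plain Python, not Thunder code): metas are hand-written only for the primitives, and composite ops inherit theirs by decomposition, so executors likewise only need to understand the primitives.

```python
from dataclasses import dataclass
import torch

@dataclass
class Proxy:
    shape: tuple
    dtype: torch.dtype

def prim_empty_meta(shape, dtype):            # hand-written meta for a primitive
    return Proxy(tuple(shape), dtype)

def prim_fill_meta(t, value):                 # hand-written meta for a primitive
    return Proxy(t.shape, t.dtype)

def full_meta(shape, value, dtype):
    # composite op: its meta behaviour falls out of the decomposition
    return prim_fill_meta(prim_empty_meta(shape, dtype), value)

def zeros_meta(shape, dtype):
    return full_meta(shape, 0, dtype)         # reuses full's decomposition in turn

print(zeros_meta((2, 3), torch.float32))      # Proxy(shape=(2, 3), dtype=torch.float32)
```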

Also, I have a question regarding the test case for torch.empty. For torch.randn, I can see that opinfos.py only checks shape, device, and dtype consistency. I think something similar should be done for torch.empty, since two empty tensors can be allocated in different memory, which results in different uninitialized data. Does that make sense to you?

That sounds great!
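A hedged sketch of that kind of check is below. The real tests go through the opinfo machinery in opinfos.py; this standalone version assumes thunder.jit as the entry point and that thunder's torch.empty accepts a tuple shape, so treat it as illustrative only.

```python
import torch
import thunder  # assumes lightning-thunder is installed

def test_empty_checks_metadata_only():
    def fn(shape):
        return torch.empty(shape, device="cpu", dtype=torch.float32)

    jfn = thunder.jit(fn)           # entry-point name assumed; older releases used thunder.compile
    expected = fn((2, 3))
    actual = jfn((2, 3))

    # Only metadata is compared. The contents of an empty tensor are uninitialized,
    # so a value comparison (e.g. torch.testing.assert_close) would be flaky by design.
    assert actual.shape == expected.shape
    assert actual.device == expected.device
    assert actual.dtype == expected.dtype
```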


k223kim commented May 8, 2024

Thanks so much @mruberry! This greatly helped my understanding of how things work! Appreciate your help as always. 🚀
