Torchsharp LLaMA

Inspired by pytorch-llama, this project implements LLaMA 2 from scratch with TorchSharp.

Prerequisites

  • Git LFS
  • .NET 6.0 SDK
  • Access to one of the LLaMA 2 models

How to run

Note

Please download the .pth version of the weights (the variant whose name does not carry the -hf suffix).

  • Change the path in Program.cs to the folder where you downloaded the model weights.
  • Pick the TorchSharp runtime NuGet package that matches your platform:
    • use TorchSharp-cuda-linux if you are on Linux with an NVIDIA GPU
    • use TorchSharp-cuda-windows if you are on Windows with an NVIDIA GPU
    • use TorchSharp-cpu if you don't have an NVIDIA GPU
  • Run the project with dotnet run.
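As a rough sketch, the steps above might look like the following from the project directory. MODEL_DIR is a hypothetical placeholder path; point it at wherever you downloaded the .pth weights. The package names are the real TorchSharp runtime packages on nuget.org.

```shell
# Hypothetical location of the downloaded .pth weights -- adjust to your setup,
# and set the same path in Program.cs.
MODEL_DIR="$HOME/models/Llama-2-7b"

# Pick ONE runtime package for your platform:
#   TorchSharp-cuda-linux    Linux with an NVIDIA GPU
#   TorchSharp-cuda-windows  Windows with an NVIDIA GPU
#   TorchSharp-cpu           no NVIDIA GPU
PKG=TorchSharp-cpu

dotnet add package "$PKG"   # add the chosen TorchSharp runtime to the project
dotnet run                  # run the project
```

The CUDA packages bundle the GPU libtorch binaries, so only one runtime package should be referenced at a time.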

About tokenizer

This project uses a BPE tokenizer from Microsoft.ML.Tokenizers to tokenize the input text. You can find vocab.json and merges.txt under torchsharp-llama. To use a third-party tokenizer, simply replace vocab.json and merges.txt with your own tokenizer files.
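For illustration, loading the BPE tokenizer from those two files might look roughly like this. This is a sketch assuming the early preview Microsoft.ML.Tokenizers API (type names have changed across preview releases, so check the version the project actually references); the input string is arbitrary.

```csharp
// Sketch only: assumes the preview Microsoft.ML.Tokenizers API where a
// Tokenizer wraps a Bpe model built from vocab/merges files.
using System;
using Microsoft.ML.Tokenizers;

class TokenizeDemo
{
    static void Main()
    {
        // vocab.json / merges.txt ship with this repo; swap in your own
        // files here to use a third-party BPE tokenizer.
        var tokenizer = new Tokenizer(new Bpe("vocab.json", "merges.txt"));

        // Encode a sample prompt and print the resulting token ids.
        var encoding = tokenizer.Encode("Hello, LLaMA!");
        Console.WriteLine(string.Join(", ", encoding.Ids));
    }
}
```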

Disclaimer

This project has only been tested with the LLaMA-2-7B model. I would like to test it with other models, but unfortunately the 7B model is already the largest I can afford to run on my machine. If you have the chance to try other models, please let me know whether it works. Thanks!

Also, this project doesn't come with any warranty. Use it at your own risk.

TODO

  • Add support for loading .safetensors and native ckpt files, so the model doesn't need to be converted to the TorchSharp format first. The .safetensors support should be easy; native ckpt support is trickier (otherwise the TorchSharp format wouldn't exist in the first place).
  • Add support for LoRA training

About

Implement LLaMA 2/3 using TorchSharp