# Spin-O-Llama


An Ollama API implementation for Spin.

> ⚠️ **Proof of concept:** this project is not production-ready.

Quick Start

  • Install spin
  • login to fermeyon cloud
    spin login
    
  • clone this repository
    git clone https://github.com/BLaZeKiLL/Spin-O-Llama.git
    cd Spin-O-Llama
    
  • build
    spin build
    
  • deploy
    spin deploy
    

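To try the app before deploying, Spin can also build and serve it locally in one step; by default it listens on `http://127.0.0.1:3000`. This is a sketch of the usual Spin workflow: local LLM inference may require extra model setup on your machine, whereas Fermyon Cloud supplies the models on deploy.

```sh
# Build all components, then start a local Spin server (default: 127.0.0.1:3000)
spin build --up
```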
## Routes implemented

- `POST /api/generate`

  Supported request body (see the curl example after this list):

  ```jsonc
  {
      "model": "<supported-model>",
      "prompt": "<input prompt>",
      "system": "<system prompt>", // optional system prompt
      "stream": false, // streaming is not supported; this field has no effect
      "options": { // optional LLM options; the default values are shown
          "num_predict": 128,
          "temperature": 0.8,
          "top_p": 0.9,
          "repeat_penalty": 1.1
      }
  }
  ```

  Response body:

  ```json
  {
      "model": "<model-id>",
      "response": "<output>",
      "done": true
  }
  ```
- `POST /api/embeddings`

  Supported request body (see the curl example after this list):

  ```jsonc
  {
      "model": "<model-id>", // currently ignored; all-minilm-l6-v2 is always used
      "prompt": "<input>"
  }
  ```

  Response body:

  ```json
  {
      "embedding": [<float array>]
  }
  ```
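
For illustration, here is a curl call against a deployed instance of `/api/generate`. The hostname is a placeholder for the URL printed by `spin deploy`, and the body uses only fields documented above:

```sh
# Placeholder host: substitute the URL that `spin deploy` prints
curl https://<your-app>.fermyon.app/api/generate \
  -H 'Content-Type: application/json' \
  -d '{
        "model": "llama2-chat",
        "prompt": "Why is the sky blue?",
        "options": { "num_predict": 64 }
      }'
```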
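And the equivalent sketch for `/api/embeddings` (again with a placeholder host; the `model` field is currently ignored, as noted above):

```sh
curl https://<your-app>.fermyon.app/api/embeddings \
  -H 'Content-Type: application/json' \
  -d '{ "model": "all-minilm-l6-v2", "prompt": "Hello, world" }'
```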

## Model compatibility

- generate: `llama2-chat`, `codellama-instruct`
- embeddings: `all-minilm-l6-v2`

## Contributing

Contributions are welcome to extend coverage of the Ollama API, within what the Spin runtime supports.