A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
TensorRT C++ API Tutorial
Making large AI models cheaper, faster and more accessible
High-efficiency floating-point neural network inference operators for mobile, server, and Web
ncnn is a high-performance neural network inference framework optimized for the mobile platform
📚 Jupyter notebook tutorials for OpenVINO™
A scalable inference server for models optimized with OpenVINO™
Utilities to use the Hugging Face Hub API
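A small sketch of what these Hub utilities cover: for example, building the download URL for a file in a model repository without any network call. The repo and filename below are illustrative, not taken from this listing.

```python
# Build the resolve URL for a file in a Hugging Face Hub repo.
# hf_hub_url is a pure helper: it only formats the URL string.
from huggingface_hub import hf_hub_url

url = hf_hub_url(repo_id="bert-base-uncased", filename="config.json")
# url now points at config.json under the repo's main revision
```

The same package also exposes client classes (e.g. `HfApi`) for listing and downloading repos, which do require network access.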
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need: run inference with open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or on your laptop.
A high-throughput and memory-efficient inference and serving engine for LLMs
Python library for YOLOv8 and YOLOv9 small object detection and instance segmentation
NVIDIA-accelerated, deep learned semantic image segmentation
NVIDIA-accelerated, deep learned model support for image space object detection
Deep learned, NVIDIA-accelerated 3D object pose estimation
Large Language Model Text Generation Inference
TypeDB: the polymorphic database powered by types