Releases: BerriAI/litellm
v1.40.9
What's Changed
- fix opentelemetry-semantic-conventions-ai does not exist on LiteLLM Docker by @ishaan-jaff in #4129
- [Feat] OTEL - allow propagating traceparent in headers by @ishaan-jaff in #4133 (curl sketch after this list)
- Added `mypy` to the Poetry `dev` group by @jamesbraza in #4136
- Azure AI support all models by @krrishdholakia in #4134
- feat(utils.py): bump tiktoken dependency to 0.7.0 (gpt-4o token counting support) by @krrishdholakia in #4119
- fix(proxy_server.py): use consistent 400-status code error code for exceeded budget errors by @krrishdholakia in #4139
- Allowing inference of LLM provider in `get_supported_openai_params` by @jamesbraza in #4137
- [FEAT] log management endpoint logs to otel by @ishaan-jaff in #4138
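For the traceparent propagation change (#4133), here is a minimal curl sketch against a locally running proxy. The base URL, virtual key `sk-1234`, and model name are placeholders; the header value follows the standard W3C Trace Context `traceparent` format.
# placeholder key and model; traceparent follows the W3C Trace Context format
curl http://0.0.0.0:4000/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -H "traceparent: 00-80e1afed08e019fc1110464cfa66635c-7a085853722dc6d2-01" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "hi"}]
  }'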
New Contributors
- @jamesbraza made their first contribution in #4136
Full Changelog: v1.40.8...v1.40.9
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.9
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 95 | 118.26463258740928 | 6.42020613574963 | 0.0 | 1922 | 0 | 78.571060999991 | 1634.9082140000064 |
Aggregated | Passed ✅ | 95 | 118.26463258740928 | 6.42020613574963 | 0.0 | 1922 | 0 | 78.571060999991 | 1634.9082140000064 |
1.40.8.dev1
What's Changed
- fix opentelemetry-semantic-conventions-ai does not exist on LiteLLM Docker by @ishaan-jaff in #4129
- [Feat] OTEL - allow propagating traceparent in headers by @ishaan-jaff in #4133
- Added `mypy` to the Poetry `dev` group by @jamesbraza in #4136
- Azure AI support all models by @krrishdholakia in #4134
- feat(utils.py): bump tiktoken dependency to 0.7.0 (gpt-4o token counting support) by @krrishdholakia in #4119
- fix(proxy_server.py): use consistent 400-status code error code for exceeded budget errors by @krrishdholakia in #4139
- Allowing inference of LLM provider in `get_supported_openai_params` by @jamesbraza in #4137
- [FEAT] log management endpoint logs to otel by @ishaan-jaff in #4138
New Contributors
- @jamesbraza made their first contribution in #4136
Full Changelog: v1.40.8-stable...1.40.8.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-1.40.8.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 110.0 | 125.08247401976035 | 6.426077578390951 | 0.0 | 1923 | 0 | 91.96702899998854 | 1106.7971329999864 |
Aggregated | Passed ✅ | 110.0 | 125.08247401976035 | 6.426077578390951 | 0.0 | 1923 | 0 | 91.96702899998854 | 1106.7971329999864 |
v1.40.8-stable
Full Changelog: v1.40.8...v1.40.8-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.8-stable
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 120.0 | 147.76399794426771 | 6.297998256290574 | 0.0 | 1884 | 0 | 97.42064800002481 | 1621.3958460000413 |
Aggregated | Passed ✅ | 120.0 | 147.76399794426771 | 6.297998256290574 | 0.0 | 1884 | 0 | 97.42064800002481 | 1621.3958460000413 |
v1.40.8
What's Changed
- [FEAT]- OTEL log litellm request / response by @ishaan-jaff in #4076
- [Feat] Enterprise - Attribute Management changes to Users in Audit Logs by @ishaan-jaff in #4083
- [FEAT]- OTEL Log raw LLM request/response on OTEL by @ishaan-jaff in #4078
- fix(cost_calculator.py): fixes tgai unmapped model pricing by @krrishdholakia in #4085
- fix(utils.py): improved predibase exception mapping by @krrishdholakia in #4080
- [Fix] Litellm sdk - allow ChatCompletionMessageToolCall, and Function to be used as dict by @ishaan-jaff in #4086
- Update together ai pricing by @krrishdholakia in #4087
- [Feature]: Proxy: Support API-Key header in addition to Authorization header by @ishaan-jaff in #4088
- docs - cache controls on `litellm python SDK` by @ishaan-jaff in #4099
- docs: add llmcord.py to side bar nav by @jakobdylanc in #4101
- docs: fix llmcord.py side bar link by @jakobdylanc in #4104
- [FEAT] - viewing spend report per customer / team by @ishaan-jaff in #4105
- feat - log Proxy Server auth errors on OTEL by @ishaan-jaff in #4103
- [Feat] Client Side Fallbacks by @ishaan-jaff in #4107
- Fix typos: Enterpise -> Enterprise by @msabramo in #4110
- `assistants.md`: Remove extra trailing backslash by @msabramo in #4112
- `assistants.md`: Add "Get a Thread" example by @msabramo in #4114
- ui - Fix Test Key dropdown by @ishaan-jaff in #4108
- fix(bedrock_httpx.py): fix tool calling for anthropic bedrock calls w/ streaming by @krrishdholakia in #4106
- fix(proxy_server.py): allow passing in a list of team members by @krrishdholakia in #4084
- fix - show `model group` in Azure ContentPolicy exceptions by @ishaan-jaff in #4116
Client Side Fallbacks: https://docs.litellm.ai/docs/proxy/reliability#test---client-side-fallbacks
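A minimal curl sketch of a client-side fallback request (see the docs link above for the authoritative example). The key and model names are placeholders; the request-level `fallbacks` list tells the proxy which model to retry with if the primary call fails.
# placeholder key and model names; "fallbacks" is set per-request by the client
curl http://0.0.0.0:4000/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zephyr-beta",
    "messages": [{"role": "user", "content": "what llm are you"}],
    "fallbacks": ["gpt-3.5-turbo"]
  }'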
Full Changelog: v1.40.7...v1.40.8
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.8
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 169.11120714803027 | 6.281005310183787 | 0.0 | 1878 | 0 | 114.50119100004486 | 1457.4686270000257 |
Aggregated | Passed ✅ | 140.0 | 169.11120714803027 | 6.281005310183787 | 0.0 | 1878 | 0 | 114.50119100004486 | 1457.4686270000257 |
v1.40.7.dev1
What's Changed
- [FEAT]- OTEL log litellm request / response by @ishaan-jaff in #4076
- [Feat] Enterprise - Attribute Management changes to Users in Audit Logs by @ishaan-jaff in #4083
- [FEAT]- OTEL Log raw LLM request/response on OTEL by @ishaan-jaff in #4078
- fix(cost_calculator.py): fixes tgai unmapped model pricing by @krrishdholakia in #4085
- fix(utils.py): improved predibase exception mapping by @krrishdholakia in #4080
- [Fix] Litellm sdk - allow ChatCompletionMessageToolCall, and Function to be used as dict by @ishaan-jaff in #4086
- Update together ai pricing by @krrishdholakia in #4087
- [Feature]: Proxy: Support API-Key header in addition to Authorization header by @ishaan-jaff in #4088
- docs - cache controls on `litellm python SDK` by @ishaan-jaff in #4099
Full Changelog: v1.40.7...v1.40.7.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.7.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 179.79878830216478 | 6.323646865102133 | 0.0 | 1893 | 0 | 111.88137199997072 | 2245.1254659999904 |
Aggregated | Passed ✅ | 140.0 | 179.79878830216478 | 6.323646865102133 | 0.0 | 1893 | 0 | 111.88137199997072 | 2245.1254659999904 |
v1.40.7
Full Changelog: v1.40.6...v1.40.7
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.7
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 97 | 126.50565680197539 | 6.4278560269757214 | 0.003340881510902142 | 1924 | 1 | 82.64289499999222 | 1316.4627209999935 |
Aggregated | Passed ✅ | 97 | 126.50565680197539 | 6.4278560269757214 | 0.003340881510902142 | 1924 | 1 | 82.64289499999222 | 1316.4627209999935 |
v1.40.6
🚨 Note: LiteLLM Proxy added `opentelemetry` as a dependency in this release. We recommend waiting for a stable release before upgrading your production instances.
✅ LiteLLM Python SDK users: you should be unaffected by this change (`opentelemetry` was only added for the proxy server).
🔥 LiteLLM 1.40.6 - Proxy 100+ LLMs at scale with our production-grade OpenTelemetry logger. Trace LLM API Calls, DB Requests, and Cache Requests 👉 Start here: https://docs.litellm.ai/docs/proxy/logging#logging-proxy-inputoutput-in-opentelemetry-format
🐞 [Fix]- Allow redacting messages from slack alerting https://docs.litellm.ai/docs/proxy/alerting#advanced---redacting-messages-from-alerts
🔨 [Refactor] - Refactor proxy_server.py to use common function for add_litellm_data_to_request
✨ [Feat] OpenTelemetry - Log Exceptions from Proxy Server
✨ [FEAT] OpenTelemetry - Log Redis Cache Read / Writes
✨ [FEAT] OpenTelemetry - LOG DB Exceptions
✨ [Feat] OpenTelemetry - Instrument DB Reads
🐞 [Fix] UI - Allow custom logout url and show proxy base url on API Ref Page
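A sketch of enabling the OpenTelemetry logger described above — the config key, environment variable names, and collector endpoint are assumptions based on the linked docs, so check them there before use.
# assumed config key: litellm_settings.callbacks
cat > config.yaml <<'EOF'
litellm_settings:
  callbacks: ["otel"]
EOF

# assumed env vars for the exporter; endpoint is a placeholder collector
export OTEL_EXPORTER="otlp_http"
export OTEL_ENDPOINT="http://0.0.0.0:4318"

docker run \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -e OTEL_EXPORTER -e OTEL_ENDPOINT \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.40.6 \
  --config /app/config.yaml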
What's Changed
- feat(bedrock_httpx.py): add support for bedrock converse api by @krrishdholakia in #4033
- feature - Types for mypy - issue #360 by @mikeslattery in #3925
- [Fix]- Allow redacting `messages` from slack alerting by @ishaan-jaff in #4047
- Fix to support all file types supported by Gemini by @nick-rackauckas in #4055
- [Feat] OTEL - Instrument DB Reads by @ishaan-jaff in #4058
- [Refactor] - Refactor proxy_server.py to use common function for `add_litellm_data_to_request` by @ishaan-jaff in #4065
- [Feat] OTEL - Log Exceptions from Proxy Server by @ishaan-jaff in #4067
- Raw request debug logs - security fix by @krrishdholakia in #4068
- [FEAT] OTEL - Log Redis Cache Read / Writes by @ishaan-jaff in #4070
- [FEAT] OTEL - LOG DB Exceptions by @ishaan-jaff in #4071
- [Fix] UI - Allow custom logout url and show proxy base url on API Ref Page by @ishaan-jaff in #4072
New Contributors
- @mikeslattery made their first contribution in #3925
- @nick-rackauckas made their first contribution in #4055
Full Changelog: v1.40.5...v1.40.6
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.6
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 151.53218399526997 | 6.362696017911015 | 0.0 | 1903 | 0 | 109.01354200001379 | 1319.1295889999992 |
Aggregated | Passed ✅ | 130.0 | 151.53218399526997 | 6.362696017911015 | 0.0 | 1903 | 0 | 109.01354200001379 | 1319.1295889999992 |
v1.40.5
What's Changed
- Table format fix and Typo by @SujanShilakar in #4037
- feat: add langfuse metadata via proxy request headers by @ndrsfel in #3990
- Add Ollama as a provider in proxy ui by @sha-ahammed in #4020
- modified docs proxy->logging->langfuse by @syGOAT in #4035
- fix tool usage null content using vertexai by @themrzmaster in #4039
- Fixed openai token counter bug by @Raymond1415926 in #4036
- feat(router.py): enable setting 'order' for a deployment in model list by @krrishdholakia in #4046
- docs: add llmcord.py to projects by @jakobdylanc in #4060
- Fix log message in Custom Callbacks doc by @iwamot in #4061
- refactor: replace 'traceback.print_exc()' with logging library by @krrishdholakia in #4049
- feat(aws_secret_manager.py): Support AWS KMS for Master Key encryption by @krrishdholakia in #4054
- [Feat] Enterprise - Enforce Params in request to LiteLLM Proxy by @ishaan-jaff in #4043 (config sketch after this list)
- feat - OTEL set custom service names and custom tracer names by @ishaan-jaff in #4048
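For the enforce-params change (#4043), a proxy config sketch — `enforced_params` and the example field names are assumptions based on the enterprise docs; requests missing the listed params would be rejected by the proxy.
# assumed setting name: general_settings.enforced_params
cat > config.yaml <<'EOF'
general_settings:
  enforced_params:            # reject requests missing these fields
    - user
    - metadata.generation_name
EOF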
New Contributors
- @ndrsfel made their first contribution in #3990
- @sha-ahammed made their first contribution in #4020
- @syGOAT made their first contribution in #4035
- @Raymond1415926 made their first contribution in #4036
- @jakobdylanc made their first contribution in #4060
- @iwamot made their first contribution in #4061
Full Changelog: v1.40.4...v1.40.5
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.5
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 98 | 123.75303621190369 | 6.512790176735744 | 0.0 | 1949 | 0 | 80.83186400000386 | 1991.117886999973 |
Aggregated | Passed ✅ | 98 | 123.75303621190369 | 6.512790176735744 | 0.0 | 1949 | 0 | 80.83186400000386 | 1991.117886999973 |
v1.40.4
What's Changed
- feat: clarify slack alerting message by @nibalizer in #4023
- [Admin UI] Analytics - fix div by 0 error on /model/metrics by @ishaan-jaff in #4021
- Use DEBUG level for curl command logging by @grav in #2980
- feat(create_user_button.tsx): allow admin to invite user to proxy via user-email/pwd invite-links by @krrishdholakia in #4028
- [FIX] Proxy redirect to `PROXY_BASE_URL/ui` after logging in by @ishaan-jaff in #4027
- [Feat] Audit Logs for Key, User, ProxyModel CRUD operations by @ishaan-jaff in #4030
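For the audit-logs change (#4030), a config sketch — `store_audit_logs` is an assumed setting name (enterprise feature); when enabled, create/update/delete operations on keys, users, and proxy models would be recorded in the audit log table.
# assumed setting name: litellm_settings.store_audit_logs (enterprise)
cat > config.yaml <<'EOF'
litellm_settings:
  store_audit_logs: true
EOF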
New Contributors
- @nibalizer made their first contribution in #4023
Full Changelog: v1.40.3...v1.40.4
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.4
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 74 | 89.43947919222931 | 6.450062450815326 | 0.0 | 1930 | 0 | 64.37952199996744 | 1143.0389689999743 |
Aggregated | Passed ✅ | 74 | 89.43947919222931 | 6.450062450815326 | 0.0 | 1930 | 0 | 64.37952199996744 | 1143.0389689999743 |
v1.40.3-stable
What's Changed
- feat: clarify slack alerting message by @nibalizer in #4023
New Contributors
- @nibalizer made their first contribution in #4023
Full Changelog: v1.40.3...v1.40.3-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.3-stable
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 166.81647102860174 | 6.3100225495221665 | 0.0 | 1888 | 0 | 109.54055500008053 | 2288.330084999984 |
Aggregated | Passed ✅ | 140.0 | 166.81647102860174 | 6.3100225495221665 | 0.0 | 1888 | 0 | 109.54055500008053 | 2288.330084999984 |