Releases: BerriAI/litellm
v1.40.9
What's Changed
- fix opentelemetry-semantic-conventions-ai does not exist on LiteLLM Docker by @ishaan-jaff in #4129
- [Feat] OTEL - allow propagating traceparent in headers by @ishaan-jaff in #4133 (curl sketch after this list)
- Added `mypy` to the Poetry `dev` group by @jamesbraza in #4136
- Azure AI support all models by @krrishdholakia in #4134
- feat(utils.py): bump tiktoken dependency to 0.7.0 (gpt-4o token counting support) by @krrishdholakia in #4119
- fix(proxy_server.py): use consistent 400-status code error code for exceeded budget errors by @krrishdholakia in #4139
- Allowing inference of LLM provider in `get_supported_openai_params` by @jamesbraza in #4137
- [FEAT] log management endpoint logs to otel by @ishaan-jaff in #4138
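For the traceparent propagation change (#4133), here is a minimal curl sketch against a locally running proxy. The base URL, virtual key `sk-1234`, and model name are placeholders; the header value follows the standard W3C Trace Context `traceparent` format.
# placeholder key and model; traceparent follows the W3C Trace Context format
curl http://0.0.0.0:4000/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -H "traceparent: 00-80e1afed08e019fc1110464cfa66635c-7a085853722dc6d2-01" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "hi"}]
  }'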
New Contributors
- @jamesbraza made their first contribution in #4136
Full Changelog: v1.40.8...v1.40.9
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.9
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 95 | 118.26463258740928 | 6.42020613574963 | 0.0 | 1922 | 0 | 78.571060999991 | 1634.9082140000064 |
Aggregated | Passed ✅ | 95 | 118.26463258740928 | 6.42020613574963 | 0.0 | 1922 | 0 | 78.571060999991 | 1634.9082140000064 |
1.40.8.dev1
What's Changed
- fix opentelemetry-semantic-conventions-ai does not exist on LiteLLM Docker by @ishaan-jaff in #4129
- [Feat] OTEL - allow propagating traceparent in headers by @ishaan-jaff in #4133
- Added `mypy` to the Poetry `dev` group by @jamesbraza in #4136
- Azure AI support all models by @krrishdholakia in #4134
- feat(utils.py): bump tiktoken dependency to 0.7.0 (gpt-4o token counting support) by @krrishdholakia in #4119
- fix(proxy_server.py): use consistent 400-status code error code for exceeded budget errors by @krrishdholakia in #4139
- Allowing inference of LLM provider in `get_supported_openai_params` by @jamesbraza in #4137
- [FEAT] log management endpoint logs to otel by @ishaan-jaff in #4138
New Contributors
- @jamesbraza made their first contribution in #4136
Full Changelog: v1.40.8-stable...1.40.8.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-1.40.8.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 110.0 | 125.08247401976035 | 6.426077578390951 | 0.0 | 1923 | 0 | 91.96702899998854 | 1106.7971329999864 |
Aggregated | Passed ✅ | 110.0 | 125.08247401976035 | 6.426077578390951 | 0.0 | 1923 | 0 | 91.96702899998854 | 1106.7971329999864 |
v1.40.8-stable
Full Changelog: v1.40.8...v1.40.8-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.8-stable
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 120.0 | 147.76399794426771 | 6.297998256290574 | 0.0 | 1884 | 0 | 97.42064800002481 | 1621.3958460000413 |
Aggregated | Passed ✅ | 120.0 | 147.76399794426771 | 6.297998256290574 | 0.0 | 1884 | 0 | 97.42064800002481 | 1621.3958460000413 |
v1.40.8
What's Changed
- [FEAT]- OTEL log litellm request / response by @ishaan-jaff in #4076
- [Feat] Enterprise - Attribute Management changes to Users in Audit Logs by @ishaan-jaff in #4083
- [FEAT]- OTEL Log raw LLM request/response on OTEL by @ishaan-jaff in #4078
- fix(cost_calculator.py): fixes tgai unmapped model pricing by @krrishdholakia in #4085
- fix(utils.py): improved predibase exception mapping by @krrishdholakia in #4080
- [Fix] Litellm sdk - allow ChatCompletionMessageToolCall, and Function to be used as dict by @ishaan-jaff in #4086
- Update together ai pricing by @krrishdholakia in #4087
- [Feature]: Proxy: Support API-Key header in addition to Authorization header by @ishaan-jaff in #4088
- docs - cache controls on `litellm python SDK` by @ishaan-jaff in #4099
- docs: add llmcord.py to side bar nav by @jakobdylanc in #4101
- docs: fix llmcord.py side bar link by @jakobdylanc in #4104
- [FEAT] - viewing spend report per customer / team by @ishaan-jaff in #4105
- feat - log Proxy Server auth errors on OTEL by @ishaan-jaff in #4103
- [Feat] Client Side Fallbacks by @ishaan-jaff in #4107
- Fix typos: Enterpise -> Enterprise by @msabramo in #4110
- `assistants.md`: Remove extra trailing backslash by @msabramo in #4112
- `assistants.md`: Add "Get a Thread" example by @msabramo in #4114
- ui - Fix Test Key dropdown by @ishaan-jaff in #4108
- fix(bedrock_httpx.py): fix tool calling for anthropic bedrock calls w/ streaming by @krrishdholakia in #4106
- fix(proxy_server.py): allow passing in a list of team members by @krrishdholakia in #4084
- fix - show `model group` in Azure ContentPolicy exceptions by @ishaan-jaff in #4116
Client Side Fallbacks: https://docs.litellm.ai/docs/proxy/reliability#test---client-side-fallbacks
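A minimal curl sketch of a client-side fallback request (see the docs link above for the authoritative example). The key and model names are placeholders; the request-level `fallbacks` list tells the proxy which model to retry with if the primary call fails.
# placeholder key and model names; "fallbacks" is set per-request by the client
curl http://0.0.0.0:4000/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zephyr-beta",
    "messages": [{"role": "user", "content": "what llm are you"}],
    "fallbacks": ["gpt-3.5-turbo"]
  }'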
Full Changelog: v1.40.7...v1.40.8
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.8
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 169.11120714803027 | 6.281005310183787 | 0.0 | 1878 | 0 | 114.50119100004486 | 1457.4686270000257 |
Aggregated | Passed ✅ | 140.0 | 169.11120714803027 | 6.281005310183787 | 0.0 | 1878 | 0 | 114.50119100004486 | 1457.4686270000257 |
v1.40.7.dev1
What's Changed
- [FEAT]- OTEL log litellm request / response by @ishaan-jaff in #4076
- [Feat] Enterprise - Attribute Management changes to Users in Audit Logs by @ishaan-jaff in #4083
- [FEAT]- OTEL Log raw LLM request/response on OTEL by @ishaan-jaff in #4078
- fix(cost_calculator.py): fixes tgai unmapped model pricing by @krrishdholakia in #4085
- fix(utils.py): improved predibase exception mapping by @krrishdholakia in #4080
- [Fix] Litellm sdk - allow ChatCompletionMessageToolCall, and Function to be used as dict by @ishaan-jaff in #4086
- Update together ai pricing by @krrishdholakia in #4087
- [Feature]: Proxy: Support API-Key header in addition to Authorization header by @ishaan-jaff in #4088
- docs - cache controls on `litellm python SDK` by @ishaan-jaff in #4099
Full Changelog: v1.40.7...v1.40.7.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.7.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 179.79878830216478 | 6.323646865102133 | 0.0 | 1893 | 0 | 111.88137199997072 | 2245.1254659999904 |
Aggregated | Passed ✅ | 140.0 | 179.79878830216478 | 6.323646865102133 | 0.0 | 1893 | 0 | 111.88137199997072 | 2245.1254659999904 |
v1.40.7
Full Changelog: v1.40.6...v1.40.7
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.7
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 97 | 126.50565680197539 | 6.4278560269757214 | 0.003340881510902142 | 1924 | 1 | 82.64289499999222 | 1316.4627209999935 |
Aggregated | Passed ✅ | 97 | 126.50565680197539 | 6.4278560269757214 | 0.003340881510902142 | 1924 | 1 | 82.64289499999222 | 1316.4627209999935 |
v1.40.6
🚨 Note: LiteLLM Proxy added `opentelemetry` as a dependency in this release. We recommend waiting for a stable release before upgrading your production instances.
✅ LiteLLM Python SDK users: you should be unaffected by this change (`opentelemetry` was only added for the proxy server).
🔥 LiteLLM 1.40.6 - Proxy 100+ LLMs at scale with our production-grade OpenTelemetry logger. Trace LLM API Calls, DB Requests, and Cache Requests 👉 Start here: https://docs.litellm.ai/docs/proxy/logging#logging-proxy-inputoutput-in-opentelemetry-format
🐞 [Fix]- Allow redacting messages from slack alerting https://docs.litellm.ai/docs/proxy/alerting#advanced---redacting-messages-from-alerts
🔨 [Refactor] - Refactor proxy_server.py to use common function for add_litellm_data_to_request
✨ [Feat] OpenTelemetry - Log Exceptions from Proxy Server
✨ [FEAT] OpenTelemetry - Log Redis Cache Read / Writes
✨ [FEAT] OpenTelemetry - LOG DB Exceptions
✨ [Feat] OpenTelemetry - Instrument DB Reads
🐞 [Fix] UI - Allow custom logout url and show proxy base url on API Ref Page
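A sketch of enabling the OpenTelemetry logger described above — the config key, environment variable names, and collector endpoint are assumptions based on the linked docs, so check them there before use.
# assumed config key: litellm_settings.callbacks
cat > config.yaml <<'EOF'
litellm_settings:
  callbacks: ["otel"]
EOF

# assumed env vars for the exporter; endpoint is a placeholder collector
export OTEL_EXPORTER="otlp_http"
export OTEL_ENDPOINT="http://0.0.0.0:4318"

docker run \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -e OTEL_EXPORTER -e OTEL_ENDPOINT \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.40.6 \
  --config /app/config.yaml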
What's Changed
- feat(bedrock_httpx.py): add support for bedrock converse api by @krrishdholakia in #4033
- feature - Types for mypy - issue #360 by @mikeslattery in #3925
- [Fix]- Allow redacting `messages` from slack alerting by @ishaan-jaff in #4047
- Fix to support all file types supported by Gemini by @nick-rackauckas in #4055
- [Feat] OTEL - Instrument DB Reads by @ishaan-jaff in #4058
- [Refactor] - Refactor proxy_server.py to use common function for `add_litellm_data_to_request` by @ishaan-jaff in #4065
- [Feat] OTEL - Log Exceptions from Proxy Server by @ishaan-jaff in #4067
- Raw request debug logs - security fix by @krrishdholakia in #4068
- [FEAT] OTEL - Log Redis Cache Read / Writes by @ishaan-jaff in #4070
- [FEAT] OTEL - LOG DB Exceptions by @ishaan-jaff in #4071
- [Fix] UI - Allow custom logout url and show proxy base url on API Ref Page by @ishaan-jaff in #4072
New Contributors
- @mikeslattery made their first contribution in #3925
- @nick-rackauckas made their first contribution in #4055
Full Changelog: v1.40.5...v1.40.6
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.6
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 151.53218399526997 | 6.362696017911015 | 0.0 | 1903 | 0 | 109.01354200001379 | 1319.1295889999992 |
Aggregated | Passed ✅ | 130.0 | 151.53218399526997 | 6.362696017911015 | 0.0 | 1903 | 0 | 109.01354200001379 | 1319.1295889999992 |
v1.40.5
What's Changed
- Table format fix and Typo by @SujanShilakar in #4037
- feat: add langfuse metadata via proxy request headers by @ndrsfel in #3990
- Add Ollama as a provider in proxy ui by @sha-ahammed in #4020
- modified docs proxy->logging->langfuse by @syGOAT in #4035
- fix tool usage null content using vertexai by @themrzmaster in #4039
- Fixed openai token counter bug by @Raymond1415926 in #4036
- feat(router.py): enable setting 'order' for a deployment in model list by @krrishdholakia in #4046
- docs: add llmcord.py to projects by @jakobdylanc in #4060
- Fix log message in Custom Callbacks doc by @iwamot in #4061
- refactor: replace 'traceback.print_exc()' with logging library by @krrishdholakia in #4049
- feat(aws_secret_manager.py): Support AWS KMS for Master Key encryption by @krrishdholakia in #4054
- [Feat] Enterprise - Enforce Params in request to LiteLLM Proxy by @ishaan-jaff in #4043 (config sketch after this list)
- feat - OTEL set custom service names and custom tracer names by @ishaan-jaff in #4048
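For the enforce-params change (#4043), a proxy config sketch — `enforced_params` and the example field names are assumptions based on the enterprise docs; requests missing the listed params would be rejected by the proxy.
# assumed setting name: general_settings.enforced_params
cat > config.yaml <<'EOF'
general_settings:
  enforced_params:            # reject requests missing these fields
    - user
    - metadata.generation_name
EOF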
New Contributors
- @ndrsfel made their first contribution in #3990
- @sha-ahammed made their first contribution in #4020
- @syGOAT made their first contribution in #4035
- @Raymond1415926 made their first contribution in #4036
- @jakobdylanc made their first contribution in #4060
- @iwamot made their first contribution in #4061
Full Changelog: v1.40.4...v1.40.5
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.5
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 98 | 123.75303621190369 | 6.512790176735744 | 0.0 | 1949 | 0 | 80.83186400000386 | 1991.117886999973 |
Aggregated | Passed ✅ | 98 | 123.75303621190369 | 6.512790176735744 | 0.0 | 1949 | 0 | 80.83186400000386 | 1991.117886999973 |
v1.40.4
What's Changed
- feat: clarify slack alerting message by @nibalizer in #4023
- [Admin UI] Analytics - fix div by 0 error on /model/metrics by @ishaan-jaff in #4021
- Use DEBUG level for curl command logging by @grav in #2980
- feat(create_user_button.tsx): allow admin to invite user to proxy via user-email/pwd invite-links by @krrishdholakia in #4028
- [FIX] Proxy redirect to `PROXY_BASE_URL/ui` after logging in by @ishaan-jaff in #4027
- [Feat] Audit Logs for Key, User, ProxyModel CRUD operations by @ishaan-jaff in #4030
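For the audit-logs change (#4030), a config sketch — `store_audit_logs` is an assumed setting name (enterprise feature); when enabled, create/update/delete operations on keys, users, and proxy models would be recorded in the audit log table.
# assumed setting name: litellm_settings.store_audit_logs (enterprise)
cat > config.yaml <<'EOF'
litellm_settings:
  store_audit_logs: true
EOF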
New Contributors
- @nibalizer made their first contribution in #4023
Full Changelog: v1.40.3...v1.40.4
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.4
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 74 | 89.43947919222931 | 6.450062450815326 | 0.0 | 1930 | 0 | 64.37952199996744 | 1143.0389689999743 |
Aggregated | Passed ✅ | 74 | 89.43947919222931 | 6.450062450815326 | 0.0 | 1930 | 0 | 64.37952199996744 | 1143.0389689999743 |
v1.40.3-stable
What's Changed
- feat: clarify slack alerting message by @nibalizer in #4023
New Contributors
- @nibalizer made their first contribution in #4023
Full Changelog: v1.40.3...v1.40.3-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.3-stable
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 166.81647102860174 | 6.3100225495221665 | 0.0 | 1888 | 0 | 109.54055500008053 | 2288.330084999984 |
Aggregated | Passed ✅ | 140.0 | 166.81647102860174 | 6.3100225495221665 | 0.0 | 1888 | 0 | 109.54055500008053 | 2288.330084999984 |