docs add 1.55.8 changelog

This commit is contained in:
Ishaan Jaff 2024-12-21 20:51:39 -08:00
parent 5a8f67c171
commit 291229ee46
2 changed files with 27 additions and 5 deletions

View file

@ -2,12 +2,34 @@
A new LiteLLM Stable release just went out. Here are 5 updates since v1.52.2-stable. A new LiteLLM Stable release just went out. Here are 5 updates since v1.52.2-stable.
## Langfuse Prompt Management: ## Langfuse Prompt Management
This makes it easy to run experiments or change the specific models `gpt-4o` to `gpt-4o-mini` on Langfuse, instead of making changes in your applications. [Start here](https://docs.litellm.ai/docs/proxy/prompt_management)
## Control fallback prompts client-side ## Control fallback prompts client-side
## Triton /infer support > Claude prompts are different than OpenAI
## Infinity Rerank Models Pass in prompts specific to model when doing fallbacks. [Start here](https://docs.litellm.ai/docs/proxy/reliability#control-fallback-prompts)
## New Providers / Models
- [NVIDIA Triton](https://developer.nvidia.com/triton-inference-server) `/infer` endpoint. [Start here](https://docs.litellm.ai/docs/providers/triton-inference-server)
- [Infinity](https://github.com/michaelfeil/infinity) Rerank Models [Start here](https://docs.litellm.ai/docs/providers/infinity)
## ✨ Azure Data Lake Storage Support
Send LLM usage (spend, tokens) data to [Azure Data Lake](https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction). This makes it easy to consume usage data on other services (eg. Databricks)
[Start here](https://docs.litellm.ai/docs/logging/azure_data_lake_storage)
## Docker Run LiteLLM
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.55.8-stable
```
## Azure Blob Storage Logger

View file

@ -11,7 +11,7 @@ LiteLLM supports Embedding Models on Triton Inference Servers
| Provider Route on LiteLLM | `triton/` | | Provider Route on LiteLLM | `triton/` |
| Supported Operations | `/chat/completion`, `/completion`, `/embedding` | | Supported Operations | `/chat/completion`, `/completion`, `/embedding` |
| Supported Triton endpoints | `/infer`, `/generate`, `/embeddings` | | Supported Triton endpoints | `/infer`, `/generate`, `/embeddings` |
| Link to Provider Doc | [Triton Inference Server ↗](https://github.com/michaelfeil/infinity) | | Link to Provider Doc | [Triton Inference Server ↗](https://developer.nvidia.com/triton-inference-server) |
## Triton `/generate` - Chat Completion ## Triton `/generate` - Chat Completion