From 291229ee46a2d6992d5f538faa2d3b82a1ee96e8 Mon Sep 17 00:00:00 2001
From: Ishaan Jaff
Date: Sat, 21 Dec 2024 20:51:39 -0800
Subject: [PATCH] docs add 1.55.8 changelog

---
 docs/my-website/blog/v1.55.8-stable/index.md  | 30 ++++++++++++++++---
 .../docs/providers/triton-inference-server.md |  2 +-
 2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/docs/my-website/blog/v1.55.8-stable/index.md b/docs/my-website/blog/v1.55.8-stable/index.md
index 9f49fb5c84..f1809cd2f7 100644
--- a/docs/my-website/blog/v1.55.8-stable/index.md
+++ b/docs/my-website/blog/v1.55.8-stable/index.md
@@ -2,12 +2,34 @@
 
 A new LiteLLM Stable release just went out. Here are 5 updates since v1.52.2-stable.
 
-## Langfuse Prompt Management:
+## Langfuse Prompt Management
+
+This makes it easy to run experiments or swap specific models (e.g. `gpt-4o` to `gpt-4o-mini`) on Langfuse, instead of making changes in your application code. [Start here](https://docs.litellm.ai/docs/proxy/prompt_management)
 
 ## Control fallback prompts client-side
 
-## Triton /infer support
+> Claude prompts are different from OpenAI prompts
 
-## Infinity Rerank Models
+Pass in prompts specific to each model when doing fallbacks. [Start here](https://docs.litellm.ai/docs/proxy/reliability#control-fallback-prompts)
+
+
+## New Providers / Models
+
+- [NVIDIA Triton](https://developer.nvidia.com/triton-inference-server) `/infer` endpoint support. [Start here](https://docs.litellm.ai/docs/providers/triton-inference-server)
+- [Infinity](https://github.com/michaelfeil/infinity) rerank models. [Start here](https://docs.litellm.ai/docs/providers/infinity)
+
+
+## ✨ Azure Data Lake Storage Support
+
+Send LLM usage (spend, tokens) data to [Azure Data Lake](https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction). This makes it easy to consume usage data in other services (e.g. Databricks).
+[Start here](https://docs.litellm.ai/docs/logging/azure_data_lake_storage)
+
+## Docker Run LiteLLM
+
+```shell
+docker run \
+-e STORE_MODEL_IN_DB=True \
+-p 4000:4000 \
+ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.55.8-stable
+```
 
-## Azure Blob Storage Logger

diff --git a/docs/my-website/docs/providers/triton-inference-server.md b/docs/my-website/docs/providers/triton-inference-server.md
index 48ce9b9738..1d3789fe8a 100644
--- a/docs/my-website/docs/providers/triton-inference-server.md
+++ b/docs/my-website/docs/providers/triton-inference-server.md
@@ -11,7 +11,7 @@ LiteLLM supports Embedding Models on Triton Inference Servers
 
 | Provider Route on LiteLLM | `triton/` |
 | Supported Operations | `/chat/completion`, `/completion`, `/embedding` |
 | Supported Triton endpoints | `/infer`, `/generate`, `/embeddings` |
-| Link to Provider Doc | [Triton Inference Server ↗](https://github.com/michaelfeil/infinity) |
+| Link to Provider Doc | [Triton Inference Server ↗](https://developer.nvidia.com/triton-inference-server) |
 
 ## Triton `/generate` - Chat Completion
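A few usage sketches for the features in this changelog follow. For Langfuse prompt management, here is a minimal proxy config sketch: the `langfuse/` model prefix and `prompt_id` parameter follow the linked prompt-management docs, while the model alias, prompt ID, and key below are placeholders, so verify the exact parameter names against those docs.

```shell
# Hypothetical config: route "my-langfuse-model" through a prompt stored on
# Langfuse. Swapping gpt-4o for gpt-4o-mini is then a Langfuse-side change,
# not an application change.
cat > config.yaml <<'EOF'
model_list:
  - model_name: my-langfuse-model
    litellm_params:
      model: langfuse/gpt-4o            # placeholder underlying model
      prompt_id: "my-langfuse-prompt"   # placeholder Langfuse prompt ID
      api_key: os.environ/OPENAI_API_KEY
EOF

litellm --config config.yaml --port 4000
```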
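A sketch of the client-side fallback prompts, assuming a proxy on localhost:4000 with `gpt-4o` and `claude-3-5-sonnet` aliases already configured; the virtual key `sk-1234` is a placeholder, and `mock_testing_fallbacks` (from the linked reliability docs) forces the fallback path so the Claude-specific prompt is easy to observe.

```shell
# The primary request goes to gpt-4o; on fallback, the Claude-specific
# prompt inside "fallbacks" is used instead of the original messages.
curl -X POST 'http://localhost:4000/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer sk-1234' \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "List the planets."}],
    "fallbacks": [{
      "model": "claude-3-5-sonnet",
      "messages": [{"role": "user", "content": "You are a terse assistant. List the planets, one per line."}]
    }],
    "mock_testing_fallbacks": true
  }'
```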
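For the new Triton `/infer` support, a sketch under stated assumptions: the Triton server exposes a KServe-style `/v2/models/<name>/infer` URL, LiteLLM selects the `/infer` code path from the `api_base` suffix as described in the provider docs, and the model names and ports below are placeholders.

```shell
# Hypothetical config: a Triton embedding model behind the /infer endpoint.
cat > config.yaml <<'EOF'
model_list:
  - model_name: triton-embeddings
    litellm_params:
      model: triton/my_embedding_model
      api_base: http://localhost:8000/v2/models/my_embedding_model/infer
EOF

# Call it through the proxy's OpenAI-compatible embeddings route.
curl http://localhost:4000/embeddings \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer sk-1234' \
  -d '{"model": "triton-embeddings", "input": ["good morning from litellm"]}'
```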
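For Infinity rerank models, a sketch assuming an Infinity server on its default port 7997 and the `infinity/rerank` route shown in the linked provider docs.

```shell
# Hypothetical config: an Infinity reranker exposed as "custom-infinity-rerank".
cat > config.yaml <<'EOF'
model_list:
  - model_name: custom-infinity-rerank
    litellm_params:
      model: infinity/rerank
      api_base: http://localhost:7997
EOF

# Rerank three candidate documents against a query via the proxy.
curl http://localhost:4000/rerank \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer sk-1234' \
  -d '{
    "model": "custom-infinity-rerank",
    "query": "What is the capital of France?",
    "documents": ["Paris is the capital of France.", "London is in England.", "Berlin is in Germany."],
    "top_n": 2
  }'
```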
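For the Azure Data Lake Storage logger, a sketch assuming the callback is registered as `azure_storage` and configured via the environment variables named in the linked logging docs; every value below is a placeholder, and the service principal needs write access to the storage account.

```shell
# Enable the Azure Data Lake usage logger on the proxy.
cat > config.yaml <<'EOF'
litellm_settings:
  callbacks: ["azure_storage"]
EOF

export AZURE_STORAGE_ACCOUNT_NAME="myaccount"        # placeholder storage account
export AZURE_STORAGE_FILE_SYSTEM="litellm-usage"     # placeholder Data Lake filesystem
export AZURE_STORAGE_TENANT_ID="<tenant-id>"         # service principal credentials
export AZURE_STORAGE_CLIENT_ID="<client-id>"
export AZURE_STORAGE_CLIENT_SECRET="<client-secret>"

litellm --config config.yaml --port 4000
```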