diff --git a/docs/my-website/docs/pass_through/mistral.md b/docs/my-website/docs/pass_through/mistral.md
new file mode 100644
index 0000000000..ee7ca800c4
--- /dev/null
+++ b/docs/my-website/docs/pass_through/mistral.md
@@ -0,0 +1,217 @@
+# Mistral
+
+Pass-through endpoints for Mistral - call provider-specific endpoint, in native format (no translation).
+
+| Feature | Supported | Notes |
+|-------|-------|-------|
+| Cost Tracking | ❌ | Not supported |
+| Logging | ✅ | works across all integrations |
+| End-user Tracking | ❌ | [Tell us if you need this](https://github.com/BerriAI/litellm/issues/new) |
+| Streaming | ✅ | |
+
+Just replace `https://api.mistral.ai/v1` with `LITELLM_PROXY_BASE_URL/mistral` 🚀
+
+#### **Example Usage**
+
+```bash
+curl -L -X POST 'http://0.0.0.0:4000/mistral/v1/ocr' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer sk-1234' \
+-d '{
+    "model": "mistral-ocr-latest",
+    "document": {
+        "type": "image_url",
+        "image_url": "https://raw.githubusercontent.com/mistralai/cookbook/refs/heads/main/mistral/ocr/receipt.png"
+    }
+}'
+```
+
+Supports **ALL** Mistral Endpoints (including streaming).
+
+## Quick Start
+
+Let's call the Mistral [`/chat/completions` endpoint](https://docs.mistral.ai/api/#tag/chat/operation/chat_completion_v1_chat_completions_post)
+
+1. Add MISTRAL_API_KEY to your environment
+
+```bash
+export MISTRAL_API_KEY="sk-1234"
+```
+
+2. Start LiteLLM Proxy
+
+```bash
+litellm
+
+# RUNNING on http://0.0.0.0:4000
+```
+
+3. Test it!
+
+Let's call the Mistral `/chat/completions` endpoint
+
+```bash
+curl -L -X POST 'http://0.0.0.0:4000/mistral/v1/chat/completions' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer sk-1234' \
+-d '{
+    "model": "mistral-large-latest",
+    "messages": [
+        {
+            "role": "user",
+            "content": "I am going to Paris, what should I see?"
+        }
+    ]
+}'
+```
+
+
+## Examples
+
+Anything after `http://0.0.0.0:4000/mistral` is treated as a provider-specific route, and handled accordingly.
+
+Key Changes:
+
+| **Original Endpoint** | **Replace With** |
+|------------------------------------------------------|-----------------------------------|
+| `https://api.mistral.ai/v1` | `http://0.0.0.0:4000/mistral` (LITELLM_PROXY_BASE_URL="http://0.0.0.0:4000") |
+| `bearer $MISTRAL_API_KEY` | `bearer anything` (use `bearer LITELLM_VIRTUAL_KEY` if Virtual Keys are setup on proxy) |
+
+
+### **Example 1: OCR endpoint**
+
+#### LiteLLM Proxy Call
+
+```bash
+curl -L -X POST 'http://0.0.0.0:4000/mistral/v1/ocr' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer $LITELLM_VIRTUAL_KEY' \
+-d '{
+    "model": "mistral-ocr-latest",
+    "document": {
+        "type": "image_url",
+        "image_url": "https://raw.githubusercontent.com/mistralai/cookbook/refs/heads/main/mistral/ocr/receipt.png"
+    }
+}'
+```
+
+
+#### Direct Mistral API Call
+
+```bash
+curl https://api.mistral.ai/v1/ocr \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer ${MISTRAL_API_KEY}" \
+  -d '{
+    "model": "mistral-ocr-latest",
+    "document": {
+        "type": "document_url",
+        "document_url": "https://arxiv.org/pdf/2201.04234"
+    },
+    "include_image_base64": true
+  }'
+```
+
+### **Example 2: Chat API**
+
+#### LiteLLM Proxy Call
+
+```bash
+curl -L -X POST 'http://0.0.0.0:4000/mistral/v1/chat/completions' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer $LITELLM_VIRTUAL_KEY' \
+-d '{
+    "messages": [
+        {
+            "role": "user",
+            "content": "I am going to Paris, what should I see?"
+        }
+    ],
+    "max_tokens": 2048,
+    "temperature": 0.8,
+    "top_p": 0.1,
+    "model": "mistral-large-latest"
+}'
+```
+
+#### Direct Mistral API Call
+
+```bash
+curl -L -X POST 'https://api.mistral.ai/v1/chat/completions' \
+-H 'Content-Type: application/json' \
+-H "Authorization: Bearer ${MISTRAL_API_KEY}" \
+-d '{
+    "messages": [
+        {
+            "role": "user",
+            "content": "I am going to Paris, what should I see?"
+        }
+    ],
+    "max_tokens": 2048,
+    "temperature": 0.8,
+    "top_p": 0.1,
+    "model": "mistral-large-latest"
+}'
+```
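+
+### **Example 3: Streaming (Chat API)**
+
+Streaming works over the same pass-through route - the request body is forwarded to Mistral as-is. A minimal sketch, assuming the standard Mistral `stream` parameter (the `-N` flag only disables curl's output buffering so chunks print as they arrive):
+
+#### LiteLLM Proxy Call
+
+```bash
+curl -N -L -X POST 'http://0.0.0.0:4000/mistral/v1/chat/completions' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer $LITELLM_VIRTUAL_KEY' \
+-d '{
+    "model": "mistral-large-latest",
+    "stream": true,
+    "messages": [
+        {
+            "role": "user",
+            "content": "I am going to Paris, what should I see?"
+        }
+    ]
+}'
+```
+
+The response streams back as server-sent events, just as it would when calling `https://api.mistral.ai/v1` directly.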
+
+
+## Advanced - Use with Virtual Keys
+
+Pre-requisites
+- [Setup proxy with DB](../proxy/virtual_keys.md#setup)
+
+Use this to avoid giving developers the raw Mistral API key, while still letting them use Mistral endpoints.
+
+### Usage
+
+1. Setup environment
+
+```bash
+export DATABASE_URL=""
+export LITELLM_MASTER_KEY=""
+export MISTRAL_API_KEY=""
+```
+
+```bash
+litellm
+
+# RUNNING on http://0.0.0.0:4000
+```
+
+2. Generate virtual key
+
+```bash
+curl -X POST 'http://0.0.0.0:4000/key/generate' \
+-H 'Authorization: Bearer sk-1234' \
+-H 'Content-Type: application/json' \
+-d '{}'
+```
+
+Expected Response
+
+```bash
+{
+    ...
+    "key": "sk-1234ewknldferwedojwojw"
+}
+```
+
+3. Test it!
+
+
+```bash
+curl -L -X POST 'http://0.0.0.0:4000/mistral/v1/chat/completions' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer sk-1234ewknldferwedojwojw' \
+--data '{
+    "messages": [
+        {
+            "role": "user",
+            "content": "I am going to Paris, what should I see?"
+        }
+    ],
+    "max_tokens": 2048,
+    "temperature": 0.8,
+    "top_p": 0.1,
+    "model": "mistral-large-latest"
+}'
+```
\ No newline at end of file
diff --git a/docs/my-website/docs/pass_through/vllm.md b/docs/my-website/docs/pass_through/vllm.md
new file mode 100644
index 0000000000..b267622948
--- /dev/null
+++ b/docs/my-website/docs/pass_through/vllm.md
@@ -0,0 +1,185 @@
+# VLLM
+
+Pass-through endpoints for VLLM - call provider-specific endpoint, in native format (no translation).
+
+| Feature | Supported | Notes |
+|-------|-------|-------|
+| Cost Tracking | ❌ | Not supported |
+| Logging | ✅ | works across all integrations |
+| End-user Tracking | ❌ | [Tell us if you need this](https://github.com/BerriAI/litellm/issues/new) |
+| Streaming | ✅ | |
+
+Just replace `https://my-vllm-server.com` with `LITELLM_PROXY_BASE_URL/vllm` 🚀
+
+#### **Example Usage**
+
+```bash
+curl -L -X GET 'http://0.0.0.0:4000/vllm/metrics' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer sk-1234'
+```
+
+Supports **ALL** VLLM Endpoints (including streaming).
+
+## Quick Start
+
+Let's call the VLLM [`/metrics` endpoint](https://vllm.readthedocs.io/en/latest/api_reference/api_reference.html)
+
+1. Add `HOSTED_VLLM_API_BASE` to your environment
+
+```bash
+export HOSTED_VLLM_API_BASE="https://my-vllm-server.com"
+```
+
+2. Start LiteLLM Proxy
+
+```bash
+litellm
+
+# RUNNING on http://0.0.0.0:4000
+```
+
+3. Test it!
+
+Let's call the VLLM `/metrics` endpoint
+
+```bash
+curl -L -X GET 'http://0.0.0.0:4000/vllm/metrics' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer sk-1234'
+```
+
+
+## Examples
+
+Anything after `http://0.0.0.0:4000/vllm` is treated as a provider-specific route, and handled accordingly.
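+
+For example, a `GET` to `http://0.0.0.0:4000/vllm/v1/models` is forwarded to `https://my-vllm-server.com/v1/models`. A minimal sketch, assuming your VLLM server exposes the standard OpenAI-compatible `/v1/models` route:
+
+```bash
+curl -L -X GET 'http://0.0.0.0:4000/vllm/v1/models' \
+-H 'Authorization: Bearer sk-1234'
+```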
+
+Key Changes:
+
+| **Original Endpoint** | **Replace With** |
+|------------------------------------------------------|-----------------------------------|
+| `https://my-vllm-server.com` | `http://0.0.0.0:4000/vllm` (LITELLM_PROXY_BASE_URL="http://0.0.0.0:4000") |
+| `bearer $VLLM_API_KEY` | `bearer anything` (use `bearer LITELLM_VIRTUAL_KEY` if Virtual Keys are setup on proxy) |
+
+
+### **Example 1: Metrics endpoint**
+
+#### LiteLLM Proxy Call
+
+```bash
+curl -L -X GET 'http://0.0.0.0:4000/vllm/metrics' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer $LITELLM_VIRTUAL_KEY'
+```
+
+
+#### Direct VLLM API Call
+
+```bash
+curl -L -X GET 'https://my-vllm-server.com/metrics' \
+-H 'Content-Type: application/json'
+```
+
+### **Example 2: Chat API**
+
+#### LiteLLM Proxy Call
+
+```bash
+curl -L -X POST 'http://0.0.0.0:4000/vllm/chat/completions' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer $LITELLM_VIRTUAL_KEY' \
+-d '{
+    "messages": [
+        {
+            "role": "user",
+            "content": "I am going to Paris, what should I see?"
+        }
+    ],
+    "max_tokens": 2048,
+    "temperature": 0.8,
+    "top_p": 0.1,
+    "model": "qwen2.5-7b-instruct"
+}'
+```
+
+#### Direct VLLM API Call
+
+```bash
+curl -L -X POST 'https://my-vllm-server.com/chat/completions' \
+-H 'Content-Type: application/json' \
+-d '{
+    "messages": [
+        {
+            "role": "user",
+            "content": "I am going to Paris, what should I see?"
+        }
+    ],
+    "max_tokens": 2048,
+    "temperature": 0.8,
+    "top_p": 0.1,
+    "model": "qwen2.5-7b-instruct"
+}'
+```
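+
+### **Example 3: Streaming (Chat API)**
+
+Streaming works over the same pass-through route - the request body is forwarded to your VLLM server as-is. A minimal sketch, assuming your server supports the standard OpenAI-compatible `stream` parameter on its chat completions route:
+
+#### LiteLLM Proxy Call
+
+```bash
+curl -N -L -X POST 'http://0.0.0.0:4000/vllm/chat/completions' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer $LITELLM_VIRTUAL_KEY' \
+-d '{
+    "model": "qwen2.5-7b-instruct",
+    "stream": true,
+    "messages": [
+        {
+            "role": "user",
+            "content": "I am going to Paris, what should I see?"
+        }
+    ]
+}'
+```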
+
+
+## Advanced - Use with Virtual Keys
+
+Pre-requisites
+- [Setup proxy with DB](../proxy/virtual_keys.md#setup)
+
+Use this to avoid giving developers the raw VLLM API key, while still letting them use VLLM endpoints.
+
+### Usage
+
+1. Setup environment
+
+```bash
+export DATABASE_URL=""
+export LITELLM_MASTER_KEY=""
+export HOSTED_VLLM_API_BASE=""
+```

+```bash
+litellm
+
+# RUNNING on http://0.0.0.0:4000
+```
+
+2. Generate virtual key
+
+```bash
+curl -X POST 'http://0.0.0.0:4000/key/generate' \
+-H 'Authorization: Bearer sk-1234' \
+-H 'Content-Type: application/json' \
+-d '{}'
+```
+
+Expected Response
+
+```bash
+{
+    ...
+    "key": "sk-1234ewknldferwedojwojw"
+}
+```
+
+3. Test it!
+
+
+```bash
+curl -L -X POST 'http://0.0.0.0:4000/vllm/chat/completions' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer sk-1234ewknldferwedojwojw' \
+--data '{
+    "messages": [
+        {
+            "role": "user",
+            "content": "I am going to Paris, what should I see?"
+        }
+    ],
+    "max_tokens": 2048,
+    "temperature": 0.8,
+    "top_p": 0.1,
+    "model": "qwen2.5-7b-instruct"
+}'
+```
\ No newline at end of file
diff --git a/docs/my-website/release_notes/v1.67.0-stable/index.md b/docs/my-website/release_notes/v1.67.0-stable/index.md
index 81bcfd7b2f..13bc13a56d 100644
--- a/docs/my-website/release_notes/v1.67.0-stable/index.md
+++ b/docs/my-website/release_notes/v1.67.0-stable/index.md
@@ -33,10 +33,10 @@ hide_table_of_contents: false
     2. Fix response_format check for 2025+ api versions - [PR](https://github.com/BerriAI/litellm/pull/9993)
     3. Add gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, o3-mini, o4-mini pricing
 - **VLLM**
-    1. Files - Support 'file' message type for VLLM video url's - [ADD DOCS HERE], [PR](https://github.com/BerriAI/litellm/pull/10129)
-    2. Passthrough - new `/vllm/` passthrough endpoint support [ADD DOCS HERE], [PR](https://github.com/BerriAI/litellm/pull/10002)
+    1. Files - Support 'file' message type for VLLM video URLs - [Get Started](../../docs/providers/vllm#send-video-url-to-vllm), [PR](https://github.com/BerriAI/litellm/pull/10129)
+    2. Passthrough - new `/vllm/` passthrough endpoint support [Get Started](../../docs/pass_through/vllm), [PR](https://github.com/BerriAI/litellm/pull/10002)
 - **Mistral**
-    1. new `/mistral` passthrough endpoint support [ADD DOCS HERE], [PR](https://github.com/BerriAI/litellm/pull/10002)
+    1. new `/mistral` passthrough endpoint support [Get Started](../../docs/pass_through/mistral), [PR](https://github.com/BerriAI/litellm/pull/10002)
 - **AWS**
     1. New mapped bedrock regions - [PR](https://github.com/BerriAI/litellm/pull/9430)
 - **VertexAI / Google AI Studio**
diff --git a/docs/my-website/sidebars.js b/docs/my-website/sidebars.js
index bc9182305a..2128d03a39 100644
--- a/docs/my-website/sidebars.js
+++ b/docs/my-website/sidebars.js
@@ -330,6 +330,8 @@ const sidebars = {
         "pass_through/vertex_ai",
         "pass_through/google_ai_studio",
         "pass_through/cohere",
+        "pass_through/vllm",
+        "pass_through/mistral",
         "pass_through/openai_passthrough",
         "pass_through/anthropic_completion",
         "pass_through/bedrock",