diff --git a/docs/my-website/docs/pass_through/cohere.md b/docs/my-website/docs/pass_through/cohere.md
index a74ef695ef..227ff5777a 100644
--- a/docs/my-website/docs/pass_through/cohere.md
+++ b/docs/my-website/docs/pass_through/cohere.md
@@ -4,7 +4,7 @@ Pass-through endpoints for Cohere - call provider-specific endpoint, in native f
 | Feature | Supported | Notes |
 |-------|-------|-------|
-| Cost Tracking | ✅ | works across all integrations |
+| Cost Tracking | ✅ | Supported for `/v1/chat` and `/v2/chat` |
 | Logging | ✅ | works across all integrations |
 | End-user Tracking | ❌ | [Tell us if you need this](https://github.com/BerriAI/litellm/issues/new) |
 | Streaming | ✅ | |
diff --git a/docs/my-website/docs/providers/vllm.md b/docs/my-website/docs/providers/vllm.md
index b5987167ec..5c8233b056 100644
--- a/docs/my-website/docs/providers/vllm.md
+++ b/docs/my-website/docs/providers/vllm.md
@@ -161,6 +161,120 @@ curl -L -X POST 'http://0.0.0.0:4000/embeddings' \
 
 Example Implementation from VLLM [here](https://github.com/vllm-project/vllm/pull/10020)
 
+<Tabs>
+<TabItem value="files_message" label="(Unified) Files Message">
+
+Use this to send a video url to VLLM + Gemini in the same format, using OpenAI's `files` message type.
+
+There are two ways to send a video url to VLLM:
+
+1. Pass the video url directly
+
+```
+{"type": "file", "file": {"file_id": video_url}},
+```
+
+2. Pass the video data as base64 (see the sketch after this list for producing `video_data_base64`)
+
+```
+{"type": "file", "file": {"file_data": f"data:video/mp4;base64,{video_data_base64}"}}
+```
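+
+A minimal sketch of producing `video_data_base64` for the second form, assuming an MP4 file on disk. The `encode_video` helper and the `sample.mp4` filename are illustrative, not part of the litellm API:
+
+```python
+import base64
+
+def encode_video(path: str) -> str:
+    # Read the video file as raw bytes, base64-encode it, and return a UTF-8 string.
+    with open(path, "rb") as f:
+        return base64.b64encode(f.read()).decode("utf-8")
+
+video_data_base64 = encode_video("sample.mp4")  # hypothetical local file
+video_content = {
+    "type": "file",
+    "file": {"file_data": f"data:video/mp4;base64,{video_data_base64}"},
+}
+```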
+
+<Tabs>
+<TabItem value="sdk" label="SDK">
+
+```python
+import os
+from litellm import completion
+
+messages=[
+    {
+        "role": "user",
+        "content": [
+            {
+                "type": "text",
+                "text": "Summarize the following video"
+            },
+            {
+                "type": "file",
+                "file": {
+                    "file_id": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
+                }
+            }
+        ]
+    }
+]
+
+# call vllm
+os.environ["HOSTED_VLLM_API_BASE"] = "https://hosted-vllm-api.co"
+os.environ["HOSTED_VLLM_API_KEY"] = "" # [optional], if your VLLM server requires an API key
+response = completion(
+    model="hosted_vllm/qwen", # pass the vllm model name
+    messages=messages,
+)
+
+# call gemini
+os.environ["GEMINI_API_KEY"] = "your-gemini-api-key"
+response = completion(
+    model="gemini/gemini-1.5-flash", # pass the gemini model name
+    messages=messages,
+)
+
+print(response)
+```
+
+</TabItem>
+<TabItem value="proxy" label="PROXY">
+
+1. Setup config.yaml
+
+```yaml
+model_list:
+  - model_name: my-model
+    litellm_params:
+      model: hosted_vllm/qwen # add hosted_vllm/ prefix to route as OpenAI provider
+      api_base: https://hosted-vllm-api.co # add api base for OpenAI compatible provider
+  - model_name: my-gemini-model
+    litellm_params:
+      model: gemini/gemini-1.5-flash # add gemini/ prefix to route as Google AI Studio provider
+      api_key: os.environ/GEMINI_API_KEY
+```
+
+2. Start the proxy
+
+```bash
+$ litellm --config /path/to/config.yaml
+
+# RUNNING on http://0.0.0.0:4000
+```
+
+3. Test it!
+
+```bash
+curl -X POST http://0.0.0.0:4000/chat/completions \
+-H "Authorization: Bearer sk-1234" \
+-H "Content-Type: application/json" \
+-d '{
+    "model": "my-model",
+    "messages": [
+        {"role": "user", "content":
+            [
+                {"type": "text", "text": "Summarize the following video"},
+                {"type": "file", "file": {"file_id": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}}
+            ]
+        }
+    ]
+}'
+```
+
+</TabItem>
+</Tabs>
+
+</TabItem>
+<TabItem value="video_message" label="(VLLM-specific) Video Message">
+
+Use this to send a video url to VLLM in its native message format (`video_url`).
+
 There are two ways to send a video url to VLLM:
 
 1. Pass the video url directly
@@ -249,6 +363,10 @@ curl -X POST http://0.0.0.0:4000/chat/completions \
+
+</TabItem>
+</Tabs>
+
 ## (Deprecated) for `vllm pip package`
 
 ### Using - `litellm.completion`
diff --git a/docs/my-website/release_notes/v1.67.0-stable/index.md b/docs/my-website/release_notes/v1.67.0-stable/index.md
index caabee6f1e..81bcfd7b2f 100644
--- a/docs/my-website/release_notes/v1.67.0-stable/index.md
+++ b/docs/my-website/release_notes/v1.67.0-stable/index.md
@@ -27,7 +27,7 @@ hide_table_of_contents: false
 - **Anthropic**
     1. redacted message thinking support - [Get Started](../../docs/providers/anthropic#usage---thinking--reasoning_content),[PR](https://github.com/BerriAI/litellm/pull/10129)
 - **Cohere**
-    1. `/v2/chat` Passthrough endpoint support w/ cost tracking - [ADD DOCS HERE], [PR](https://github.com/BerriAI/litellm/pull/9997)
+    1. `/v2/chat` Passthrough endpoint support w/ cost tracking - [Get Started](../../docs/pass_through/cohere), [PR](https://github.com/BerriAI/litellm/pull/9997)
 - **Azure**
     1. Support azure tenant_id/client_id env vars - [Get Started](../../docs/providers/azure#entra-id---use-tenant_id-client_id-client_secret), [PR](https://github.com/BerriAI/litellm/pull/9993)
     2. Fix response_format check for 2025+ api versions - [PR](https://github.com/BerriAI/litellm/pull/9993)