docs(vllm.md): update vllm doc to show file message type support

Krrish Dholakia 2025-04-19 14:56:36 -07:00
parent 68d7a5da04
commit df97c5faae
3 changed files with 120 additions and 2 deletions

@@ -4,7 +4,7 @@ Pass-through endpoints for Cohere - call provider-specific endpoint, in native format
| Feature | Supported | Notes |
|-------|-------|-------|
| Cost Tracking | ✅ | Supported for `/v1/chat`, and `/v2/chat` |
| Logging | ✅ | works across all integrations |
| End-user Tracking | ❌ | [Tell us if you need this](https://github.com/BerriAI/litellm/issues/new) |
| Streaming | ✅ | |
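
For example, cost tracking applies to chat calls routed through the proxy's Cohere passthrough path. A minimal sketch of such a call; the `/cohere/v2/chat` route, the `sk-1234` proxy key, and the `command-r-plus` model name are illustrative assumptions rather than values confirmed on this page:

```python
import requests

# assumed passthrough route: the proxy forwards /cohere/<native path> to Cohere
resp = requests.post(
    "http://0.0.0.0:4000/cohere/v2/chat",
    headers={
        "Authorization": "Bearer sk-1234",  # example LiteLLM proxy key
        "Content-Type": "application/json",
    },
    json={
        "model": "command-r-plus",  # example Cohere model
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(resp.json())
```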

@@ -161,6 +161,120 @@ curl -L -X POST 'http://0.0.0.0:4000/embeddings' \
Example Implementation from VLLM [here](https://github.com/vllm-project/vllm/pull/10020)
<Tabs>
<TabItem value="files_message" label="(Unified) Files Message">
Use this to send a video url to VLLM + Gemini in the same format, using OpenAI's `files` message type.
There are two ways to send a video url to VLLM:
1. Pass the video url directly
```
{"type": "file", "file": {"file_id": video_url}},
```
2. Pass the video data as base64
```
{"type": "file", "file": {"file_data": f"data:video/mp4;base64,{video_data_base64}"}}
```
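For the base64 option, the video bytes can be read and encoded with Python's standard library before being placed in the content part. A minimal sketch; the local `video.mp4` path is only a placeholder:
```python
import base64

# read a local video file (placeholder path) and build the base64 `file_data` content part
with open("video.mp4", "rb") as f:
    video_data_base64 = base64.b64encode(f.read()).decode("utf-8")

video_part = {
    "type": "file",
    "file": {"file_data": f"data:video/mp4;base64,{video_data_base64}"},
}
```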
<Tabs>
<TabItem value="sdk" label="SDK">
```python
import os
from litellm import completion

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Summarize the following video"
            },
            {
                "type": "file",
                "file": {
                    "file_id": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
                }
            }
        ]
    }
]

# call vllm
os.environ["HOSTED_VLLM_API_BASE"] = "https://hosted-vllm-api.co"
os.environ["HOSTED_VLLM_API_KEY"] = "" # [optional], if your VLLM server requires an API key

response = completion(
    model="hosted_vllm/qwen", # pass the vllm model name
    messages=messages,
)

# call gemini
os.environ["GEMINI_API_KEY"] = "your-gemini-api-key"

response = completion(
    model="gemini/gemini-1.5-flash", # pass the gemini model name
    messages=messages,
)

print(response)
```
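Both calls return an OpenAI-format response object, so the generated summary can be read from `response.choices[0].message.content` regardless of which provider served the request.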
</TabItem>
<TabItem value="proxy" label="PROXY">
1. Setup config.yaml
```yaml
model_list:
  - model_name: my-model
    litellm_params:
      model: hosted_vllm/qwen # add hosted_vllm/ prefix to route as OpenAI provider
      api_base: https://hosted-vllm-api.co # add api base for OpenAI compatible provider
  - model_name: my-gemini-model
    litellm_params:
      model: gemini/gemini-1.5-flash # add gemini/ prefix to route as Google AI Studio provider
      api_key: os.environ/GEMINI_API_KEY
```
2. Start the proxy
```bash
$ litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
3. Test it!
```bash
curl -X POST http://0.0.0.0:4000/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Summarize the following video"},
          {"type": "file", "file": {"file_id": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}}
        ]
      }
    ]
  }'
```
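
The same request can also be sent from code. A sketch using the openai Python client pointed at the proxy; the base URL and `sk-1234` key are the example values from the steps above:

```python
from openai import OpenAI

# point the OpenAI client at the LiteLLM proxy started above
client = OpenAI(base_url="http://0.0.0.0:4000", api_key="sk-1234")

response = client.chat.completions.create(
    model="my-model",  # the model_name from config.yaml
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize the following video"},
                {"type": "file", "file": {"file_id": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```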
</TabItem>
</Tabs>
</TabItem>
<TabItem value="video_url" label="(VLLM-specific) Video Message">
Use this to send a video url to VLLM in its native message format (`video_url`).
There are two ways to send a video url to VLLM:
1. Pass the video url directly
@@ -249,6 +363,10 @@ curl -X POST http://0.0.0.0:4000/chat/completions \
</Tabs>
</TabItem>
</Tabs>
## (Deprecated) for `vllm pip package`
### Using - `litellm.completion`

@@ -27,7 +27,7 @@ hide_table_of_contents: false
- **Anthropic**
    1. redacted message thinking support - [Get Started](../../docs/providers/anthropic#usage---thinking--reasoning_content), [PR](https://github.com/BerriAI/litellm/pull/10129)
- **Cohere**
    1. `/v2/chat` Passthrough endpoint support w/ cost tracking - [Get Started](../../docs/pass_through/cohere), [PR](https://github.com/BerriAI/litellm/pull/9997)
- **Azure**
    1. Support azure tenant_id/client_id env vars - [Get Started](../../docs/providers/azure#entra-id---use-tenant_id-client_id-client_secret), [PR](https://github.com/BerriAI/litellm/pull/9993)
    2. Fix response_format check for 2025+ api versions - [PR](https://github.com/BerriAI/litellm/pull/9993)