Mirror of https://github.com/BerriAI/litellm.git (synced 2025-04-25 18:54:30 +00:00)
docs(vllm.md): update vllm doc to show file message type support
This commit is contained in:
parent 68d7a5da04 · commit df97c5faae

3 changed files with 120 additions and 2 deletions

@@ -4,7 +4,7 @@ Pass-through endpoints for Cohere - call provider-specific endpoint, in native f

 | Feature | Supported | Notes |
 |-------|-------|-------|
-| Cost Tracking | ✅ | works across all integrations |
+| Cost Tracking | ✅ | Supported for `/v1/chat` and `/v2/chat` |
 | Logging | ✅ | works across all integrations |
 | End-user Tracking | ❌ | [Tell us if you need this](https://github.com/BerriAI/litellm/issues/new) |
 | Streaming | ✅ | |
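
Cost tracking for the Cohere passthrough now covers `/v1/chat` and `/v2/chat`. For illustration, a minimal sketch of such a `/v2/chat` request through the proxy (the `/cohere/v2/chat` route shape, `sk-1234` key, and `command-r` model are assumptions based on the proxy's passthrough conventions, not shown in this diff):

```python
import requests

# Hypothetical passthrough call to the Cohere-native /v2/chat route on the proxy
resp = requests.post(
    "http://0.0.0.0:4000/cohere/v2/chat",
    headers={"Authorization": "Bearer sk-1234", "Content-Type": "application/json"},
    json={"model": "command-r", "messages": [{"role": "user", "content": "Hello!"}]},
)
print(resp.json())
```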

@@ -161,6 +161,120 @@ curl -L -X POST 'http://0.0.0.0:4000/embeddings' \

Example Implementation from VLLM [here](https://github.com/vllm-project/vllm/pull/10020)

<Tabs>
<TabItem value="files_message" label="(Unified) Files Message">

Use this to send a video url to VLLM + Gemini in the same format, using OpenAI's `files` message type.

There are two ways to send a video url to VLLM:

1. Pass the video url directly

```
{"type": "file", "file": {"file_id": video_url}}
```

2. Pass the video data as base64

```
{"type": "file", "file": {"file_data": f"data:video/mp4;base64,{video_data_base64}"}}
```
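
If you need to produce `video_data_base64` yourself, a minimal sketch (the local `my_video.mp4` path is a hypothetical placeholder):

```python
import base64

# Hypothetical local video file, for illustration only
with open("my_video.mp4", "rb") as f:
    video_data_base64 = base64.b64encode(f.read()).decode("utf-8")

# Plug into the unified files message shown above
file_block = {
    "type": "file",
    "file": {"file_data": f"data:video/mp4;base64,{video_data_base64}"},
}
```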

<Tabs>
<TabItem value="sdk" label="SDK">

```python
import os

from litellm import completion

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Summarize the following video"
            },
            {
                "type": "file",
                "file": {
                    "file_id": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
                }
            }
        ]
    }
]

# call vllm
os.environ["HOSTED_VLLM_API_BASE"] = "https://hosted-vllm-api.co"
os.environ["HOSTED_VLLM_API_KEY"] = ""  # [optional], if your VLLM server requires an API key
response = completion(
    model="hosted_vllm/qwen",  # pass the vllm model name
    messages=messages,
)

# call gemini
os.environ["GEMINI_API_KEY"] = "your-gemini-api-key"
response = completion(
    model="gemini/gemini-1.5-flash",  # pass the gemini model name
    messages=messages,
)

print(response)
```
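
LiteLLM returns responses in the OpenAI format, so the generated summary itself can be read from the first choice:

```python
print(response.choices[0].message.content)
```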

</TabItem>
<TabItem value="proxy" label="PROXY">

1. Setup config.yaml

```yaml
model_list:
  - model_name: my-model
    litellm_params:
      model: hosted_vllm/qwen              # add hosted_vllm/ prefix to route as OpenAI provider
      api_base: https://hosted-vllm-api.co # add api base for OpenAI compatible provider
  - model_name: my-gemini-model
    litellm_params:
      model: gemini/gemini-1.5-flash       # add gemini/ prefix to route as Google AI Studio provider
      api_key: os.environ/GEMINI_API_KEY
```
|
||||||
|
|
||||||
|
2. Start the proxy
|
||||||
|
|
||||||
|
```bash
|
||||||
|
$ litellm --config /path/to/config.yaml
|
||||||
|
|
||||||
|
# RUNNING on http://0.0.0.0:4000
|
||||||
|
```
|
||||||
|
|
||||||
|
3. Test it!
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl -X POST http://0.0.0.0:4000/chat/completions \
|
||||||
|
-H "Authorization: Bearer sk-1234" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{
|
||||||
|
"model": "my-model",
|
||||||
|
"messages": [
|
||||||
|
{"role": "user", "content":
|
||||||
|
[
|
||||||
|
{"type": "text", "text": "Summarize the following video"},
|
||||||
|
{"type": "file", "file": {"file_id": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}'
|
||||||
|
```
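
Because the proxy exposes an OpenAI-compatible `/chat/completions` endpoint, the same request can also be sent with the `openai` Python SDK; a minimal sketch, assuming the proxy address and `sk-1234` key from the steps above:

```python
from openai import OpenAI

# Point the OpenAI client at the LiteLLM proxy
client = OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="my-model",  # the model_name from config.yaml
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize the following video"},
                {"type": "file", "file": {"file_id": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```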

</TabItem>
</Tabs>

</TabItem>
<TabItem value="video_url" label="(VLLM-specific) Video Message">

Use this to send a video url to VLLM in its native message format (`video_url`).

There are two ways to send a video url to VLLM:

1. Pass the video url directly

@@ -249,6 +363,10 @@ curl -X POST http://0.0.0.0:4000/chat/completions \

</Tabs>

</TabItem>
</Tabs>

## (Deprecated) for `vllm pip package`

### Using - `litellm.completion`

@@ -27,7 +27,7 @@ hide_table_of_contents: false

 - **Anthropic**
     1. redacted message thinking support - [Get Started](../../docs/providers/anthropic#usage---thinking--reasoning_content), [PR](https://github.com/BerriAI/litellm/pull/10129)
 - **Cohere**
-    1. `/v2/chat` Passthrough endpoint support w/ cost tracking - [ADD DOCS HERE], [PR](https://github.com/BerriAI/litellm/pull/9997)
+    1. `/v2/chat` Passthrough endpoint support w/ cost tracking - [Get Started](../../docs/pass_through/cohere), [PR](https://github.com/BerriAI/litellm/pull/9997)
 - **Azure**
     1. Support azure tenant_id/client_id env vars - [Get Started](../../docs/providers/azure#entra-id---use-tenant_id-client_id-client_secret), [PR](https://github.com/BerriAI/litellm/pull/9993)
     2. Fix response_format check for 2025+ api versions - [PR](https://github.com/BerriAI/litellm/pull/9993)