Litellm dev 01 27 2025 p3 (#8047)

* docs(reliability.md): add doc on disabling fallbacks per request

* feat(litellm_pre_call_utils.py): support reading request timeout from request headers - new `x-litellm-timeout` param

Allows setting dynamic model timeouts from Vercel's AI SDK

* test(test_proxy_server.py): add simple unit test for reading request timeout

* test(test_fallbacks.py): add e2e test to confirm timeout passed in request headers is correctly read

* feat(main.py): support passing metadata to openai in preview

Resolves https://github.com/BerriAI/litellm/issues/6022#issuecomment-2616119371

* fix(main.py): fix passing openai metadata

* docs(request_headers.md): document new request headers

* build: Merge branch 'main' into litellm_dev_01_27_2025_p3

* test: loosen test
Krish Dholakia 2025-01-28 18:01:27 -08:00 committed by GitHub
parent 9c20c69915
commit d9eb8f42ff
11 changed files with 187 additions and 3 deletions


@ -1007,7 +1007,34 @@ curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
}'
```
### Disable Fallbacks per key
### Disable Fallbacks (Per Request/Key)
<Tabs>
<TabItem value="request" label="Per Request">
You can disable fallbacks per request by setting `disable_fallbacks: true` in your request body.
```bash
curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
"messages": [
{
"role": "user",
"content": "List 5 important events in the XIX century"
}
],
"model": "gpt-3.5-turbo",
"disable_fallbacks": true
}'
```
</TabItem>
<TabItem value="key" label="Per Key">
You can disable fallbacks per key by setting `disable_fallbacks: true` in your key metadata.
@ -1020,4 +1047,7 @@ curl -L -X POST 'http://0.0.0.0:4000/key/generate' \
"disable_fallbacks": true
}
}'
```
```
</TabItem>
</Tabs>
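
The per-request option can also be set from a Python client. The following is a minimal sketch using only the standard library; it assumes the same proxy URL, key, and model as the curl example, and `build_chat_request` is a hypothetical helper for illustration, not part of LiteLLM.

```python
import json
import urllib.request

# Assumed values, copied from the curl example in this section.
PROXY_URL = "http://0.0.0.0:4000/v1/chat/completions"
API_KEY = "sk-1234"


def build_chat_request(prompt: str, disable_fallbacks: bool = False) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) a chat completion request."""
    body = {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": prompt}],
    }
    if disable_fallbacks:
        # The per-request switch goes in the JSON body, next to "model".
        body["disable_fallbacks"] = True
    return urllib.request.Request(
        PROXY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


request = build_chat_request(
    "List 5 important events in the XIX century",
    disable_fallbacks=True,
)

# Sending requires a running proxy, so it is left commented out here:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to check the payload before any network call is made.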


@ -0,0 +1,12 @@
# Request Headers
Special headers that are supported by LiteLLM.
## LiteLLM Headers
`x-litellm-timeout` Optional[float]: The timeout for the request in seconds.
## Anthropic Headers
`anthropic-version` Optional[str]: The version of the Anthropic API to use.
`anthropic-beta` Optional[str]: The beta version of the Anthropic API to use.
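
As an illustration of the `x-litellm-timeout` header, here is a minimal sketch of a Python client that attaches it to a request. The proxy URL, key, and model are assumptions borrowed from the curl examples in the reliability docs, and `build_request_with_timeout` is a hypothetical helper, not a LiteLLM API.

```python
import json
import urllib.request


def build_request_with_timeout(
    url: str,
    api_key: str,
    body: dict,
    timeout_seconds: float,
) -> urllib.request.Request:
    """Hypothetical helper: attach x-litellm-timeout to a proxy request."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
        # Header values are strings; the value is the timeout in seconds.
        "x-litellm-timeout": str(timeout_seconds),
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


timed_request = build_request_with_timeout(
    url="http://0.0.0.0:4000/v1/chat/completions",  # assumed proxy address
    api_key="sk-1234",  # assumed key
    body={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout_seconds=0.5,
)

# Sending requires a running proxy, so it is left commented out here:
# with urllib.request.urlopen(timed_request) as response:
#     print(json.load(response))
```

Because the timeout travels as a header rather than in the body, it can be set by any client that allows custom headers, which is the use case the commit message describes for Vercel's AI SDK.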


@ -66,6 +66,7 @@ const sidebars = {
"proxy/user_keys",
"proxy/clientside_auth",
"proxy/response_headers",
"proxy/request_headers",
],
},
{