forked from phoenix/litellm-mirror
docs(docker_quick_start.md): add new quick start doc for litellm proxy
This commit is contained in:
parent 4b7ceade64
commit 601945d114
5 changed files with 366 additions and 9 deletions

````diff
@@ -399,6 +399,8 @@ The proxy provides:
 
 ### 📖 Proxy Endpoints - [Swagger Docs](https://litellm-api.up.railway.app/)
 
+Go here for a complete tutorial with keys + rate limits - [**here**](./proxy/docker_quick_start.md)
+
 ### Quick Start Proxy - CLI
 
 ```shell
````

````diff
@@ -433,4 +435,5 @@ print(response)
 - [exception mapping](./exception_mapping.md)
 - [retries + model fallbacks for completion()](./completion/reliable_completions.md)
-- [proxy virtual keys & spend management](./tutorials/fallbacks.md)
+- [proxy virtual keys & spend management](./proxy/virtual_keys.md)
+- [E2E Tutorial for LiteLLM Proxy Server](./proxy/docker_quick_start.md)
````

docs/my-website/docs/proxy/docker_quick_start.md (new file, 355 lines):

# Getting Started - E2E Tutorial

End-to-End tutorial for LiteLLM Proxy to:

- Add an Azure OpenAI model
- Make a successful /chat/completion call
- Generate a virtual key
- Set an RPM limit on the virtual key

## Pre-Requisites

- Install the LiteLLM Docker image:

```shell
docker pull ghcr.io/berriai/litellm:main-latest
```

[**See all docker images**](https://github.com/orgs/BerriAI/packages)

## 1. Add a model

Control LiteLLM Proxy with a config.yaml file.

Set up your config.yaml with your Azure model:

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/my_azure_deployment
      api_base: os.environ/AZURE_API_BASE
      api_key: "os.environ/AZURE_API_KEY"
      api_version: "2024-07-01-preview" # [OPTIONAL] litellm uses the latest azure api_version by default
```

---

### Model List Specification

- **`model_name`** (`str`) - The model name clients send in the request's `model` field; requests for this name are routed to the deployment(s) configured under it (see the sketch after this list).
- **`litellm_params`** (`dict`) - [See all LiteLLM params](https://github.com/BerriAI/litellm/blob/559a6ad826b5daef41565f54f06c739c8c068b28/litellm/types/router.py#L222)
  - **`model`** (`str`) - Specifies the model name to be sent to `litellm.acompletion` / `litellm.aembedding`, etc. This is the identifier used by LiteLLM to route to the correct model + provider logic on the backend.
  - **`api_key`** (`str`) - The API key required for authentication. It can be retrieved from an environment variable using `os.environ/`.
  - **`api_base`** (`str`) - The API base for your Azure deployment.
  - **`api_version`** (`str`) - The API version to use when calling Azure's OpenAI API. Get the latest inference API version [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/api-version-deprecation?source=recommendations#latest-preview-api-releases).
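
Entries that share a `model_name` form one group, and the proxy load-balances requests across them. A minimal sketch, assuming a hypothetical second deployment (`my_azure_deployment_eu`) and env vars (`AZURE_API_BASE_EU` / `AZURE_API_KEY_EU`):

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/my_azure_deployment
      api_base: os.environ/AZURE_API_BASE
      api_key: "os.environ/AZURE_API_KEY"
  - model_name: gpt-3.5-turbo # same model_name -> requests are load-balanced across both entries
    litellm_params:
      model: azure/my_azure_deployment_eu # hypothetical second deployment
      api_base: os.environ/AZURE_API_BASE_EU # hypothetical env var
      api_key: "os.environ/AZURE_API_KEY_EU" # hypothetical env var
```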

### Useful Links

- [**All Supported LLM API Providers (OpenAI/Bedrock/Vertex/etc.)**](../providers/)
- [**Full Config.Yaml Spec**](./configs.md)
- [**Pass provider-specific params**](../completion/provider_specific_params.md#proxy-usage)

## 2. Make a successful /chat/completion call

LiteLLM Proxy is 100% OpenAI-compatible. Test your Azure model via the `/chat/completions` route.

### 2.1 Start Proxy

Save your config.yaml from step 1 as `litellm_config.yaml`.

```bash
docker run \
    -v $(pwd)/litellm_config.yaml:/app/config.yaml \
    -e AZURE_API_KEY=d6*********** \
    -e AZURE_API_BASE=https://openai-***********/ \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-latest \
    --config /app/config.yaml --detailed_debug

# RUNNING on http://0.0.0.0:4000
```

Confirm your config.yaml was mounted correctly:

```bash
Loaded config YAML (api_key and environment_variables are not shown):
{
  "model_list": [
    {
      "model_name ...
```

### 2.2 Make Call

```bash
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful math tutor. Guide the user through the solution step by step."
      },
      {
        "role": "user",
        "content": "how can I solve 8x + 7 = -23"
      }
    ]
}'
```

**Expected Response**

```bash
{
  "id": "chatcmpl-2076f062-3095-4052-a520-7c321c115c68",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "I am gpt-3.5-turbo",
        "role": "assistant",
        "tool_calls": null,
        "function_call": null
      }
    }
  ],
  "created": 1724962831,
  "model": "gpt-3.5-turbo",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 20,
    "prompt_tokens": 10,
    "total_tokens": 30
  }
}
```
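
Because the proxy is OpenAI-compatible, the same call works from the OpenAI Python SDK by pointing `base_url` at the proxy. A minimal sketch, assuming `pip install openai` and the proxy from step 2.1 running locally:

```python
# Sketch: the same /chat/completions call via the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI(
    api_key="sk-1234",               # proxy key (master key here; a virtual key also works)
    base_url="http://0.0.0.0:4000",  # route requests through LiteLLM Proxy
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # matches model_name in config.yaml
    messages=[
        {"role": "system", "content": "You are a helpful math tutor. Guide the user through the solution step by step."},
        {"role": "user", "content": "how can I solve 8x + 7 = -23"},
    ],
)
print(response.choices[0].message.content)
```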

### Useful Links

- [All Supported LLM API Providers (OpenAI/Bedrock/Vertex/etc.)](../providers/)
- [Call LiteLLM Proxy via OpenAI SDK, Langchain, etc.](./user_keys.md#request-format)
- [All API Endpoints Swagger](https://litellm-api.up.railway.app/#/chat%2Fcompletions)
- [Other/Non-Chat Completion Endpoints](../embedding/supported_embedding.md)
- [Pass-through for VertexAI, Bedrock, etc.](../pass_through/vertex_ai.md)

## 3. Generate a virtual key

Track spend and control model access via virtual keys for the proxy.

### 3.1 Set up a Database

**Requirements**

- A Postgres database (e.g. [Supabase](https://supabase.com/), [Neon](https://neon.tech/), etc.)

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/my_azure_deployment
      api_base: os.environ/AZURE_API_BASE
      api_key: "os.environ/AZURE_API_KEY"
      api_version: "2024-07-01-preview" # [OPTIONAL] litellm uses the latest azure api_version by default

general_settings:
  master_key: sk-1234
  database_url: "postgresql://<user>:<password>@<host>:<port>/<dbname>" # 👈 KEY CHANGE
```

Save config.yaml as `litellm_config.yaml` (used in 3.2).

---

**What is `general_settings`?**

These are settings for the LiteLLM Proxy Server.

See all general settings [here](./configs.md#all-settings).

1. **`master_key`** (`str`)
    - **Description**:
        - Set a `master key`; this is your Proxy Admin key - you can use it to create other keys (🚨 it must start with `sk-`).
    - **Usage**:
        - **Set on config.yaml**: set your master key under `general_settings:master_key`, e.g. `master_key: sk-1234`
        - **Set env variable**: set `LITELLM_MASTER_KEY`

2. **`database_url`** (`str`)
    - **Description**:
        - Set a `database_url`; this is the connection to your Postgres DB, which LiteLLM uses for generating keys, users, and teams.
    - **Usage**:
        - **Set on config.yaml**: set your database URL under `general_settings:database_url`, e.g. `database_url: "postgresql://..."`
        - **Set env variable**: set `DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname>` in your env (see the sketch after this list)
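
For example, a minimal sketch (placeholder credentials) of setting both via environment variables instead of config.yaml:

```bash
# Placeholder values - substitute your own key and Postgres credentials
export LITELLM_MASTER_KEY="sk-1234"
export DATABASE_URL="postgresql://myuser:mypassword@localhost:5432/litellm"
```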

### 3.2 Start Proxy

```bash
docker run \
    -v $(pwd)/litellm_config.yaml:/app/config.yaml \
    -e AZURE_API_KEY=d6*********** \
    -e AZURE_API_BASE=https://openai-***********/ \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-latest \
    --config /app/config.yaml --detailed_debug
```

### 3.3 Create Key w/ RPM Limit

Create a key with `rpm_limit: 1`. This allows only 1 request per minute for proxy calls made with this key.

```bash
curl -L -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
    "rpm_limit": 1
}'
```

[**See full API Spec**](https://litellm-api.up.railway.app/#/key%20management/generate_key_fn_key_generate_post)
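
The same endpoint accepts further controls. A sketch (an assumed parameter mix - check the API spec above) that also restricts the key to one model and caps its spend:

```bash
curl -L -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
    "models": ["gpt-3.5-turbo"],
    "rpm_limit": 1,
    "max_budget": 10
}'
```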

**Expected Response**

```bash
{
    "key": "sk-12..."
}
```

### 3.4 Test it!

**Use your virtual key from step 3.3**

1st call - Expect to work!

```bash
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-12...' \
-d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful math tutor. Guide the user through the solution step by step."
      },
      {
        "role": "user",
        "content": "how can I solve 8x + 7 = -23"
      }
    ]
}'
```

**Expected Response**

```bash
{
  "id": "chatcmpl-2076f062-3095-4052-a520-7c321c115c68",
  "choices": [
    ...
}
```

2nd call - Expect to fail!

**Why did this call fail?**

We set the virtual key's requests-per-minute (RPM) limit to 1. The first call used it up, so this second call exceeds the limit.

```bash
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-12...' \
-d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful math tutor. Guide the user through the solution step by step."
      },
      {
        "role": "user",
        "content": "how can I solve 8x + 7 = -23"
      }
    ]
}'
```

**Expected Response**

```bash
{
  "error": {
    "message": "Max parallel request limit reached. Hit limit for api_key: daa1b272072a4c6841470a488c5dad0f298ff506e1cc935f4a181eed90c182ad. tpm_limit: 100, current_tpm: 29, rpm_limit: 1, current_rpm: 2.",
    "type": "None",
    "param": "None",
    "code": "429"
  }
}
```

### Useful Links

- [Creating Virtual Keys](./virtual_keys.md)
- [Key Management API Endpoints Swagger](https://litellm-api.up.railway.app/#/key%20management)
- [Set Budgets / Rate Limits per key/user/teams](./users.md)
- [Dynamic TPM/RPM Limits for keys](./team_budgets.md#dynamic-tpmrpm-allocation)

## Troubleshooting

### Non-root docker image?

If you need to run the docker image as a non-root user, use [this image](https://github.com/BerriAI/litellm/pkgs/container/litellm-non_root).

### SSL Verification Issue

If you see

```bash
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1006)
```

you can disable SSL verification with:

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/my_azure_deployment
      api_base: os.environ/AZURE_API_BASE
      api_key: "os.environ/AZURE_API_KEY"
      api_version: "2024-07-01-preview"

litellm_settings:
  ssl_verify: false # 👈 KEY CHANGE
```

**What is `litellm_settings`?**

LiteLLM Proxy uses the [LiteLLM Python SDK](https://docs.litellm.ai/docs/routing) for handling LLM API calls.

`litellm_settings` are module-level params for the LiteLLM Python SDK (equivalent to setting `litellm.<some_param>` on the SDK). You can see all params [here](https://github.com/BerriAI/litellm/blob/208fe6cb90937f73e0def5c97ccb2359bf8a467b/litellm/__init__.py#L114).
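
For instance, a minimal sketch of the SDK-level equivalent of the `ssl_verify` setting above (assuming `pip install litellm`):

```python
# Sketch: litellm_settings keys map to module-level params on the SDK.
import litellm

litellm.ssl_verify = False  # same effect as `litellm_settings: ssl_verify: false` in config.yaml
```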

## Support & Talk with founders

- [Schedule Demo 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai

[](https://wa.link/huol9n) [](https://discord.gg/wuPM9dRgDw)

````diff
@@ -283,8 +283,6 @@ litellm_settings:
 
 **Covers all errors (429, 500, etc.)**
 
-[**See Code**]()
-
 **Set via config**
 ```yaml
 model_list:
````

````diff
@@ -29,6 +29,7 @@ const sidebars = {
       },
       items: [
         "proxy/quick_start",
+        "proxy/docker_quick_start",
         "proxy/deploy",
         "proxy/prod",
         {
````

````diff
@@ -1,9 +1,9 @@
 model_list:
-  - model_name: fake-openai-endpoint
+  - model_name: my-fake-openai-endpoint
     litellm_params:
-      model: openai/my-fake-model
-      api_key: my-fake-key
-      api_base: https://exampleopenaiendpoint-production.up.railway.app/
+      model: gpt-3.5-turbo
+      api_key: "my-fake-key"
+      mock_response: "hello-world"
 
 litellm_settings:
   ssl_verify: false
````