forked from phoenix/litellm-mirror
docs(routing.md): fix deployment tutorial
parent b288db52fc
commit aec4de68a7
1 changed file with 21 additions and 35 deletions
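For context on what this commit replaces: the old tutorial configured deployments as a Python `model_list` (the same shape passed to `litellm`'s `Router(model_list=...)`, as the hunk header below shows). A minimal sketch of that structure, with placeholder credentials, mirroring the entries in the removed docs:

```python
# Two deployments of the same logical model; keys mirror the model_list in the
# removed docs below. All api_key/api_base values here are placeholders.
model_list = [
    {
        "model_name": "gpt-3.5-turbo",          # name callers use
        "litellm_params": {                     # params for the completion/embedding call
            "model": "azure/chatgpt-v-2",       # Azure deployment name
            "api_key": "my-azure-api-key-1",    # placeholder
            "api_version": "my-azure-api-version-1",
            "api_base": "my-azure-api-base-1",
        },
        "tpm": 240000,  # tokens-per-minute limit for this deployment
        "rpm": 1800,    # requests-per-minute limit
    },
    {
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {"model": "gpt-3.5-turbo", "api_key": "sk-placeholder"},
        "tpm": 1000000,
        "rpm": 9000,
    },
]
```

The router load-balances across all entries sharing a `model_name`, using the `tpm`/`rpm` limits to pick a deployment.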
@@ -67,50 +67,36 @@ router = Router(model_list=model_list,
 print(response)
 ```
 
-## Handle Multiple Azure Deployments via OpenAI Proxy Server
+## Deploy Router
 
-#### 1. Clone repo
+1. Clone repo
 ```shell
-git clone https://github.com/BerriAI/litellm.git
+git clone https://github.com/BerriAI/litellm
 ```
 
-#### 2. Add Azure/OpenAI deployments to `secrets_template.toml`
-```python
-[model."gpt-3.5-turbo"] # model name passed in /chat/completion call or `litellm --model gpt-3.5-turbo`
-model_list = [{ # list of model deployments
-    "model_name": "gpt-3.5-turbo", # openai model name
-    "litellm_params": { # params for litellm completion/embedding call
-        "model": "azure/chatgpt-v-2",
-        "api_key": "my-azure-api-key-1",
-        "api_version": "my-azure-api-version-1",
-        "api_base": "my-azure-api-base-1"
-    },
-    "tpm": 240000,
-    "rpm": 1800
-}, {
-    "model_name": "gpt-3.5-turbo", # openai model name
-    "litellm_params": { # params for litellm completion/embedding call
-        "model": "gpt-3.5-turbo",
-        "api_key": "sk-...",
-    },
-    "tpm": 1000000,
-    "rpm": 9000
-}]
-```
+2. Create + Modify router_config.yaml (save your azure/openai/etc. deployment info)
+```shell
+cp ./router_config_template.yaml ./router_config.yaml
+```
 
-#### 3. Run with Docker Image
-```shell
-docker build -t litellm . && docker run -p 8000:8000 litellm
-
-## OpenAI Compatible Endpoint at: http://0.0.0.0:8000
-```
+3. Build + Run docker image
+
+```shell
+docker build -t litellm-proxy . --build-arg CONFIG_FILE=./router_config.yaml
+```
 
-**replace openai base**
-```python
-import openai
-openai.api_base = "http://0.0.0.0:8000"
-print(openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=[{"role":"user", "content":"Hey!"}]))
-```
+```shell
+docker run --name litellm-proxy -e PORT=8000 -p 8000:8000 litellm-proxy
+```
+
+### Test
+
+```curl
+curl 'http://0.0.0.0:8000/router/completions' \
+--header 'Content-Type: application/json' \
+--data '{
+    "model": "gpt-3.5-turbo",
+    "messages": [{"role": "user", "content": "Hey"}]
+}'
+```
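The curl test in the new docs can also be exercised from Python. A minimal sketch: the `build_completion_request` helper is hypothetical (not part of litellm), and the commented-out live call assumes the container from step 3 is running on port 8000.

```python
import json

def build_completion_request(model, user_message):
    """Build the JSON body the proxy's /router/completions endpoint
    expects, mirroring the curl example above."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

payload = build_completion_request("gpt-3.5-turbo", "Hey")
body = json.dumps(payload)

# To actually send it (requires the docker container from step 3 to be up):
# import urllib.request
# req = urllib.request.Request(
#     "http://0.0.0.0:8000/router/completions",
#     data=body.encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode("utf-8"))
```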