Commit 1dcbfda202: Merge branch 'main' into litellm_dev_11_13_2024

76 changed files with 2836 additions and 560 deletions
@@ -75,6 +75,7 @@ Works for:
- Google AI Studio - Gemini models
- Vertex AI models (Gemini + Anthropic)
- Bedrock Models
- Anthropic API Models

<Tabs>
<TabItem value="sdk" label="SDK">
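A minimal sketch of a JSON-mode request, assuming `ANTHROPIC_API_KEY` is set (any provider from the list above works the same way):

```python
from litellm import completion

# request JSON output via the OpenAI-style response_format param
response = completion(
    model="claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "List three fruits as a JSON object."}],
    response_format={"type": "json_object"},
)
print(response.choices[0].message.content)
```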
@@ -93,7 +93,7 @@ curl http://0.0.0.0:4000/v1/chat/completions \

## Check Model Support

Call `litellm.get_model_info` to check if a model/provider supports `prefix`.

<Tabs>
<TabItem value="sdk" label="SDK">
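A minimal sketch of such a check (the `supports_assistant_prefill` field name is an assumption based on LiteLLM's model info schema):

```python
from litellm import get_model_info

info = get_model_info(model="deepseek/deepseek-chat")

# True if the model accepts a prefilled (prefixed) assistant message
assert info.get("supports_assistant_prefill") is True
```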
@@ -116,4 +116,4 @@ curl -X GET 'http://0.0.0.0:4000/v1/model/info' \
  -H 'Authorization: Bearer $LITELLM_KEY' \
```
</TabItem>
</Tabs>
@@ -957,3 +957,69 @@ curl http://0.0.0.0:4000/v1/chat/completions \
```
</TabItem>
</Tabs>

## Usage - passing 'user_id' to Anthropic

LiteLLM translates the OpenAI `user` param to Anthropic's `metadata[user_id]` param.

<Tabs>
<TabItem value="sdk" label="SDK">

```python
response = completion(
    model="claude-3-5-sonnet-20240620",
    messages=messages,
    user="user_123",
)
```
</TabItem>
<TabItem value="proxy" label="PROXY">

1. Setup config.yaml

```yaml
model_list:
  - model_name: claude-3-5-sonnet-20240620
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY
```

2. Start Proxy

```
litellm --config /path/to/config.yaml
```

3. Test it!

```bash
curl http://0.0.0.0:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR-LITELLM-KEY>" \
  -d '{
    "model": "claude-3-5-sonnet-20240620",
    "messages": [{"role": "user", "content": "What is Anthropic?"}],
    "user": "user_123"
  }'
```

</TabItem>
</Tabs>

## All Supported OpenAI Params

```
"stream",
"stop",
"temperature",
"top_p",
"max_tokens",
"max_completion_tokens",
"tools",
"tool_choice",
"extra_headers",
"parallel_tool_calls",
"response_format",
"user"
```
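A short sketch combining several of these params in one `completion` call (model and values are illustrative):

```python
from litellm import completion

response = completion(
    model="claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "What is Anthropic?"}],
    max_tokens=256,
    temperature=0.2,
    top_p=0.9,
    stop=["\n\nHuman:"],
    user="user_123",
)
print(response.choices[0].message.content)
```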
@@ -37,7 +37,7 @@ os.environ["HUGGINGFACE_API_KEY"] = "huggingface_api_key"
messages = [{ "content": "There's a llama in my garden 😱 What should I do?","role": "user"}]

# e.g. Call 'https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct' from Serverless Inference API
response = completion(
    model="huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=[{ "content": "Hello, how are you?","role": "user"}],
    stream=True
@@ -165,14 +165,14 @@ Steps to use

```python
import os
from litellm import completion

os.environ["HUGGINGFACE_API_KEY"] = ""

# TGI model: Call https://huggingface.co/glaiveai/glaive-coder-7b
# add the 'huggingface/' prefix to the model to set huggingface as the provider
# set api base to your deployed api endpoint from hugging face
response = completion(
    model="huggingface/glaiveai/glaive-coder-7b",
    messages=[{ "content": "Hello, how are you?","role": "user"}],
    api_base="https://wjiegasee9bmqke2.us-east-1.aws.endpoints.huggingface.cloud"
@@ -383,6 +383,8 @@ def default_pt(messages):
#### Custom prompt templates

```python
import litellm

# Create your own custom prompt template
litellm.register_prompt_template(
    model="togethercomputer/LLaMA-2-7B-32K",
@@ -1,6 +1,13 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Jina AI
https://jina.ai/embeddings/

Supported endpoints:
- /embeddings
- /rerank

## API Key
```python
# env variable
@@ -8,6 +15,10 @@ os.environ['JINA_AI_API_KEY']
```

## Sample Usage - Embedding

<Tabs>
<TabItem value="sdk" label="SDK">

```python
from litellm import embedding
import os
@@ -19,6 +30,142 @@ response = embedding(
)
print(response)
```
</TabItem>
<TabItem value="proxy" label="PROXY">

1. Add to config.yaml
```yaml
model_list:
  - model_name: embedding-model
    litellm_params:
      model: jina_ai/jina-embeddings-v3
      api_key: os.environ/JINA_AI_API_KEY
```

2. Start proxy

```bash
litellm --config /path/to/config.yaml

# RUNNING on http://0.0.0.0:4000/
```

3. Test it!

```bash
curl -L -X POST 'http://0.0.0.0:4000/embeddings' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{"input": ["hello world"], "model": "embedding-model"}'
```

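Because the proxy's `/embeddings` route is OpenAI-compatible, the same request can be made with the OpenAI Python SDK (a sketch, assuming the proxy from step 2 is running on localhost:4000 with `sk-1234` as the key):

```python
import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.embeddings.create(
    model="embedding-model",  # the model_name from config.yaml
    input=["hello world"],
)
print(response.data[0].embedding[:5])
```
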
</TabItem>
</Tabs>

## Sample Usage - Rerank

<Tabs>
<TabItem value="sdk" label="SDK">

```python
from litellm import rerank
import os

os.environ["JINA_AI_API_KEY"] = "sk-..."

query = "What is the capital of the United States?"
documents = [
    "Carson City is the capital city of the American state of Nevada.",
    "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
    "Washington, D.C. is the capital of the United States.",
    "Capital punishment has existed in the United States since before it was a country.",
]

response = rerank(
    model="jina_ai/jina-reranker-v2-base-multilingual",
    query=query,
    documents=documents,
    top_n=3,
)
print(response)
```
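The response follows the Cohere-style rerank shape, so each entry should carry the index of the original document and a relevance score; a quick way to inspect the ranking:

```python
# print each ranked result (document index + relevance score)
for result in response.results:
    print(result)
```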
</TabItem>
<TabItem value="proxy" label="PROXY">

1. Add to config.yaml
```yaml
model_list:
  - model_name: rerank-model
    litellm_params:
      model: jina_ai/jina-reranker-v2-base-multilingual
      api_key: os.environ/JINA_AI_API_KEY
```

2. Start proxy

```bash
litellm --config /path/to/config.yaml
```

3. Test it!

```bash
curl -L -X POST 'http://0.0.0.0:4000/rerank' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "rerank-model",
    "query": "What is the capital of the United States?",
    "documents": [
      "Carson City is the capital city of the American state of Nevada.",
      "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
      "Washington, D.C. is the capital of the United States.",
      "Capital punishment has existed in the United States since before it was a country."
    ],
    "top_n": 3
  }'
```

</TabItem>
</Tabs>

## Supported Models
All models listed here https://jina.ai/embeddings/ are supported

## Supported Optional Rerank Parameters

All Cohere rerank parameters are supported.

## Supported Optional Embeddings Parameters

```
dimensions
```

## Provider-specific parameters

Pass any Jina AI-specific parameters as keyword arguments to the `embedding` or `rerank` function, e.g.

<Tabs>
<TabItem value="sdk" label="SDK">

```python
response = embedding(
    model="jina_ai/jina-embeddings-v3",
    input=["good morning from litellm"],
    dimensions=1536,
    my_custom_param="my_custom_value", # any other jina ai specific parameters
)
```
</TabItem>
<TabItem value="proxy" label="PROXY">

```bash
curl -L -X POST 'http://0.0.0.0:4000/embeddings' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{"input": ["good morning from litellm"], "model": "jina_ai/jina-embeddings-v3", "dimensions": 1536, "my_custom_param": "my_custom_value"}'
```

</TabItem>
</Tabs>

@@ -1562,6 +1562,10 @@ curl http://0.0.0.0:4000/v1/chat/completions \
## **Embedding Models**

#### Usage - Embedding

<Tabs>
<TabItem value="sdk" label="SDK">

```python
import litellm
from litellm import embedding
@@ -1574,6 +1578,49 @@ response = embedding(
)
print(response)
```
</TabItem>

<TabItem value="proxy" label="LiteLLM PROXY">

1. Add model to config.yaml
```yaml
model_list:
  - model_name: snowflake-arctic-embed-m-long-1731622468876
    litellm_params:
      model: vertex_ai/<your-model-id>
      vertex_project: "adroit-crow-413218"
      vertex_location: "us-central1"
      vertex_credentials: adroit-crow-413218-a956eef1a2a8.json

litellm_settings:
  drop_params: True
```

2. Start Proxy

```
$ litellm --config /path/to/config.yaml
```

3. Make Request using OpenAI Python SDK, Langchain Python SDK

```python
import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.embeddings.create(
    model="snowflake-arctic-embed-m-long-1731622468876",
    input=["good morning from litellm", "this is another item"],
)

print(response)
```
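Step 3 also mentions the Langchain Python SDK; a sketch of the same call through `langchain_openai`'s `OpenAIEmbeddings` (assumes the `langchain-openai` package is installed and the proxy config above):

```python
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(
    model="snowflake-arctic-embed-m-long-1731622468876",
    openai_api_key="sk-1234",
    openai_api_base="http://0.0.0.0:4000",
)

vectors = embeddings.embed_documents(["good morning from litellm", "this is another item"])
print(len(vectors), len(vectors[0]))
```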

</TabItem>
</Tabs>

#### Supported Embedding Models
All models listed [here](https://github.com/BerriAI/litellm/blob/57f37f743886a0249f630a6792d49dffc2c5d9b7/model_prices_and_context_window.json#L835) are supported
@@ -1589,6 +1636,7 @@ All models listed [here](https://github.com/BerriAI/litellm/blob/57f37f743886a02
| textembedding-gecko@003 | `embedding(model="vertex_ai/textembedding-gecko@003", input)` |
| text-embedding-preview-0409 | `embedding(model="vertex_ai/text-embedding-preview-0409", input)` |
| text-multilingual-embedding-preview-0409 | `embedding(model="vertex_ai/text-multilingual-embedding-preview-0409", input)` |
| Fine-tuned OR Custom Embedding models | `embedding(model="vertex_ai/<your-model-id>", input)` |

### Supported OpenAI (Unified) Params
@@ -791,9 +791,9 @@ general_settings:
| store_model_in_db | boolean | If true, allows `/model/new` endpoint to store model information in db. Endpoint disabled by default. [Doc on `/model/new` endpoint](./model_management.md#create-a-new-model) |
| max_request_size_mb | int | The maximum size for requests in MB. Requests above this size will be rejected. |
| max_response_size_mb | int | The maximum size for responses in MB. LLM responses above this size will not be sent. |
| proxy_budget_rescheduler_min_time | int | The minimum time (in seconds) to wait before checking db for budget resets. **Default is 597 seconds** |
| proxy_budget_rescheduler_max_time | int | The maximum time (in seconds) to wait before checking db for budget resets. **Default is 605 seconds** |
| proxy_batch_write_at | int | Time (in seconds) to wait before batch writing spend logs to the db. **Default is 10 seconds** |
| alerting_args | dict | Args for Slack Alerting [Doc on Slack Alerting](./alerting.md) |
| custom_key_generate | str | Custom function for key generation [Doc on custom key generation](./virtual_keys.md#custom--key-generate) |
| allowed_ips | List[str] | List of IPs allowed to access the proxy. If not set, all IPs are allowed. |
@@ -66,10 +66,16 @@ Removes any field with `user_api_key_*` from metadata.
Found under `kwargs["standard_logging_object"]`. This is a standard payload, logged for every response.

```python
class StandardLoggingPayload(TypedDict):
    id: str
    trace_id: str  # Trace multiple LLM calls belonging to same overall request (e.g. fallbacks/retries)
    call_type: str
    response_cost: float
    response_cost_failure_debug_info: Optional[
        StandardLoggingModelCostFailureDebugInformation
    ]
    status: StandardLoggingPayloadStatus
    total_tokens: int
    prompt_tokens: int
    completion_tokens: int
@@ -84,13 +90,13 @@ class StandardLoggingPayload(TypedDict):
    metadata: StandardLoggingMetadata
    cache_hit: Optional[bool]
    cache_key: Optional[str]
    saved_cache_cost: float
    request_tags: list
    end_user: Optional[str]
    requester_ip_address: Optional[str]
    messages: Optional[Union[str, list, dict]]
    response: Optional[Union[str, list, dict]]
    error_str: Optional[str]
    model_parameters: dict
    hidden_params: StandardLoggingHiddenParams

@@ -99,12 +105,47 @@ class StandardLoggingHiddenParams(TypedDict):
    cache_key: Optional[str]
    api_base: Optional[str]
    response_cost: Optional[str]
    additional_headers: Optional[StandardLoggingAdditionalHeaders]

class StandardLoggingAdditionalHeaders(TypedDict, total=False):
    x_ratelimit_limit_requests: int
    x_ratelimit_limit_tokens: int
    x_ratelimit_remaining_requests: int
    x_ratelimit_remaining_tokens: int

class StandardLoggingMetadata(StandardLoggingUserAPIKeyMetadata):
    """
    Specific metadata k,v pairs logged to integration for easier cost tracking
    """

    spend_logs_metadata: Optional[
        dict
    ]  # special param to log k,v pairs to spendlogs for a call
    requester_ip_address: Optional[str]
    requester_metadata: Optional[dict]

class StandardLoggingModelInformation(TypedDict):
    model_map_key: str
    model_map_value: Optional[ModelInfo]


StandardLoggingPayloadStatus = Literal["success", "failure"]

class StandardLoggingModelCostFailureDebugInformation(TypedDict, total=False):
    """
    Debug information, if cost tracking fails.

    Avoid logging sensitive information like response or optional params
    """

    error_str: Required[str]
    traceback_str: Required[str]
    model: str
    cache_hit: Optional[bool]
    custom_llm_provider: Optional[str]
    base_model: Optional[str]
    call_type: str
    custom_pricing: Optional[bool]
```

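A minimal sketch of reading the `standard_logging_object` described above from a custom callback (assumes the `CustomLogger` pattern from LiteLLM's custom callback docs; field names follow the type definitions above):

```python
import litellm
from litellm.integrations.custom_logger import CustomLogger


class SpendLogger(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        payload = kwargs.get("standard_logging_object") or {}
        # e.g. forward a few fields to your own telemetry sink
        print(payload.get("id"), payload.get("call_type"), payload.get("response_cost"))


litellm.callbacks = [SpendLogger()]
```
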
## Langfuse
@@ -1,5 +1,6 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import Image from '@theme/IdealImage';

# ⚡ Best Practices for Production
@@ -112,7 +113,35 @@ general_settings:
  disable_spend_logs: True
```

## 7. Use Helm PreSync Hook for Database Migrations [BETA]

To ensure only one service manages database migrations, use our [Helm PreSync hook for Database Migrations](https://github.com/BerriAI/litellm/blob/main/deploy/charts/litellm-helm/templates/migrations-job.yaml). This ensures migrations are handled during `helm upgrade` or `helm install`, while LiteLLM pods explicitly disable migrations.


1. **Helm PreSync Hook**:
   - The Helm PreSync hook is configured in the chart to run database migrations during deployments.
   - The hook always sets `DISABLE_SCHEMA_UPDATE=false`, ensuring migrations are executed reliably.

   Reference settings to set on ArgoCD for `values.yaml`:

   ```yaml
   db:
     useExisting: true # use existing Postgres DB
     url: postgresql://ishaanjaffer0324:3rnwpOBau6hT@ep-withered-mud-a5dkdpke.us-east-2.aws.neon.tech/test-argo-cd?sslmode=require # url of existing Postgres DB
   ```

2. **LiteLLM Pods**:
   - Set `DISABLE_SCHEMA_UPDATE=true` in LiteLLM pod configurations to prevent them from running migrations.

   Example configuration for LiteLLM pod:
   ```yaml
   env:
     - name: DISABLE_SCHEMA_UPDATE
       value: "true"
   ```


## 8. Set LiteLLM Salt Key

If you plan on using the DB, set a salt key for encrypting/decrypting variables in the DB.

@@ -748,4 +748,19 @@ curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
    "max_tokens": 300,
    "mock_testing_fallbacks": true
}'
```

### Disable Fallbacks per key

You can disable fallbacks per key by setting `disable_fallbacks: true` in your key metadata.

```bash
curl -L -X POST 'http://0.0.0.0:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "metadata": {
      "disable_fallbacks": true
    }
  }'
```
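Requests made with the returned virtual key will then skip fallbacks. A sketch of using such a key against the proxy (the key value and model name are placeholders for whatever `/key/generate` returned and whatever is in your config):

```python
import openai

# key returned by the /key/generate call above (placeholder value)
client = openai.OpenAI(api_key="sk-<generated-key>", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20240620",  # a model_name from your proxy config
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)
```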
@@ -113,4 +113,5 @@ curl http://0.0.0.0:4000/rerank \
|-------------|--------------------|
| Cohere | [Usage](#quick-start) |
| Together AI | [Usage](../docs/providers/togetherai) |
| Azure AI | [Usage](../docs/providers/azure_ai) |
| Jina AI | [Usage](../docs/providers/jina_ai) |
@@ -1,3 +1,6 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Secret Manager
LiteLLM supports reading secrets from Azure Key Vault, Google Secret Manager
@@ -59,14 +62,35 @@ os.environ["AWS_REGION_NAME"] = "" # us-east-1, us-east-2, us-west-1, us-west-2
```

2. Enable AWS Secret Manager in config.

<Tabs>
<TabItem value="read_only" label="Read Keys from AWS Secret Manager">

```yaml
general_settings:
  master_key: os.environ/litellm_master_key
  key_management_system: "aws_secret_manager" # 👈 KEY CHANGE
  key_management_settings:
    hosted_keys: ["litellm_master_key"] # 👈 Specify which env keys you stored on AWS
```

</TabItem>

<TabItem value="write_only" label="Write Virtual Keys to AWS Secret Manager">

This will only store virtual keys in AWS Secret Manager. No keys will be read from AWS Secret Manager.

```yaml
general_settings:
  key_management_system: "aws_secret_manager" # 👈 KEY CHANGE
  key_management_settings:
    store_virtual_keys: true
    access_mode: "write_only" # Literal["read_only", "write_only", "read_and_write"]
```
</TabItem>
</Tabs>

3. Run proxy

```bash
@@ -181,16 +205,14 @@ litellm --config /path/to/config.yaml

Use encrypted keys from Google KMS on the proxy

### Usage with LiteLLM Proxy Server

Step 1. Add keys to env

```
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/credentials.json"
export GOOGLE_KMS_RESOURCE_NAME="projects/*/locations/*/keyRings/*/cryptoKeys/*"
export PROXY_DATABASE_URL_ENCRYPTED=b'\n$\x00D\xac\xb4/\x8e\xc...'
```

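A sketch of producing such an encrypted value with the Cloud KMS Python client (assumes the `google-cloud-kms` package; how the resulting ciphertext is encoded and stored in the env var is an assumption and should match what your proxy expects):

```python
import os
from google.cloud import kms

client = kms.KeyManagementServiceClient()

# encrypt the plaintext secret with the same key the proxy will use to decrypt
response = client.encrypt(
    request={
        "name": os.environ["GOOGLE_KMS_RESOURCE_NAME"],
        "plaintext": b"postgresql://user:password@host:5432/dbname",
    }
)
print(response.ciphertext)  # store this value, e.g. as PROXY_DATABASE_URL_ENCRYPTED
```
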
Step 2: Update Config

```yaml
general_settings:
@@ -199,7 +221,7 @@ general_settings:
  master_key: sk-1234
```

Step 3: Start + test proxy

```
$ litellm --config /path/to/config.yaml
@@ -215,3 +237,17 @@ $ litellm --test
<!--
## .env Files
If no secret manager client is specified, Litellm automatically uses the `.env` file to manage sensitive data. -->


## All Secret Manager Settings

All settings related to secret management

```yaml
general_settings:
  key_management_system: "aws_secret_manager" # REQUIRED
  key_management_settings:
    store_virtual_keys: true # OPTIONAL. Defaults to False, when True will store virtual keys in secret manager
    access_mode: "write_only" # OPTIONAL. Literal["read_only", "write_only", "read_and_write"]. Defaults to "read_only"
    hosted_keys: ["litellm_master_key"] # OPTIONAL. Specify which env keys you stored on AWS
```