diff --git a/docs/my-website/docs/proxy/configs.md b/docs/my-website/docs/proxy/configs.md
index de95ce94a..71ce7de02 100644
--- a/docs/my-website/docs/proxy/configs.md
+++ b/docs/my-website/docs/proxy/configs.md
@@ -8,71 +8,46 @@ Set model list, `api_base`, `api_key`, `temperature` & proxy server settings (`m
 | `general_settings` | Server settings, example setting `master_key: sk-my_special_key` |
 | `environment_variables` | Environment Variables example, `REDIS_HOST`, `REDIS_PORT` |
 
-#### Example Config
+## Quick Start
+
+Set a model alias for your deployments.
+
+In the `config.yaml`, the `model_name` parameter is the user-facing name for your deployment.
+
+In the config below, requests with:
+- `model=vllm-models` will route to `openai/facebook/opt-125m`.
+- `model=gpt-3.5-turbo` will load balance between `azure/gpt-turbo-small-eu` and `azure/gpt-turbo-small-ca`.
+
 ```yaml
 model_list:
-  - model_name: gpt-3.5-turbo
-    litellm_params:
+  - model_name: gpt-3.5-turbo # user-facing model alias
+    litellm_params: # all params accepted by litellm.completion() - https://docs.litellm.ai/docs/completion/input
       model: azure/gpt-turbo-small-eu
       api_base: https://my-endpoint-europe-berri-992.openai.azure.com/
-      api_key: 
+      api_key: "os.environ/AZURE_API_KEY_EU" # does os.getenv("AZURE_API_KEY_EU")
       rpm: 6      # Rate limit for this deployment: in requests per minute (rpm)
   - model_name: gpt-3.5-turbo
     litellm_params:
       model: azure/gpt-turbo-small-ca
       api_base: https://my-endpoint-canada-berri992.openai.azure.com/
-      api_key: 
+      api_key: "os.environ/AZURE_API_KEY_CA"
       rpm: 6
-  - model_name: gpt-3.5-turbo
+  - model_name: vllm-models
     litellm_params:
-      model: azure/gpt-turbo-large
-      api_base: https://openai-france-1234.openai.azure.com/
-      api_key: 
+      model: openai/facebook/opt-125m # the `openai/` prefix tells litellm it's openai compatible
+      api_base: http://0.0.0.0:8000
       rpm: 1440
+    model_info:
+      version: 2
 
-litellm_settings:
+litellm_settings: # module level litellm settings - https://github.com/BerriAI/litellm/blob/main/litellm/__init__.py
   drop_params: True
   set_verbose: True
 
 general_settings: 
   master_key: sk-1234 # [OPTIONAL] Only use this if you want to require all calls to contain this key (Authorization: Bearer sk-1234)
-
-
-environment_variables:
-  OPENAI_API_KEY: sk-123
-  REPLICATE_API_KEY: sk-cohere-is-okay
-  REDIS_HOST: redis-16337.c322.us-east-1-2.ec2.cloud.redislabs.com
-  REDIS_PORT: "16337"
-  REDIS_PASSWORD: 
 ```
 
-### Config for Multiple Models - GPT-4, Claude-2
-
-Here's how you can use multiple llms with one proxy `config.yaml`.
-
-#### Step 1: Setup Config
-```yaml
-model_list:
-  - model_name: zephyr-alpha # the 1st model is the default on the proxy
-    litellm_params: # params for litellm.completion() - https://docs.litellm.ai/docs/completion/input#input---request-body
-      model: huggingface/HuggingFaceH4/zephyr-7b-alpha
-      api_base: http://0.0.0.0:8001
-  - model_name: gpt-4
-    litellm_params:
-      model: gpt-4
-      api_key: sk-1233
-  - model_name: claude-2
-    litellm_params:
-      model: claude-2
-      api_key: sk-claude
-```
-
-:::info
-
-The proxy uses the first model in the config as the default model - in this config the default model is `zephyr-alpha`
-:::
-
-
 #### Step 2: Start Proxy with config
 
 ```shell
 $ litellm --config /path/to/config.yaml
 ```
@@ -96,32 +71,11 @@ curl --location 'http://0.0.0.0:8000/chat/completions' \
 '
 ```
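+
+You can also call the proxy from Python. Below is a minimal sketch using the `requests` library (an assumption - any OpenAI-compatible HTTP client works), assuming the proxy is running locally with the Quick Start config above:
+
+```python
+import requests
+
+# assumes the proxy is running on the default address from the config above
+response = requests.post(
+    "http://0.0.0.0:8000/chat/completions",
+    json={
+        "model": "gpt-3.5-turbo",  # alias - load balanced across both Azure deployments
+        "messages": [{"role": "user", "content": "what llm are you"}],
+    },
+)
+print(response.json())
+```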
-### Config for Embedding Models - xorbitsai/inference
-
-Here's how you can use multiple llms with one proxy `config.yaml`. 
-Here is how [LiteLLM calls OpenAI Compatible Embedding models](https://docs.litellm.ai/docs/embedding/supported_embedding#openai-compatible-embedding-models)
-
-#### Config
-```yaml
-model_list:
-  - model_name: custom_embedding_model
-    litellm_params:
-      model: openai/custom_embedding # the `openai/` prefix tells litellm it's openai compatible
-      api_base: http://0.0.0.0:8000/
-  - model_name: custom_embedding_model
-    litellm_params:
-      model: openai/custom_embedding # the `openai/` prefix tells litellm it's openai compatible
-      api_base: http://0.0.0.0:8001/
-```
-
-Run the proxy using this config
-```shell
-$ litellm --config /path/to/config.yaml
-```
-
-### Save Model-specific params (API Base, API Keys, Temperature, Headers etc.)
+## Save Model-specific params (API Base, API Keys, Temperature, Headers etc.)
 You can use the config to save model-specific information like api_base, api_key, temperature, max_tokens, etc.
+
+[**All input params**](https://docs.litellm.ai/docs/completion/input#input-params-1)
+
 **Step 1**: Create a `config.yaml` file
 ```yaml
 model_list:
@@ -152,9 +106,11 @@ model_list:
 $ litellm --config /path/to/config.yaml
 ```
 
-### Load API Keys from Vault
+## Load API Keys
 
-If you have secrets saved in Azure Vault, etc. and don't want to expose them in the config.yaml, here's how to load model-specific keys from the environment.
+### Load API Keys from Environment
+
+If you have secrets saved in your environment and don't want to expose them in the config.yaml, here's how to load model-specific keys from the environment.
 
 ```python
 os.environ["AZURE_NORTH_AMERICA_API_KEY"] = "your-azure-api-key"
@@ -174,30 +130,42 @@ model_list:
 
 s/o to [@David Manouchehri](https://www.linkedin.com/in/davidmanouchehri/) for helping with this.
 
-### Config for setting Model Aliases
+### Load API Keys from Azure Vault
 
-Set a model alias for your deployments.
+1. Install Proxy dependencies
+```bash
+$ pip install litellm[proxy] litellm[extra_proxy]
+```
 
-In the `config.yaml` the model_name parameter is the user-facing name to use for your deployment.
-
-In the config below requests with `model=gpt-4` will route to `ollama/llama2`
+2. Save Azure details in your environment
+```bash
+export AZURE_CLIENT_ID="your-azure-app-client-id"
+export AZURE_CLIENT_SECRET="your-azure-app-client-secret"
+export AZURE_TENANT_ID="your-azure-tenant-id"
+export AZURE_KEY_VAULT_URI="your-azure-key-vault-uri"
+```
 
+3. Add to proxy config.yaml
 ```yaml
-model_list:
-  - model_name: text-davinci-003
-    litellm_params:
-      model: ollama/zephyr
-  - model_name: gpt-4
-    litellm_params:
-      model: ollama/llama2
-  - model_name: gpt-3.5-turbo
-    litellm_params:
-      model: ollama/llama2
+model_list:
+  - model_name: "my-azure-models" # model alias
+    litellm_params:
+      model: "azure/"
+      api_key: "os.environ/AZURE-API-KEY" # reads from key vault - get_secret("AZURE-API-KEY")
+      api_base: "os.environ/AZURE-API-BASE" # reads from key vault - get_secret("AZURE-API-BASE")
+
+general_settings:
+  use_azure_key_vault: True
+```
+
+You can now test this by starting your proxy:
+```bash
+litellm --config /path/to/config.yaml
 ```
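+
+When `use_azure_key_vault` is enabled, `os.environ/` values in the config are looked up in the key vault instead of the local environment. As a rough sketch (not the proxy's actual code), the lookup behind `os.environ/AZURE-API-KEY` is roughly equivalent to the following, assuming the `azure-identity` and `azure-keyvault-secrets` packages are installed:
+
+```python
+import os
+
+from azure.identity import DefaultAzureCredential
+from azure.keyvault.secrets import SecretClient
+
+# DefaultAzureCredential reads AZURE_CLIENT_ID, AZURE_CLIENT_SECRET and
+# AZURE_TENANT_ID from the environment variables exported in step 2
+credential = DefaultAzureCredential()
+client = SecretClient(vault_url=os.environ["AZURE_KEY_VAULT_URI"], credential=credential)
+
+api_key = client.get_secret("AZURE-API-KEY").value
+```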
 
 ### Set Custom Prompt Templates
 
-LiteLLM by default checks if a model has a [prompt template and applies it](./completion/prompt_formatting.md) (e.g. if a huggingface model has a saved chat template in it's tokenizer_config.json). However, you can also set a custom prompt template on your proxy in the `config.yaml`:
+LiteLLM by default checks if a model has a [prompt template and applies it](../completion/prompt_formatting.md) (e.g. if a huggingface model has a saved chat template in its tokenizer_config.json). However, you can also set a custom prompt template on your proxy in the `config.yaml`:
 
 **Step 1**: Save your prompt template in a `config.yaml`
 ```yaml
diff --git a/docs/my-website/docs/proxy/model_management.md b/docs/my-website/docs/proxy/model_management.md
new file mode 100644
index 000000000..0cd4ab829
--- /dev/null
+++ b/docs/my-website/docs/proxy/model_management.md
@@ -0,0 +1,74 @@
+# Model Management
+Add new models + Get model info without restarting proxy.
+
+## Get Model Information
+
+Retrieve detailed information about each model listed in the `/models` endpoint, including descriptions from the `config.yaml` file, and additional model info (e.g. max tokens, cost per input token, etc.) pulled from the `model_info` you set and the litellm model cost map. Sensitive details like API keys are excluded for security purposes.
+
+```bash
+curl -X GET "http://0.0.0.0:8000/model/info" \
+  -H "accept: application/json"
+```
+
+## Add a New Model
+
+Add a new model to the list in the `config.yaml` by providing the model parameters. This allows you to update the model list without restarting the proxy.
+
+```bash
+curl -X POST "http://0.0.0.0:8000/model/new" \
+  -H "accept: application/json" \
+  -H "Content-Type: application/json" \
+  -d '{ "model_name": "azure-gpt-turbo", "litellm_params": {"model": "azure/gpt-3.5-turbo", "api_key": "os.environ/AZURE_API_KEY", "api_base": "my-azure-api-base"} }'
+```
+
+### Model Parameters Structure
+
+When adding a new model, your JSON payload should conform to the following structure:
+
+- `model_name`: The name of the new model (required).
+- `litellm_params`: A dictionary containing parameters specific to the LiteLLM setup (required).
+- `model_info`: An optional dictionary to provide additional information about the model.
+
+Here's an example of how to structure your `ModelParams`:
+
+```json
+{
+  "model_name": "my_awesome_model",
+  "litellm_params": {
+    "some_parameter": "some_value",
+    "another_parameter": "another_value"
+  },
+  "model_info": {
+    "author": "Your Name",
+    "version": "1.0",
+    "description": "A brief description of the model."
+  }
+}
+```
+---
+
+Keep in mind that both endpoints are in [BETA]; you may need to visit the associated GitHub issues linked in the API descriptions to check for updates or provide feedback:
+
+- Get Model Information: [Issue #933](https://github.com/BerriAI/litellm/issues/933)
+- Add a New Model: [Issue #964](https://github.com/BerriAI/litellm/issues/964)
+
+Feedback on the beta endpoints is valuable and helps improve the API for all users.
\ No newline at end of file
diff --git a/docs/my-website/docs/proxy/virtual_keys.md b/docs/my-website/docs/proxy/virtual_keys.md
index 26c22e47a..ac1cf54de 100644
--- a/docs/my-website/docs/proxy/virtual_keys.md
+++ b/docs/my-website/docs/proxy/virtual_keys.md
@@ -1,5 +1,4 @@
-
-# Cost Tracking & Virtual Keys
+# Key Management
 Track Spend and create virtual keys for the proxy
 Grant others temporary access to your proxy, with keys that expire after a set duration.
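+
+For example, a key scoped to one model with a 20 minute expiry can be requested like this. This is a sketch, assuming the proxy is running with the `master_key: sk-1234` from the config docs:
+
+```python
+import requests
+
+# /key/generate is authenticated with the proxy master key (assumed to be sk-1234)
+response = requests.post(
+    "http://0.0.0.0:8000/key/generate",
+    headers={"Authorization": "Bearer sk-1234"},
+    json={"models": ["gpt-3.5-turbo"], "duration": "20m"},
+)
+print(response.json())  # the response contains the generated virtual key
+```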
diff --git a/docs/my-website/sidebars.js b/docs/my-website/sidebars.js
index dd9aa514f..11f81fa4d 100644
--- a/docs/my-website/sidebars.js
+++ b/docs/my-website/sidebars.js
@@ -99,6 +99,7 @@ const sidebars = {
         "proxy/configs",
         "proxy/load_balancing",
         "proxy/virtual_keys",
+        "proxy/model_management",
         "proxy/caching",
         "proxy/logging",
         "proxy/cli",