update supported models docs

This commit is contained in:
Krrish Dholakia 2023-09-04 16:38:19 -07:00
parent 97e3bdb181
commit c885f353b0
15 changed files with 327 additions and 233 deletions

View file

@ -1,221 +0,0 @@
# Supported Providers + Models
## API Keys
liteLLM reads API keys from environment variables. All keys should be named in the following format: `<PROVIDER>_API_KEY`. For example:
* `OPENAI_API_KEY` Provider = OpenAI
* `TOGETHERAI_API_KEY` Provider = TogetherAI
* `HUGGINGFACE_API_KEY` Provider = HuggingFace
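As a minimal sketch of this convention (placeholder key values shown; substitute your own), set the key before calling `completion`:
```python
import os
from litellm import completion

# keys follow the <PROVIDER>_API_KEY naming convention
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder value

messages = [{"role": "user", "content": "Hello, how are you?"}]
response = completion("gpt-3.5-turbo", messages)
```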
### OpenAI Chat Completion Models
| Model Name | Function Call | Required OS Variables |
|------------------|----------------------------------------|--------------------------------------|
| gpt-3.5-turbo | `completion('gpt-3.5-turbo', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-0301 | `completion('gpt-3.5-turbo-0301', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-0613 | `completion('gpt-3.5-turbo-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-16k | `completion('gpt-3.5-turbo-16k', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-16k-0613 | `completion('gpt-3.5-turbo-16k-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4 | `completion('gpt-4', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-0314 | `completion('gpt-4-0314', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-0613 | `completion('gpt-4-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-32k | `completion('gpt-4-32k', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-32k-0314 | `completion('gpt-4-32k-0314', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-32k-0613 | `completion('gpt-4-32k-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
These also support the `OPENAI_API_BASE` environment variable, which can be used to specify a custom API endpoint.
### Azure OpenAI Chat Completion Models
| Model Name | Function Call | Required OS Variables |
|------------------|-----------------------------------------|-------------------------------------------|
| gpt-3.5-turbo | `completion('gpt-3.5-turbo', messages, custom_llm_provider="azure")` | `os.environ['AZURE_API_KEY']`,`os.environ['AZURE_API_BASE']`,`os.environ['AZURE_API_VERSION']` |
| gpt-4 | `completion('gpt-4', messages, custom_llm_provider="azure")` | `os.environ['AZURE_API_KEY']`,`os.environ['AZURE_API_BASE']`,`os.environ['AZURE_API_VERSION']` |
### OpenAI Text Completion Models
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| text-davinci-003 | `completion('text-davinci-003', messages)` | `os.environ['OPENAI_API_KEY']` |
| ada-001 | `completion('ada-001', messages)` | `os.environ['OPENAI_API_KEY']` |
| curie-001 | `completion('curie-001', messages)` | `os.environ['OPENAI_API_KEY']` |
| babbage-001 | `completion('babbage-001', messages)` | `os.environ['OPENAI_API_KEY']` |
| babbage-002 | `completion('babbage-002', messages)` | `os.environ['OPENAI_API_KEY']` |
| davinci-002 | `completion('davinci-002', messages)` | `os.environ['OPENAI_API_KEY']` |
### Google VertexAI Models
Sample notebook for calling VertexAI models: https://github.com/BerriAI/litellm/blob/main/cookbook/liteLLM_VertextAI_Example.ipynb
All calls using Vertex AI require the following parameters:
* Your Project ID: `litellm.vertex_project = "hardy-device-38811"`
* Your Project Location: `litellm.vertex_location = "us-central1"`
Authentication:
VertexAI uses Application Default Credentials; see https://cloud.google.com/docs/authentication/external/set-up-adc for more information on setting this up.
VertexAI requires `application_default_credentials.json`; you can generate it by running `gcloud auth application-default login` in your terminal.
| Model Name | Function Call |
|------------------|----------------------------------------------------------|
| chat-bison | `completion('chat-bison', messages)` |
| chat-bison@001 | `completion('chat-bison@001', messages)` |
| text-bison | `completion('text-bison', messages)` |
| text-bison@001 | `completion('text-bison@001', messages)` |
### Anthropic Models
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| claude-instant-1 | `completion('claude-instant-1', messages)` | `os.environ['ANTHROPIC_API_KEY']` |
| claude-instant-1.2 | `completion('claude-instant-1.2', messages)` | `os.environ['ANTHROPIC_API_KEY']` |
| claude-2 | `completion('claude-2', messages)` | `os.environ['ANTHROPIC_API_KEY']` |
### Hugging Face Inference API
All [`text2text-generation`](https://huggingface.co/models?library=transformers&pipeline_tag=text2text-generation&sort=downloads) and [`text-generation`](https://huggingface.co/models?library=transformers&pipeline_tag=text-generation&sort=downloads) models are supported by liteLLM. You can use any text model from Hugging Face with the following steps:
* Copy the model repo name from Hugging Face (e.g. `google/flan-t5-xxl`) and set it as the `model` parameter in the completion call.
* Set the `custom_llm_provider` parameter to `"huggingface"`, as shown in the examples below.
* Make sure to set your Hugging Face API key (`os.environ['HUGGINGFACE_API_KEY']`).
Here are some examples of supported models:
**Note that the models mentioned in the table are examples, and you can use any text model available on Hugging Face by following the steps above.**
| Model Name | Function Call | Required OS Variables |
|------------------|-------------------------------------------------------------------------------------|--------------------------------------|
| [stabilityai/stablecode-completion-alpha-3b-4k](https://huggingface.co/stabilityai/stablecode-completion-alpha-3b-4k) | `completion(model="stabilityai/stablecode-completion-alpha-3b-4k", messages=messages, custom_llm_provider="huggingface")` | `os.environ['HUGGINGFACE_API_KEY']` |
| [bigcode/starcoder](https://huggingface.co/bigcode/starcoder) | `completion(model="bigcode/starcoder", messages=messages, custom_llm_provider="huggingface")` | `os.environ['HUGGINGFACE_API_KEY']` |
| [google/flan-t5-xxl](https://huggingface.co/google/flan-t5-xxl) | `completion(model="google/flan-t5-xxl", messages=messages, custom_llm_provider="huggingface")` | `os.environ['HUGGINGFACE_API_KEY']` |
| [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) | `completion(model="google/flan-t5-large", messages=messages, custom_llm_provider="huggingface")` | `os.environ['HUGGINGFACE_API_KEY']` |
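A minimal sketch of a Hugging Face call, following the steps above (placeholder token shown; any text-generation repo name works in place of `google/flan-t5-xxl`):
```python
import os
from litellm import completion

os.environ["HUGGINGFACE_API_KEY"] = "hf_..."  # placeholder token

messages = [{"role": "user", "content": "Hello, how are you?"}]
# pass the Hugging Face model repo name as `model`
response = completion(
    model="google/flan-t5-xxl",
    messages=messages,
    custom_llm_provider="huggingface",
)
```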
### Replicate Models
liteLLM supports all Replicate LLMs. For Replicate models, make sure to add a `replicate/` prefix to the `model` arg; liteLLM uses this prefix to route the call to Replicate.
Below are examples of how to call Replicate LLMs using liteLLM:
| Model Name | Function Call | Required OS Variables |
|-----------------------------|----------------------------------------------------------------|--------------------------------------|
| replicate/llama-2-70b-chat | `completion('replicate/replicate/llama-2-70b-chat', messages)` | `os.environ['REPLICATE_API_KEY']` |
| a16z-infra/llama-2-13b-chat | `completion('replicate/a16z-infra/llama-2-13b-chat', messages)` | `os.environ['REPLICATE_API_KEY']` |
| joehoover/instructblip-vicuna13b | `completion('replicate/joehoover/instructblip-vicuna13b', messages)` | `os.environ['REPLICATE_API_KEY']` |
| replicate/dolly-v2-12b | `completion('replicate/replicate/dolly-v2-12b', messages)` | `os.environ['REPLICATE_API_KEY']` |
| a16z-infra/llama-2-7b-chat | `completion('replicate/a16z-infra/llama-2-7b-chat', messages)` | `os.environ['REPLICATE_API_KEY']` |
| replicate/vicuna-13b | `completion('replicate/replicate/vicuna-13b', messages)` | `os.environ['REPLICATE_API_KEY']` |
| daanelson/flan-t5-large | `completion('replicate/daanelson/flan-t5-large', messages)` | `os.environ['REPLICATE_API_KEY']` |
| replit/replit-code-v1-3b | `completion('replicate/replit/replit-code-v1-3b', messages)` | `os.environ['REPLICATE_API_KEY']` |
### AI21 Models
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| j2-light | `completion('j2-light', messages)` | `os.environ['AI21_API_KEY']` |
| j2-mid | `completion('j2-mid', messages)` | `os.environ['AI21_API_KEY']` |
| j2-ultra | `completion('j2-ultra', messages)` | `os.environ['AI21_API_KEY']` |
### Cohere Models
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| command-nightly | `completion('command-nightly', messages)` | `os.environ['COHERE_API_KEY']` |
### Together AI Models
liteLLM supports `non-streaming` and `streaming` requests to all models hosted on https://api.together.xyz/
Example TogetherAI usage below. Note: liteLLM supports all models deployed on TogetherAI.
| Model Name | Function Call | Required OS Variables |
|-----------------------------------|------------------------------------------------------------------------|---------------------------------|
| togethercomputer/llama-2-70b-chat | `completion('togethercomputer/llama-2-70b-chat', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/LLaMA-2-13b-chat | `completion('togethercomputer/LLaMA-2-13b-chat', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/code-and-talk-v1 | `completion('togethercomputer/code-and-talk-v1', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/creative-v1 | `completion('togethercomputer/creative-v1', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/yourmodel | `completion('togethercomputer/yourmodel', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
### AWS Sagemaker Models
Use liteLLM to easily call custom LLMs deployed on [AWS Sagemaker](https://aws.amazon.com/sagemaker/).
#### Requirements using Sagemaker with LiteLLM
* `pip install boto3`
* Set the following AWS credentials as .env variables (Sagemaker auth: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html)
* AWS_ACCESS_KEY_ID
* AWS_SECRET_ACCESS_KEY
* AWS_REGION_NAME
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|------------------------------------|
| Llama2 7B | `completion(model='sagemaker/jumpstart-dft-meta-textgeneration-llama-2-7b', messages=messages)` | `os.environ['AWS_ACCESS_KEY_ID']`, `os.environ['AWS_SECRET_ACCESS_KEY']`, `os.environ['AWS_REGION_NAME']` |
| Custom LLM Endpoint | `completion(model='sagemaker/your-endpoint', messages=messages)` | `os.environ['AWS_ACCESS_KEY_ID']`, `os.environ['AWS_SECRET_ACCESS_KEY']`, `os.environ['AWS_REGION_NAME']` |
### Aleph Alpha Models
https://www.aleph-alpha.com/
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|------------------------------------|
| luminous-base | `completion(model='luminous-base', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-extended | `completion(model='luminous-extended', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-supreme | `completion(model='luminous-supreme', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-base-control | `completion(model='luminous-base-control', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-extended-control | `completion(model='luminous-extended-control', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-supreme-control | `completion(model='luminous-supreme-control', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
### Baseten Models
[Baseten](https://www.baseten.co/) provides infrastructure to deploy and serve ML models. Use liteLLM to easily call models deployed on Baseten.
Note: liteLLM supports all models deployed on Baseten.
Usage: pass `model=baseten/<Model ID>`
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|------------------------------------|
| Falcon 7B | `completion(model='baseten/qvv0xeq', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
| Wizard LM | `completion(model='baseten/q841o8w', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
| MPT 7B Base | `completion(model='baseten/31dxrj3', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
### OpenRouter Completion Models
All the text models from [OpenRouter](https://openrouter.ai/docs) are supported by liteLLM.
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| openai/gpt-3.5-turbo | `completion('openai/gpt-3.5-turbo', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| openai/gpt-3.5-turbo-16k | `completion('openai/gpt-3.5-turbo-16k', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| openai/gpt-4 | `completion('openai/gpt-4', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| openai/gpt-4-32k | `completion('openai/gpt-4-32k', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| anthropic/claude-2 | `completion('anthropic/claude-2', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| anthropic/claude-instant-v1 | `completion('anthropic/claude-instant-v1', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| google/palm-2-chat-bison | `completion('google/palm-2-chat-bison', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| google/palm-2-codechat-bison | `completion('google/palm-2-codechat-bison', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| meta-llama/llama-2-13b-chat | `completion('meta-llama/llama-2-13b-chat', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| meta-llama/llama-2-70b-chat | `completion('meta-llama/llama-2-70b-chat', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
### Petals Models
Supported models on https://chat.petals.dev/
| Model Name | Function Call | Required OS Variables |
|----------------------|------------------------------------------------------------------------|--------------------------------|
| stabilityai/StableBeluga2 | `completion(model='stabilityai/StableBeluga2', messages, custom_llm_provider="petals")` | No API Key required |
| enoch/llama-65b-hf | `completion(model='enoch/llama-65b-hf', messages, custom_llm_provider="petals")` | No API Key required |
| bigscience/bloomz | `completion(model='bigscience/bloomz', messages, custom_llm_provider="petals")` | No API Key required |
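A minimal sketch of a Petals call, assuming the `petals` client package is installed locally; Petals is a public swarm, so no API key is needed:
```python
from litellm import completion

messages = [{"role": "user", "content": "Hello, how are you?"}]
# no API key required -- the request runs on the public Petals swarm
response = completion(
    model="stabilityai/StableBeluga2",
    messages=messages,
    custom_llm_provider="petals",
)
```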
### Ollama Models
Ollama supported models: https://github.com/jmorganca/ollama
| Model Name | Function Call | Required OS Variables |
|----------------------|-----------------------------------------------------------------------------------|--------------------------------|
| Llama2 7B | `completion(model='llama2', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Llama2 13B | `completion(model='llama2:13b', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Llama2 70B | `completion(model='llama2:70b', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Llama2 Uncensored | `completion(model='llama2-uncensored', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Orca Mini | `completion(model='orca-mini', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Vicuna | `completion(model='vicuna', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Nous-Hermes | `completion(model='nous-hermes', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Nous-Hermes 13B | `completion(model='nous-hermes:13b', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Wizard Vicuna Uncensored | `completion(model='wizard-vicuna', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |

View file

@ -0,0 +1,19 @@
# AI21
LiteLLM supports j2-light, j2-mid and j2-ultra from [AI21](https://www.ai21.com/studio/pricing).
They're available to use without a waitlist.
### API KEYS
```python
import os
os.environ["AI21_API_KEY"] = ""
```
### AI21 Models
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| j2-light | `completion('j2-light', messages)` | `os.environ['AI21_API_KEY']` |
| j2-mid | `completion('j2-mid', messages)` | `os.environ['AI21_API_KEY']` |
| j2-ultra | `completion('j2-ultra', messages)` | `os.environ['AI21_API_KEY']` |
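A minimal usage sketch (placeholder key shown):
```python
import os
from litellm import completion

os.environ["AI21_API_KEY"] = "your-ai21-key"  # placeholder value

messages = [{"role": "user", "content": "Hello, how are you?"}]
response = completion("j2-light", messages)
print(response)
```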

View file

@ -0,0 +1,23 @@
# Aleph Alpha
LiteLLM supports the 'luminous' + 'luminous-control' series of models from [Aleph Alpha](https://www.aleph-alpha.com/).
Like AI21 and Cohere, you can use these models without a waitlist.
### API KEYS
```python
import os
os.environ["ALEPHALPHA_API_KEY"] = ""
```
### Aleph Alpha Models
https://www.aleph-alpha.com/
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|------------------------------------|
| luminous-base | `completion(model='luminous-base', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-base-control | `completion(model='luminous-base-control', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-extended | `completion(model='luminous-extended', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-extended-control | `completion(model='luminous-extended-control', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-supreme | `completion(model='luminous-supreme', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-supreme-control | `completion(model='luminous-supreme-control', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
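A minimal usage sketch (placeholder key shown):
```python
import os
from litellm import completion

os.environ["ALEPHALPHA_API_KEY"] = "your-aleph-alpha-key"  # placeholder value

messages = [{"role": "user", "content": "Hello, how are you?"}]
response = completion(model="luminous-base", messages=messages)
```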

View file

@ -0,0 +1,18 @@
# Anthropic
LiteLLM supports claude-instant-1, claude-instant-1.2, and claude-2.
### API KEYS
```python
import os
os.environ["ANTHROPIC_API_KEY"] = ""
```
### Model Details
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| claude-instant-1 | `completion('claude-instant-1', messages)` | `os.environ['ANTHROPIC_API_KEY']` |
| claude-instant-1.2 | `completion('claude-instant-1.2', messages)` | `os.environ['ANTHROPIC_API_KEY']` |
| claude-2 | `completion('claude-2', messages)` | `os.environ['ANTHROPIC_API_KEY']` |
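A minimal usage sketch (placeholder key shown):
```python
import os
from litellm import completion

os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"  # placeholder value

messages = [{"role": "user", "content": "Hello, how are you?"}]
response = completion("claude-instant-1", messages)
```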

View file

@ -0,0 +1,19 @@
# AWS Sagemaker
LiteLLM supports Llama2 on Sagemaker
### API KEYS
```python
# install boto3 first: pip install boto3
import os

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
```
### AWS Sagemaker Models
Here's an example of calling a Sagemaker model with LiteLLM:
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|------------------------------------|
| Llama2 7B | `completion(model='sagemaker/jumpstart-dft-meta-textgeneration-llama-2-7b', messages=messages)` | `os.environ['AWS_ACCESS_KEY_ID']`, `os.environ['AWS_SECRET_ACCESS_KEY']`, `os.environ['AWS_REGION_NAME']` |
| Custom LLM Endpoint | `completion(model='sagemaker/your-endpoint', messages=messages)` | `os.environ['AWS_ACCESS_KEY_ID']`, `os.environ['AWS_SECRET_ACCESS_KEY']`, `os.environ['AWS_REGION_NAME']` |
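A minimal usage sketch, assuming the Jumpstart endpoint name from the table above (substitute your own endpoint and credentials):
```python
import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""      # placeholder
os.environ["AWS_SECRET_ACCESS_KEY"] = ""  # placeholder
os.environ["AWS_REGION_NAME"] = ""        # placeholder

messages = [{"role": "user", "content": "Hello, how are you?"}]
# the 'sagemaker/' prefix routes the call to your Sagemaker endpoint
response = completion(
    model="sagemaker/jumpstart-dft-meta-textgeneration-llama-2-7b",
    messages=messages,
)
```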

View file

@ -0,0 +1,32 @@
# Azure
LiteLLM supports Azure Chat + Embedding calls.
### API KEYS
```python
import os
os.environ["AZURE_API_KEY"] = ""
os.environ["AZURE_API_BASE"] = ""
os.environ["AZURE_API_VERSION"] = ""
```
### Azure OpenAI Chat Completion Models
```python
import os
from litellm import completion
## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["AZURE_API_KEY"] = "azure key"
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
# azure call
response = completion("azure/<your_deployment_id>", messages)
```

View file

@ -0,0 +1,23 @@
# Baseten
LiteLLM supports any Text Generation Inference (TGI) model deployed on Baseten.
[Here's a tutorial on deploying a huggingface TGI model (Llama2, CodeLlama, WizardCoder, Falcon, etc.) on Baseten](https://truss.baseten.co/examples/performance/tgi-server)
### API KEYS
```python
import os
os.environ["BASETEN_API_KEY"] = ""
```
### Baseten Models
[Baseten](https://www.baseten.co/) provides infrastructure to deploy and serve ML models. Use liteLLM to easily call models deployed on Baseten.
Note: liteLLM supports all models deployed on Baseten.
Usage: pass `model=baseten/<Model ID>`
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|------------------------------------|
| Falcon 7B | `completion(model='baseten/qvv0xeq', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
| Wizard LM | `completion(model='baseten/q841o8w', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
| MPT 7B Base | `completion(model='baseten/31dxrj3', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
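A minimal usage sketch (placeholder key shown; replace `qvv0xeq` with the model ID of your own Baseten deployment):
```python
import os
from litellm import completion

os.environ["BASETEN_API_KEY"] = "your-baseten-key"  # placeholder value

messages = [{"role": "user", "content": "Hello, how are you?"}]
response = completion(model="baseten/qvv0xeq", messages=messages)
```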

View file

@ -0,0 +1,32 @@
# Cohere
LiteLLM supports 'command', 'command-light', 'command-medium', 'command-medium-beta', 'command-xlarge-beta', 'command-nightly' models from [Cohere](https://cohere.com/).
Like AI21, these models are available without a waitlist.
### API KEYS
```python
import os
os.environ["COHERE_API_KEY"] = ""
```
### Example Usage
```python
import os
from litellm import completion
## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
# cohere call
response = completion("command-nightly", messages)
```

View file

@ -0,0 +1,17 @@
# Ollama
LiteLLM supports all models from [Ollama](https://github.com/jmorganca/ollama).
### Ollama Models
| Model Name | Function Call | Required OS Variables |
|----------------------|-----------------------------------------------------------------------------------|--------------------------------|
| Llama2 7B | `completion(model='llama2', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Llama2 13B | `completion(model='llama2:13b', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Llama2 70B | `completion(model='llama2:70b', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Llama2 Uncensored | `completion(model='llama2-uncensored', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Orca Mini | `completion(model='orca-mini', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Vicuna | `completion(model='vicuna', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Nous-Hermes | `completion(model='nous-hermes', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Nous-Hermes 13B | `completion(model='nous-hermes:13b', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Wizard Vicuna Uncensored | `completion(model='wizard-vicuna', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
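A minimal streaming sketch, assuming an Ollama server is running locally (e.g. after `ollama run llama2`):
```python
from litellm import completion

messages = [{"role": "user", "content": "Hello, how are you?"}]
# no API key required; requests go to the local Ollama server
response = completion(
    model="llama2",
    messages=messages,
    api_base="http://localhost:11434",
    custom_llm_provider="ollama",
    stream=True,
)
for chunk in response:  # chunks arrive as they are generated
    print(chunk)
```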

View file

@ -0,0 +1,38 @@
# OpenAI
LiteLLM supports OpenAI Chat + Text completion and embedding calls.
### API KEYS
```python
import os
os.environ["OPENAI_API_KEY"] = ""
```
### OpenAI Chat Completion Models
| Model Name | Function Call | Required OS Variables |
|------------------|----------------------------------------|--------------------------------------|
| gpt-3.5-turbo | `completion('gpt-3.5-turbo', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-0301 | `completion('gpt-3.5-turbo-0301', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-0613 | `completion('gpt-3.5-turbo-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-16k | `completion('gpt-3.5-turbo-16k', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-16k-0613 | `completion('gpt-3.5-turbo-16k-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4 | `completion('gpt-4', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-0314 | `completion('gpt-4-0314', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-0613 | `completion('gpt-4-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-32k | `completion('gpt-4-32k', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-32k-0314 | `completion('gpt-4-32k-0314', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-32k-0613 | `completion('gpt-4-32k-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
These also support the `OPENAI_API_BASE` environment variable, which can be used to specify a custom API endpoint.
### OpenAI Text Completion Models
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| text-davinci-003 | `completion('text-davinci-003', messages)` | `os.environ['OPENAI_API_KEY']` |
| ada-001 | `completion('ada-001', messages)` | `os.environ['OPENAI_API_KEY']` |
| curie-001 | `completion('curie-001', messages)` | `os.environ['OPENAI_API_KEY']` |
| babbage-001 | `completion('babbage-001', messages)` | `os.environ['OPENAI_API_KEY']` |
| babbage-002 | `completion('babbage-002', messages)` | `os.environ['OPENAI_API_KEY']` |
| davinci-002 | `completion('davinci-002', messages)` | `os.environ['OPENAI_API_KEY']` |
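A minimal usage sketch (placeholder key shown; the commented-out `OPENAI_API_BASE` value is a hypothetical custom endpoint):
```python
import os
from litellm import completion

os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder value
# optional: route requests through a custom endpoint
# os.environ["OPENAI_API_BASE"] = "https://my-proxy.example.com/v1"  # hypothetical

messages = [{"role": "user", "content": "Hello, how are you?"}]
response = completion("gpt-3.5-turbo", messages)
print(response)
```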

View file

@ -0,0 +1,24 @@
# OpenRouter
LiteLLM supports all the text models from [OpenRouter](https://openrouter.ai/docs)
### API KEYS
```python
import os
os.environ["OPENROUTER_API_KEYS"] = ""
```
### OpenRouter Completion Models
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| openai/gpt-3.5-turbo | `completion('openai/gpt-3.5-turbo', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| openai/gpt-3.5-turbo-16k | `completion('openai/gpt-3.5-turbo-16k', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| openai/gpt-4 | `completion('openai/gpt-4', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| openai/gpt-4-32k | `completion('openai/gpt-4-32k', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| anthropic/claude-2 | `completion('anthropic/claude-2', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| anthropic/claude-instant-v1 | `completion('anthropic/claude-instant-v1', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| google/palm-2-chat-bison | `completion('google/palm-2-chat-bison', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| google/palm-2-codechat-bison | `completion('google/palm-2-codechat-bison', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| meta-llama/llama-2-13b-chat | `completion('meta-llama/llama-2-13b-chat', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| meta-llama/llama-2-70b-chat | `completion('meta-llama/llama-2-70b-chat', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
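A minimal usage sketch (placeholder values shown for the key, site URL, and app name):
```python
import os
from litellm import completion

os.environ["OPENROUTER_API_KEY"] = "your-openrouter-key"  # placeholder
os.environ["OR_SITE_URL"] = "https://example.com"         # placeholder
os.environ["OR_APP_NAME"] = "my-app"                      # placeholder

messages = [{"role": "user", "content": "Hello, how are you?"}]
response = completion("openai/gpt-3.5-turbo", messages)
```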

View file

@ -0,0 +1,25 @@
# Replicate
LiteLLM supports all models on Replicate
### API KEYS
```python
import os
os.environ["REPLICATE_API_KEY"] = ""
```
### Replicate Models
liteLLM supports all Replicate LLMs. For Replicate models, make sure to add a `replicate/` prefix to the `model` arg; liteLLM uses this prefix to route the call to Replicate.
Below are examples of how to call Replicate LLMs using liteLLM:
| Model Name | Function Call | Required OS Variables |
|-----------------------------|----------------------------------------------------------------|--------------------------------------|
| replicate/llama-2-70b-chat | `completion('replicate/replicate/llama-2-70b-chat', messages)` | `os.environ['REPLICATE_API_KEY']` |
| a16z-infra/llama-2-13b-chat | `completion('replicate/a16z-infra/llama-2-13b-chat', messages)` | `os.environ['REPLICATE_API_KEY']` |
| joehoover/instructblip-vicuna13b | `completion('replicate/joehoover/instructblip-vicuna13b', messages)` | `os.environ['REPLICATE_API_KEY']` |
| replicate/dolly-v2-12b | `completion('replicate/replicate/dolly-v2-12b', messages)` | `os.environ['REPLICATE_API_KEY']` |
| a16z-infra/llama-2-7b-chat | `completion('replicate/a16z-infra/llama-2-7b-chat', messages)` | `os.environ['REPLICATE_API_KEY']` |
| replicate/vicuna-13b | `completion('replicate/replicate/vicuna-13b', messages)` | `os.environ['REPLICATE_API_KEY']` |
| daanelson/flan-t5-large | `completion('replicate/daanelson/flan-t5-large', messages)` | `os.environ['REPLICATE_API_KEY']` |
| replit/replit-code-v1-3b | `completion('replicate/replit/replit-code-v1-3b', messages)` | `os.environ['REPLICATE_API_KEY']` |
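A minimal usage sketch (placeholder key shown); note the `replicate/` prefix on the model name:
```python
import os
from litellm import completion

os.environ["REPLICATE_API_KEY"] = "r8_..."  # placeholder value

messages = [{"role": "user", "content": "Hello, how are you?"}]
# the 'replicate/' prefix tells liteLLM to route the call to Replicate
response = completion("replicate/a16z-infra/llama-2-13b-chat", messages)
```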

View file

@ -0,0 +1,22 @@
# Together AI
LiteLLM supports all models on Together AI.
### API KEYS
```python
import os
os.environ["TOGETHERAI_API_KEY"] = ""
```
### Together AI Models
liteLLM supports `non-streaming` and `streaming` requests to all models hosted on https://api.together.xyz/
Example TogetherAI usage below (see the sketch after the table). Note: liteLLM supports all models deployed on TogetherAI.
| Model Name | Function Call | Required OS Variables |
|-----------------------------------|------------------------------------------------------------------------|---------------------------------|
| togethercomputer/llama-2-70b-chat | `completion('togethercomputer/llama-2-70b-chat', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/LLaMA-2-13b-chat | `completion('togethercomputer/LLaMA-2-13b-chat', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/code-and-talk-v1 | `completion('togethercomputer/code-and-talk-v1', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/creative-v1 | `completion('togethercomputer/creative-v1', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/yourmodel | `completion('togethercomputer/yourmodel', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
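A minimal streaming sketch (placeholder key shown), since both streaming and non-streaming requests are supported:
```python
import os
from litellm import completion

os.environ["TOGETHERAI_API_KEY"] = "your-together-key"  # placeholder value

messages = [{"role": "user", "content": "Hello, how are you?"}]
# streaming request -- chunks arrive as they are generated
response = completion("togethercomputer/llama-2-70b-chat", messages, stream=True)
for chunk in response:
    print(chunk)
```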

View file

@ -0,0 +1,24 @@
# Google Palm/VertexAI
LiteLLM supports `chat-bison`, `chat-bison@001`, `text-bison`, and `text-bison@001`.
### Google VertexAI Models
Sample notebook for calling VertexAI models: https://github.com/BerriAI/litellm/blob/main/cookbook/liteLLM_VertextAI_Example.ipynb
All calls using Vertex AI require the following parameters:
* Your Project ID: `litellm.vertex_project = "hardy-device-38811"`
* Your Project Location: `litellm.vertex_location = "us-central1"`
Authentication:
VertexAI uses Application Default Credentials; see https://cloud.google.com/docs/authentication/external/set-up-adc for more information on setting this up.
VertexAI requires `application_default_credentials.json`; you can generate it by running `gcloud auth application-default login` in your terminal.
| Model Name | Function Call |
|------------------|----------------------------------------------------------|
| chat-bison | `completion('chat-bison', messages)` |
| chat-bison@001 | `completion('chat-bison@001', messages)` |
| text-bison | `completion('text-bison', messages)` |
| text-bison@001 | `completion('text-bison@001', messages)` |
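A minimal usage sketch, assuming Application Default Credentials are already configured; the project ID and location reuse the example values above (substitute your own):
```python
import litellm
from litellm import completion

# example values from this doc -- replace with your own project + location
litellm.vertex_project = "hardy-device-38811"
litellm.vertex_location = "us-central1"

messages = [{"role": "user", "content": "Hello, how are you?"}]
response = completion("chat-bison", messages)
```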

View file

@ -29,18 +29,17 @@ const sidebars = {
label: "Embedding()", label: "Embedding()",
items: ["embedding/supported_embedding"], items: ["embedding/supported_embedding"],
}, },
'completion/supported', {
// { type: "category",
// type: "category", label: "Supported Models + Providers",
// label: "Providers", link: {
// link: { type: 'generated-index',
// type: 'generated-index', title: 'Providers',
// title: 'Providers', description: 'Learn how to deploy + call models from different providers on LiteLLM',
// description: 'Learn how to deploy + call models from different providers on LiteLLM', slug: '/providers',
// slug: '/providers', },
// }, items: ["providers/huggingface", "providers/openai", "providers/azure", "providers/vertex", "providers/anthropic", "providers/ai21", "providers/replicate", "providers/cohere", "providers/togetherai", "providers/aws_sagemaker", "providers/aleph_alpha", "providers/baseten", "providers/openrouter", "providers/ollama"]
// items: ["providers/huggingface"], },
// },
"token_usage", "token_usage",
"exception_mapping", "exception_mapping",
'debugging/local_debugging', 'debugging/local_debugging',