mirror of https://github.com/BerriAI/litellm.git
synced 2025-04-26 03:04:13 +00:00

update supported models docs

This commit is contained in:
parent 97e3bdb181
commit c885f353b0

15 changed files with 327 additions and 233 deletions

@@ -1,221 +0,0 @@
# Supported Providers + Models

## API Keys

liteLLM reads API keys from environment variables. All keys must be named in the following format:

`<PROVIDER>_API_KEY`, for example:

* `OPENAI_API_KEY` Provider = OpenAI
* `TOGETHERAI_API_KEY` Provider = TogetherAI
* `HUGGINGFACE_API_KEY` Provider = HuggingFace
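
Under this convention, configuring a provider is a single environment assignment; a minimal sketch (the key values below are placeholders, not real credentials):

```python
import os

# liteLLM resolves credentials from <PROVIDER>_API_KEY environment
# variables; these values are placeholders for illustration only.
os.environ["OPENAI_API_KEY"] = "sk-placeholder"
os.environ["TOGETHERAI_API_KEY"] = "together-placeholder"
os.environ["HUGGINGFACE_API_KEY"] = "hf_placeholder"
```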

### OpenAI Chat Completion Models

| Model Name | Function Call | Required OS Variables |
|------------------|----------------------------------------|--------------------------------------|
| gpt-3.5-turbo | `completion('gpt-3.5-turbo', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-0301 | `completion('gpt-3.5-turbo-0301', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-0613 | `completion('gpt-3.5-turbo-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-16k | `completion('gpt-3.5-turbo-16k', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-16k-0613 | `completion('gpt-3.5-turbo-16k-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4 | `completion('gpt-4', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-0314 | `completion('gpt-4-0314', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-0613 | `completion('gpt-4-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-32k | `completion('gpt-4-32k', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-32k-0314 | `completion('gpt-4-32k-0314', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-32k-0613 | `completion('gpt-4-32k-0613', messages)` | `os.environ['OPENAI_API_KEY']` |

These also support the `OPENAI_API_BASE` environment variable, which can be used to specify a custom API endpoint.
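
For instance, `OPENAI_API_BASE` can point these calls at any OpenAI-compatible endpoint; a sketch (the proxy URL is a made-up example):

```python
import os

# OPENAI_API_BASE overrides the default OpenAI endpoint; the URL below
# is a hypothetical OpenAI-compatible proxy, not a real service.
os.environ["OPENAI_API_KEY"] = "sk-placeholder"
os.environ["OPENAI_API_BASE"] = "https://my-openai-proxy.example.com/v1"
```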

### Azure OpenAI Chat Completion Models

| Model Name | Function Call | Required OS Variables |
|------------------|-----------------------------------------|-------------------------------------------|
| gpt-3.5-turbo | `completion('gpt-3.5-turbo', messages, custom_llm_provider="azure")` | `os.environ['AZURE_API_KEY']`,`os.environ['AZURE_API_BASE']`,`os.environ['AZURE_API_VERSION']` |
| gpt-4 | `completion('gpt-4', messages, custom_llm_provider="azure")` | `os.environ['AZURE_API_KEY']`,`os.environ['AZURE_API_BASE']`,`os.environ['AZURE_API_VERSION']` |

### OpenAI Text Completion Models

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| text-davinci-003 | `completion('text-davinci-003', messages)` | `os.environ['OPENAI_API_KEY']` |
| ada-001 | `completion('ada-001', messages)` | `os.environ['OPENAI_API_KEY']` |
| curie-001 | `completion('curie-001', messages)` | `os.environ['OPENAI_API_KEY']` |
| babbage-001 | `completion('babbage-001', messages)` | `os.environ['OPENAI_API_KEY']` |
| babbage-002 | `completion('babbage-002', messages)` | `os.environ['OPENAI_API_KEY']` |
| davinci-002 | `completion('davinci-002', messages)` | `os.environ['OPENAI_API_KEY']` |

### Google VertexAI Models

Sample notebook for calling VertexAI models: https://github.com/BerriAI/litellm/blob/main/cookbook/liteLLM_VertextAI_Example.ipynb

All calls using Vertex AI require the following parameters:

* Your Project ID: `litellm.vertex_project` = "hardy-device-38811"
* Your Project Location: `litellm.vertex_location` = "us-central1"

Authentication:

VertexAI uses Application Default Credentials; see https://cloud.google.com/docs/authentication/external/set-up-adc for more information on setting this up.

VertexAI requires `application_default_credentials.json`, which can be created by running `gcloud auth application-default login` in your terminal.

| Model Name | Function Call |
|------------------|----------------------------------------------------------|
| chat-bison | `completion('chat-bison', messages)` |
| chat-bison@001 | `completion('chat-bison@001', messages)` |
| text-bison | `completion('text-bison', messages)` |
| text-bison@001 | `completion('text-bison@001', messages)` |
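
The two parameters above are set once, module-wide, before any call; a configuration sketch (assumes `litellm` is installed, and uses the sample project id from the text):

```python
import litellm

# One-time, module-level Vertex AI configuration; subsequent
# completion() calls for chat-bison / text-bison pick these up.
litellm.vertex_project = "hardy-device-38811"  # your GCP project id
litellm.vertex_location = "us-central1"        # your GCP region
```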

### Anthropic Models

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| claude-instant-1 | `completion('claude-instant-1', messages)` | `os.environ['ANTHROPIC_API_KEY']` |
| claude-instant-1.2 | `completion('claude-instant-1.2', messages)` | `os.environ['ANTHROPIC_API_KEY']` |
| claude-2 | `completion('claude-2', messages)` | `os.environ['ANTHROPIC_API_KEY']` |

### Hugging Face Inference API

All [`text2text-generation`](https://huggingface.co/models?library=transformers&pipeline_tag=text2text-generation&sort=downloads) and [`text-generation`](https://huggingface.co/models?library=transformers&pipeline_tag=text-generation&sort=downloads) models are supported by liteLLM. You can use any text model from Hugging Face with the following steps:

* Copy the `model repo` ID from Hugging Face and set it as the `model` parameter in the completion call.
* Set the `custom_llm_provider` parameter to `"huggingface"`.
* Make sure to set your Hugging Face API key.

Here are some examples of supported models:

**Note that the models mentioned in the table are examples, and you can use any text model available on Hugging Face by following the steps above.**

| Model Name | Function Call | Required OS Variables |
|------------------|-------------------------------------------------------------------------------------|--------------------------------------|
| [stabilityai/stablecode-completion-alpha-3b-4k](https://huggingface.co/stabilityai/stablecode-completion-alpha-3b-4k) | `completion(model="stabilityai/stablecode-completion-alpha-3b-4k", messages=messages, custom_llm_provider="huggingface")` | `os.environ['HUGGINGFACE_API_KEY']` |
| [bigcode/starcoder](https://huggingface.co/bigcode/starcoder) | `completion(model="bigcode/starcoder", messages=messages, custom_llm_provider="huggingface")` | `os.environ['HUGGINGFACE_API_KEY']` |
| [google/flan-t5-xxl](https://huggingface.co/google/flan-t5-xxl) | `completion(model="google/flan-t5-xxl", messages=messages, custom_llm_provider="huggingface")` | `os.environ['HUGGINGFACE_API_KEY']` |
| [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) | `completion(model="google/flan-t5-large", messages=messages, custom_llm_provider="huggingface")` | `os.environ['HUGGINGFACE_API_KEY']` |
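
The steps above reduce to two extra arguments on the call; a sketch of the argument set (the repo id is one of the examples from the table):

```python
# Arguments liteLLM needs for a Hugging Face Inference API model:
# the HF repo id as `model`, plus custom_llm_provider="huggingface".
hf_call_kwargs = {
    "model": "google/flan-t5-xxl",        # any HF text model repo id
    "custom_llm_provider": "huggingface",
    "messages": [{"role": "user", "content": "Hello, how are you?"}],
}
```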

### Replicate Models

liteLLM supports all Replicate LLMs. For Replicate models, add a `replicate/` prefix to the `model` arg; liteLLM uses this prefix to detect the provider.

Below are examples of how to call Replicate LLMs using liteLLM:

| Model Name | Function Call | Required OS Variables |
|-----------------------------|----------------------------------------------------------------|--------------------------------------|
| replicate/llama-2-70b-chat | `completion('replicate/replicate/llama-2-70b-chat', messages)` | `os.environ['REPLICATE_API_KEY']` |
| a16z-infra/llama-2-13b-chat | `completion('replicate/a16z-infra/llama-2-13b-chat', messages)` | `os.environ['REPLICATE_API_KEY']` |
| joehoover/instructblip-vicuna13b | `completion('replicate/joehoover/instructblip-vicuna13b', messages)` | `os.environ['REPLICATE_API_KEY']` |
| replicate/dolly-v2-12b | `completion('replicate/replicate/dolly-v2-12b', messages)` | `os.environ['REPLICATE_API_KEY']` |
| a16z-infra/llama-2-7b-chat | `completion('replicate/a16z-infra/llama-2-7b-chat', messages)` | `os.environ['REPLICATE_API_KEY']` |
| replicate/vicuna-13b | `completion('replicate/replicate/vicuna-13b', messages)` | `os.environ['REPLICATE_API_KEY']` |
| daanelson/flan-t5-large | `completion('replicate/daanelson/flan-t5-large', messages)` | `os.environ['REPLICATE_API_KEY']` |
| replit/replit-code-v1-3b | `completion('replicate/replit/replit-code-v1-3b', messages)` | `os.environ['REPLICATE_API_KEY']` |
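
The prefix rule means the final model string is just the provider tag joined to the Replicate owner/model path; a sketch:

```python
# liteLLM routes to Replicate when the model string starts with
# "replicate/"; the remainder is the Replicate owner/model path.
repo = "a16z-infra/llama-2-13b-chat"  # example repo from the table
model = "replicate/" + repo
print(model)  # -> replicate/a16z-infra/llama-2-13b-chat
```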

### AI21 Models

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| j2-light | `completion('j2-light', messages)` | `os.environ['AI21_API_KEY']` |
| j2-mid | `completion('j2-mid', messages)` | `os.environ['AI21_API_KEY']` |
| j2-ultra | `completion('j2-ultra', messages)` | `os.environ['AI21_API_KEY']` |

### Cohere Models

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| command-nightly | `completion('command-nightly', messages)` | `os.environ['COHERE_API_KEY']` |

### Together AI Models

liteLLM supports `non-streaming` and `streaming` requests to all models on https://api.together.xyz/

Example TogetherAI usage. Note: liteLLM supports all models deployed on TogetherAI.

| Model Name | Function Call | Required OS Variables |
|-----------------------------------|------------------------------------------------------------------------|---------------------------------|
| togethercomputer/llama-2-70b-chat | `completion('togethercomputer/llama-2-70b-chat', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/LLaMA-2-13b-chat | `completion('togethercomputer/LLaMA-2-13b-chat', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/code-and-talk-v1 | `completion('togethercomputer/code-and-talk-v1', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/creative-v1 | `completion('togethercomputer/creative-v1', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/yourmodel | `completion('togethercomputer/yourmodel', messages)` | `os.environ['TOGETHERAI_API_KEY']` |

### AWS Sagemaker Models

https://aws.amazon.com/sagemaker/ Use liteLLM to easily call custom LLMs on Sagemaker.

#### Requirements for using Sagemaker with LiteLLM

* `pip install boto3`
* Set the following AWS credentials as .env variables (Sagemaker auth: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html)
    * AWS_ACCESS_KEY_ID
    * AWS_SECRET_ACCESS_KEY
    * AWS_REGION_NAME

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|------------------------------------|
| Llama2 7B | `completion(model='sagemaker/jumpstart-dft-meta-textgeneration-llama-2-7b', messages=messages)` | `os.environ['AWS_ACCESS_KEY_ID']`, `os.environ['AWS_SECRET_ACCESS_KEY']`, `os.environ['AWS_REGION_NAME']` |
| Custom LLM Endpoint | `completion(model='sagemaker/your-endpoint', messages=messages)` | `os.environ['AWS_ACCESS_KEY_ID']`, `os.environ['AWS_SECRET_ACCESS_KEY']`, `os.environ['AWS_REGION_NAME']` |

### Aleph Alpha Models

https://www.aleph-alpha.com/

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|------------------------------------|
| luminous-base | `completion(model='luminous-base', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-extended | `completion(model='luminous-extended', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-supreme | `completion(model='luminous-supreme', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-base-control | `completion(model='luminous-base-control', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-extended-control | `completion(model='luminous-extended-control', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-supreme-control | `completion(model='luminous-supreme-control', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |

### Baseten Models

Baseten provides infrastructure to deploy and serve ML models https://www.baseten.co/. Use liteLLM to easily call models deployed on Baseten.

Example Baseten usage. Note: liteLLM supports all models deployed on Baseten.

Usage: Pass `model=baseten/<Model ID>`

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|------------------------------------|
| Falcon 7B | `completion(model='baseten/qvv0xeq', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
| Wizard LM | `completion(model='baseten/q841o8w', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
| MPT 7B Base | `completion(model='baseten/31dxrj3', messages=messages)` | `os.environ['BASETEN_API_KEY']` |

### OpenRouter Completion Models

All the text models from [OpenRouter](https://openrouter.ai/docs) are supported by liteLLM.

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| openai/gpt-3.5-turbo | `completion('openai/gpt-3.5-turbo', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| openai/gpt-3.5-turbo-16k | `completion('openai/gpt-3.5-turbo-16k', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| openai/gpt-4 | `completion('openai/gpt-4', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| openai/gpt-4-32k | `completion('openai/gpt-4-32k', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| anthropic/claude-2 | `completion('anthropic/claude-2', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| anthropic/claude-instant-v1 | `completion('anthropic/claude-instant-v1', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| google/palm-2-chat-bison | `completion('google/palm-2-chat-bison', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| google/palm-2-codechat-bison | `completion('google/palm-2-codechat-bison', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| meta-llama/llama-2-13b-chat | `completion('meta-llama/llama-2-13b-chat', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |
| meta-llama/llama-2-70b-chat | `completion('meta-llama/llama-2-70b-chat', messages)` | `os.environ['OR_SITE_URL']`,`os.environ['OR_APP_NAME']`,`os.environ['OPENROUTER_API_KEY']` |

### Petals Models

Supported models on https://chat.petals.dev/

| Model Name | Function Call | Required OS Variables |
|----------------------|------------------------------------------------------------------------|--------------------------------|
| stabilityai/StableBeluga2 | `completion(model='stabilityai/StableBeluga2', messages, custom_llm_provider="petals")` | No API Key required |
| enoch/llama-65b-hf | `completion(model='enoch/llama-65b-hf', messages, custom_llm_provider="petals")` | No API Key required |
| bigscience/bloomz | `completion(model='bigscience/bloomz', messages, custom_llm_provider="petals")` | No API Key required |

### Ollama Models

Ollama supported models: https://github.com/jmorganca/ollama

| Model Name | Function Call | Required OS Variables |
|----------------------|-----------------------------------------------------------------------------------|--------------------------------|
| Llama2 7B | `completion(model='llama2', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Llama2 13B | `completion(model='llama2:13b', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Llama2 70B | `completion(model='llama2:70b', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Llama2 Uncensored | `completion(model='llama2-uncensored', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Orca Mini | `completion(model='orca-mini', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Vicuna | `completion(model='vicuna', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Nous-Hermes | `completion(model='nous-hermes', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Nous-Hermes 13B | `completion(model='nous-hermes:13b', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Wizard Vicuna Uncensored | `completion(model='wizard-vicuna', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |

19 docs/my-website/docs/providers/ai21.md Normal file

@@ -0,0 +1,19 @@
# AI21

LiteLLM supports j2-light, j2-mid and j2-ultra from [AI21](https://www.ai21.com/studio/pricing).

They're available to use without a waitlist.

### API KEYS
```python
import os
os.environ["AI21_API_KEY"] = ""
```

### AI21 Models

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| j2-light | `completion('j2-light', messages)` | `os.environ['AI21_API_KEY']` |
| j2-mid | `completion('j2-mid', messages)` | `os.environ['AI21_API_KEY']` |
| j2-ultra | `completion('j2-ultra', messages)` | `os.environ['AI21_API_KEY']` |

23 docs/my-website/docs/providers/aleph_alpha.md Normal file

@@ -0,0 +1,23 @@
# Aleph Alpha

LiteLLM supports the 'luminous' + 'luminous-control' series of models from [Aleph Alpha](https://www.aleph-alpha.com/).

Like AI21 and Cohere, you can use these models without a waitlist.

### API KEYS
```python
import os
os.environ["ALEPHALPHA_API_KEY"] = ""
```

### Aleph Alpha Models
https://www.aleph-alpha.com/

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|------------------------------------|
| luminous-base | `completion(model='luminous-base', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-base-control | `completion(model='luminous-base-control', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-extended | `completion(model='luminous-extended', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-extended-control | `completion(model='luminous-extended-control', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-supreme | `completion(model='luminous-supreme', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |
| luminous-supreme-control | `completion(model='luminous-supreme-control', messages=messages)` | `os.environ['ALEPHALPHA_API_KEY']` |

18 docs/my-website/docs/providers/anthropic.md Normal file

@@ -0,0 +1,18 @@
# Anthropic
LiteLLM supports claude-instant-1, claude-instant-1.2 and claude-2.

### API KEYS
```python
import os

os.environ["ANTHROPIC_API_KEY"] = ""
```

### Model Details

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| claude-instant-1 | `completion('claude-instant-1', messages)` | `os.environ['ANTHROPIC_API_KEY']` |
| claude-instant-1.2 | `completion('claude-instant-1.2', messages)` | `os.environ['ANTHROPIC_API_KEY']` |
| claude-2 | `completion('claude-2', messages)` | `os.environ['ANTHROPIC_API_KEY']` |

19 docs/my-website/docs/providers/aws_sagemaker.md Normal file

@@ -0,0 +1,19 @@
# AWS Sagemaker
LiteLLM supports Llama2 on Sagemaker.

### API KEYS
```python
# pip install boto3
import os

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
```

### AWS Sagemaker Models
Here's an example of using a Sagemaker model with LiteLLM:

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|------------------------------------|
| Llama2 7B | `completion(model='sagemaker/jumpstart-dft-meta-textgeneration-llama-2-7b', messages=messages)` | `os.environ['AWS_ACCESS_KEY_ID']`, `os.environ['AWS_SECRET_ACCESS_KEY']`, `os.environ['AWS_REGION_NAME']` |
| Custom LLM Endpoint | `completion(model='sagemaker/your-endpoint', messages=messages)` | `os.environ['AWS_ACCESS_KEY_ID']`, `os.environ['AWS_SECRET_ACCESS_KEY']`, `os.environ['AWS_REGION_NAME']` |

32 docs/my-website/docs/providers/azure.md Normal file

@@ -0,0 +1,32 @@
# Azure
LiteLLM supports Azure Chat + Embedding calls.

### API KEYS
```python
import os

os.environ["AZURE_API_KEY"] = ""
os.environ["AZURE_API_BASE"] = ""
os.environ["AZURE_API_VERSION"] = ""
```

### Azure OpenAI Chat Completion Models

```python
import os
from litellm import completion

## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["AZURE_API_KEY"] = "azure key"

messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# azure call
response = completion("azure/<your_deployment_id>", messages)
```

23 docs/my-website/docs/providers/baseten.md Normal file

@@ -0,0 +1,23 @@
# Baseten
LiteLLM supports any Text Generation Inference (TGI) model deployed on Baseten.

[Here's a tutorial on deploying a huggingface TGI model (Llama2, CodeLlama, WizardCoder, Falcon, etc.) on Baseten](https://truss.baseten.co/examples/performance/tgi-server)

### API KEYS
```python
import os
os.environ["BASETEN_API_KEY"] = ""
```

### Baseten Models
Baseten provides infrastructure to deploy and serve ML models https://www.baseten.co/. Use liteLLM to easily call models deployed on Baseten.

Example Baseten usage. Note: liteLLM supports all models deployed on Baseten.

Usage: Pass `model=baseten/<Model ID>`

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|------------------------------------|
| Falcon 7B | `completion(model='baseten/qvv0xeq', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
| Wizard LM | `completion(model='baseten/q841o8w', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
| MPT 7B Base | `completion(model='baseten/31dxrj3', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
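
The model argument is just the `baseten/` tag plus the deployment's model id; a sketch (the id is the Falcon 7B example from the table):

```python
# Baseten models are addressed as "baseten/<Model ID>"; the id comes
# from the deployment's page on Baseten.
model_id = "qvv0xeq"              # example id from the table (Falcon 7B)
model = f"baseten/{model_id}"
print(model)  # -> baseten/qvv0xeq
```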

32 docs/my-website/docs/providers/cohere.md Normal file

@@ -0,0 +1,32 @@
# Cohere

LiteLLM supports 'command', 'command-light', 'command-medium', 'command-medium-beta', 'command-xlarge-beta', 'command-nightly' models from [Cohere](https://cohere.com/).

Like AI21, these models are available without a waitlist.

### API KEYS

```python
import os
os.environ["COHERE_API_KEY"] = ""
```

### Example Usage

```python
import os
from litellm import completion

## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"

messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion("command-nightly", messages)
```

17 docs/my-website/docs/providers/ollama.md Normal file

@@ -0,0 +1,17 @@
# Ollama
LiteLLM supports all models from [Ollama](https://github.com/jmorganca/ollama)

### Ollama Models
Ollama supported models: https://github.com/jmorganca/ollama

| Model Name | Function Call | Required OS Variables |
|----------------------|-----------------------------------------------------------------------------------|--------------------------------|
| Llama2 7B | `completion(model='llama2', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Llama2 13B | `completion(model='llama2:13b', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Llama2 70B | `completion(model='llama2:70b', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Llama2 Uncensored | `completion(model='llama2-uncensored', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Orca Mini | `completion(model='orca-mini', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Vicuna | `completion(model='vicuna', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Nous-Hermes | `completion(model='nous-hermes', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Nous-Hermes 13B | `completion(model='nous-hermes:13b', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
| Wizard Vicuna Uncensored | `completion(model='wizard-vicuna', messages, api_base="http://localhost:11434", custom_llm_provider="ollama", stream=True)` | No API Key required |
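
Every call in the table passes `stream=True`, so the return value is an iterator of chunks rather than a single response. A stub generator sketches the consumption pattern; real chunks come from `completion(...)`, and the OpenAI-style `delta` shape here is an assumption for illustration:

```python
# Stand-in for the iterator returned by completion(..., stream=True);
# each chunk carries an incremental piece of the reply.
def fake_stream():
    for piece in ["Hel", "lo", "!"]:
        yield {"choices": [{"delta": {"content": piece}}]}

reply = "".join(chunk["choices"][0]["delta"]["content"] for chunk in fake_stream())
print(reply)  # -> Hello!
```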

38 docs/my-website/docs/providers/openai.md Normal file

@@ -0,0 +1,38 @@
# OpenAI
LiteLLM supports OpenAI Chat + Text completion and embedding calls.

### API KEYS
```python
import os

os.environ["OPENAI_API_KEY"] = ""
```

### OpenAI Chat Completion Models

| Model Name | Function Call | Required OS Variables |
|------------------|----------------------------------------|--------------------------------------|
| gpt-3.5-turbo | `completion('gpt-3.5-turbo', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-0301 | `completion('gpt-3.5-turbo-0301', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-0613 | `completion('gpt-3.5-turbo-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-16k | `completion('gpt-3.5-turbo-16k', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-3.5-turbo-16k-0613 | `completion('gpt-3.5-turbo-16k-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4 | `completion('gpt-4', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-0314 | `completion('gpt-4-0314', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-0613 | `completion('gpt-4-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-32k | `completion('gpt-4-32k', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-32k-0314 | `completion('gpt-4-32k-0314', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4-32k-0613 | `completion('gpt-4-32k-0613', messages)` | `os.environ['OPENAI_API_KEY']` |

These also support the `OPENAI_API_BASE` environment variable, which can be used to specify a custom API endpoint.

### OpenAI Text Completion Models

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| text-davinci-003 | `completion('text-davinci-003', messages)` | `os.environ['OPENAI_API_KEY']` |
| ada-001 | `completion('ada-001', messages)` | `os.environ['OPENAI_API_KEY']` |
| curie-001 | `completion('curie-001', messages)` | `os.environ['OPENAI_API_KEY']` |
| babbage-001 | `completion('babbage-001', messages)` | `os.environ['OPENAI_API_KEY']` |
| babbage-002 | `completion('babbage-002', messages)` | `os.environ['OPENAI_API_KEY']` |
| davinci-002 | `completion('davinci-002', messages)` | `os.environ['OPENAI_API_KEY']` |

24 docs/my-website/docs/providers/openrouter.md Normal file

@@ -0,0 +1,24 @@
# OpenRouter
|
||||
LiteLLM supports all the text models from [OpenRouter](https://openrouter.ai/docs)
|
||||
|
||||
### API KEYS
|
||||
|
||||
```python
|
||||
import os
|
||||
os.environ["OPENROUTER_API_KEYS"] = ""
|
||||
```
### OpenRouter Completion Models

| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|--------------------------------------|
| openai/gpt-3.5-turbo | `completion('openai/gpt-3.5-turbo', messages)` | `os.environ['OR_SITE_URL']`, `os.environ['OR_APP_NAME']`, `os.environ['OPENROUTER_API_KEY']` |
| openai/gpt-3.5-turbo-16k | `completion('openai/gpt-3.5-turbo-16k', messages)` | `os.environ['OR_SITE_URL']`, `os.environ['OR_APP_NAME']`, `os.environ['OPENROUTER_API_KEY']` |
| openai/gpt-4 | `completion('openai/gpt-4', messages)` | `os.environ['OR_SITE_URL']`, `os.environ['OR_APP_NAME']`, `os.environ['OPENROUTER_API_KEY']` |
| openai/gpt-4-32k | `completion('openai/gpt-4-32k', messages)` | `os.environ['OR_SITE_URL']`, `os.environ['OR_APP_NAME']`, `os.environ['OPENROUTER_API_KEY']` |
| anthropic/claude-2 | `completion('anthropic/claude-2', messages)` | `os.environ['OR_SITE_URL']`, `os.environ['OR_APP_NAME']`, `os.environ['OPENROUTER_API_KEY']` |
| anthropic/claude-instant-v1 | `completion('anthropic/claude-instant-v1', messages)` | `os.environ['OR_SITE_URL']`, `os.environ['OR_APP_NAME']`, `os.environ['OPENROUTER_API_KEY']` |
| google/palm-2-chat-bison | `completion('google/palm-2-chat-bison', messages)` | `os.environ['OR_SITE_URL']`, `os.environ['OR_APP_NAME']`, `os.environ['OPENROUTER_API_KEY']` |
| google/palm-2-codechat-bison | `completion('google/palm-2-codechat-bison', messages)` | `os.environ['OR_SITE_URL']`, `os.environ['OR_APP_NAME']`, `os.environ['OPENROUTER_API_KEY']` |
| meta-llama/llama-2-13b-chat | `completion('meta-llama/llama-2-13b-chat', messages)` | `os.environ['OR_SITE_URL']`, `os.environ['OR_APP_NAME']`, `os.environ['OPENROUTER_API_KEY']` |
| meta-llama/llama-2-70b-chat | `completion('meta-llama/llama-2-70b-chat', messages)` | `os.environ['OR_SITE_URL']`, `os.environ['OR_APP_NAME']`, `os.environ['OPENROUTER_API_KEY']` |
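A short, hedged example of calling an OpenRouter model via liteLLM. The placeholder values, the key guard, and the printed response field are assumptions for illustration:

```python
import os

# Placeholders; OR_SITE_URL / OR_APP_NAME identify your app to OpenRouter.
os.environ["OR_SITE_URL"] = ""
os.environ["OR_APP_NAME"] = ""
os.environ["OPENROUTER_API_KEY"] = ""

messages = [{"role": "user", "content": "What is OpenRouter?"}]

# Hypothetical guard: only make the live call when a real key is set.
if os.environ["OPENROUTER_API_KEY"]:
    from litellm import completion
    response = completion("openai/gpt-3.5-turbo", messages)
    print(response["choices"][0]["message"]["content"])
```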
25  docs/my-website/docs/providers/replicate.md  Normal file

@@ -0,0 +1,25 @@
# Replicate

LiteLLM supports all models on Replicate

### API KEYS

```python
import os

os.environ["REPLICATE_API_KEY"] = ""
```
|
||||
|
||||
|
||||
### Replicate Models
|
||||
liteLLM supports all replicate LLMs. For replicate models ensure to add a `replicate` prefix to the `model` arg. liteLLM detects it using this arg.
|
||||
Below are examples on how to call replicate LLMs using liteLLM
|
||||
|
||||
Model Name | Function Call | Required OS Variables |
|
||||
-----------------------------|----------------------------------------------------------------|--------------------------------------|
|
||||
replicate/llama-2-70b-chat | `completion('replicate/replicate/llama-2-70b-chat', messages)` | `os.environ['REPLICATE_API_KEY']` |
|
||||
a16z-infra/llama-2-13b-chat| `completion('replicate/a16z-infra/llama-2-13b-chat', messages)`| `os.environ['REPLICATE_API_KEY']` |
|
||||
joehoover/instructblip-vicuna13b | `completion('replicate/joehoover/instructblip-vicuna13b', messages)` | `os.environ['REPLICATE_API_KEY']` |
|
||||
replicate/dolly-v2-12b | `completion('replicate/replicate/dolly-v2-12b', messages)` | `os.environ['REPLICATE_API_KEY']` |
|
||||
a16z-infra/llama-2-7b-chat | `completion('replicate/a16z-infra/llama-2-7b-chat', messages)` | `os.environ['REPLICATE_API_KEY']` |
|
||||
replicate/vicuna-13b | `completion('replicate/replicate/vicuna-13b', messages)` | `os.environ['REPLICATE_API_KEY']` |
|
||||
daanelson/flan-t5-large | `completion('replicate/daanelson/flan-t5-large', messages)` | `os.environ['REPLICATE_API_KEY']` |
|
||||
replit/replit-code-v1-3b | `completion('replicate/replit/replit-code-v1-3b', messages)` | `os.environ['REPLICATE_API_KEY']` |
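A hedged sketch showing the `replicate/` prefix in practice. The key guard and printed field are illustrative assumptions, not part of the original docs:

```python
import os

# Placeholder key; paste your real Replicate key here to make a live call.
os.environ["REPLICATE_API_KEY"] = ""

messages = [{"role": "user", "content": "Write a haiku about the sea"}]

# Hypothetical guard: skip the network call while the key is a placeholder.
if os.environ["REPLICATE_API_KEY"]:
    from litellm import completion
    # Note the required "replicate/" prefix on the model name.
    response = completion("replicate/replicate/llama-2-70b-chat", messages)
    print(response["choices"][0]["message"]["content"])
```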
22  docs/my-website/docs/providers/togetherai.md  Normal file

@@ -0,0 +1,22 @@
# Together AI

LiteLLM supports all models on Together AI.

### API KEYS

```python
import os

os.environ["TOGETHERAI_API_KEY"] = ""
```
### Together AI Models

liteLLM supports `non-streaming` and `streaming` requests to all models on https://api.together.xyz/

Example Together AI usage. Note: liteLLM supports all models deployed on Together AI.

| Model Name | Function Call | Required OS Variables |
|-----------------------------------|-------------------------------------------------------------|------------------------------------|
| togethercomputer/llama-2-70b-chat | `completion('togethercomputer/llama-2-70b-chat', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/LLaMA-2-13b-chat | `completion('togethercomputer/LLaMA-2-13b-chat', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/code-and-talk-v1 | `completion('togethercomputer/code-and-talk-v1', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/creative-v1 | `completion('togethercomputer/creative-v1', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
| togethercomputer/yourmodel | `completion('togethercomputer/yourmodel', messages)` | `os.environ['TOGETHERAI_API_KEY']` |
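Since streaming is supported for these models, here is a hedged sketch of a streaming call. The key guard and the OpenAI-style `delta` chunk access are assumptions for illustration:

```python
import os

# Placeholder key; paste your real Together AI key here to make a live call.
os.environ["TOGETHERAI_API_KEY"] = ""

messages = [{"role": "user", "content": "Tell me a short joke"}]

# Hypothetical guard: only stream from the API when a real key is set.
if os.environ["TOGETHERAI_API_KEY"]:
    from litellm import completion
    response = completion("togethercomputer/llama-2-70b-chat", messages, stream=True)
    for chunk in response:
        # Streaming chunks are assumed to follow the OpenAI delta format.
        print(chunk["choices"][0]["delta"].get("content", ""), end="")
```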
24  docs/my-website/docs/providers/vertex.md  Normal file

@@ -0,0 +1,24 @@
# Google Palm/VertexAI

LiteLLM supports chat-bison, chat-bison@001, text-bison, text-bison@001

### Google VertexAI Models

Sample notebook for calling VertexAI models: https://github.com/BerriAI/litellm/blob/main/cookbook/liteLLM_VertextAI_Example.ipynb

All calls using Vertex AI require the following parameters:

* Your Project ID: `litellm.vertex_project` = "hardy-device-38811"
* Your Project Location: `litellm.vertex_location` = "us-central1"

Authentication:

VertexAI uses Application Default Credentials; see https://cloud.google.com/docs/authentication/external/set-up-adc for more information on setting this up.

VertexAI requires you to set `application_default_credentials.json`; this can be done by running `gcloud auth application-default login` in your terminal.
| Model Name | Function Call |
|------------------|----------------------------------------------------------|
| chat-bison | `completion('chat-bison', messages)` |
| chat-bison@001 | `completion('chat-bison@001', messages)` |
| text-bison | `completion('text-bison', messages)` |
| text-bison@001 | `completion('text-bison@001', messages)` |
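The project ID and location parameters above can be sketched end to end as follows. The `RUN_LIVE_CALL` flag and the printed field are illustrative assumptions; flip the flag once Application Default Credentials are configured:

```python
messages = [{"role": "user", "content": "What is Vertex AI?"}]

# Hypothetical flag: set to True after running
# `gcloud auth application-default login` with a real project.
RUN_LIVE_CALL = False

if RUN_LIVE_CALL:
    import litellm
    from litellm import completion

    litellm.vertex_project = "hardy-device-38811"  # your project ID
    litellm.vertex_location = "us-central1"        # your project location

    response = completion("chat-bison", messages)
    print(response["choices"][0]["message"]["content"])
```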
@@ -29,18 +29,17 @@ const sidebars = {
        label: "Embedding()",
        items: ["embedding/supported_embedding"],
      },
      'completion/supported',
      // {
      //   type: "category",
      //   label: "Providers",
      //   link: {
      //     type: 'generated-index',
      //     title: 'Providers',
      //     description: 'Learn how to deploy + call models from different providers on LiteLLM',
      //     slug: '/providers',
      //   },
      //   items: ["providers/huggingface"],
      // },
      {
        type: "category",
        label: "Supported Models + Providers",
        link: {
          type: 'generated-index',
          title: 'Providers',
          description: 'Learn how to deploy + call models from different providers on LiteLLM',
          slug: '/providers',
        },
        items: ["providers/huggingface", "providers/openai", "providers/azure", "providers/vertex", "providers/anthropic", "providers/ai21", "providers/replicate", "providers/cohere", "providers/togetherai", "providers/aws_sagemaker", "providers/aleph_alpha", "providers/baseten", "providers/openrouter", "providers/ollama"]
      },
      "token_usage",
      "exception_mapping",
      'debugging/local_debugging',