forked from phoenix/litellm-mirror
adding docs
This commit is contained in:
parent
6aff47083b
commit
1ca4d6ff8c
1 changed file with 73 additions and 3 deletions
@@ -24,10 +24,11 @@ completion(model=model, messages=messages, custom_llm_provider="huggingface")
```

What's happening?

- model: This is the name of the deployed model on Huggingface.
- messages: This is the input. We accept the OpenAI chat format. For Huggingface, by default we iterate through the list and add each message["content"] to the prompt. [Relevant Code](https://github.com/BerriAI/litellm/blob/6aff47083be659b80e00cb81eb783cb24db2e183/litellm/llms/huggingface_restapi.py#L46)
- custom_llm_provider: Optional param, needed only for Azure, Replicate, Huggingface and Together-ai (platforms where you deploy your own models). It lets LiteLLM route your call to the right provider for your model (see the sketch below).

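To make the prompt construction concrete, here's a minimal sketch of that iteration. The `messages_to_prompt` helper is ours for illustration, not part of LiteLLM's API; the linked code above is the authoritative version.

```
# Illustrative sketch (not LiteLLM's actual code): flatten an
# OpenAI-style messages list into a single Huggingface prompt string.
def messages_to_prompt(messages):
    prompt = ""
    for message in messages:
        prompt += message["content"]
    return prompt

messages = [{"role": "user", "content": "Hey, how's it going?"}]
print(messages_to_prompt(messages))  # -> "Hey, how's it going?"
```
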
### Case 2: Call Llama2 public Huggingface endpoint

We've deployed `meta-llama/Llama-2-7b-hf` behind a public endpoint - `https://ag3dkq4zui5nu8g3.us-east-1.aws.endpoints.huggingface.cloud`.

@@ -43,3 +44,72 @@ custom_api_base = "https://ag3dkq4zui5nu8g3.us-east-1.aws.endpoints.huggingface.cloud"
completion(model=model, messages=messages, custom_llm_provider="huggingface", custom_api_base=custom_api_base)
```

What's happening?

- custom_api_base: Optional param. Since this uses a deployed endpoint (not the [default huggingface inference endpoint](https://github.com/BerriAI/litellm/blob/6aff47083be659b80e00cb81eb783cb24db2e183/litellm/llms/huggingface_restapi.py#L35)), we pass that to LiteLLM (see the sketch below).

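For intuition, here's a sketch of how that choice can be expressed. The function name and the default URL are illustrative assumptions, not LiteLLM's actual internals; see the linked code for the real default.

```
# Illustrative sketch (assumption, not LiteLLM's actual code): use the
# deployed endpoint when given, else fall back to a default inference API.
DEFAULT_HF_API_BASE = "https://api-inference.huggingface.co/models"  # assumed default

def resolve_api_base(model, custom_api_base=None):
    if custom_api_base:
        return custom_api_base
    return f"{DEFAULT_HF_API_BASE}/{model}"

print(resolve_api_base("meta-llama/Llama-2-7b-hf",
                       custom_api_base="https://ag3dkq4zui5nu8g3.us-east-1.aws.endpoints.huggingface.cloud"))
```
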
### Case 3: Call Llama2 private Huggingface endpoint

The only difference between this and the public endpoint is that you need an `api_key` for this.

On LiteLLM there are 3 ways you can pass in an api_key: via environment variables, by setting it as a package variable, or when calling `completion()`.

**Setting via environment variables**

Here's the 1 line of code you need to add:

```
os.environ["HF_TOKEN"] = "..."
```

Here's the full code:

```
import os
from litellm import completion

os.environ["HF_TOKEN"] = "..."

model = "meta-llama/Llama-2-7b-hf"
messages = [{"role": "user", "content": "Hey, how's it going?"}] # LiteLLM follows the OpenAI format
custom_api_base = "https://ag3dkq4zui5nu8g3.us-east-1.aws.endpoints.huggingface.cloud"

### CALLING ENDPOINT
completion(model=model, messages=messages, custom_llm_provider="huggingface", custom_api_base=custom_api_base)
```

**Setting it as package variable**

Here's the 1 line of code you need to add:

```
litellm.huggingface_key = "..."
```

Here's the full code:

```
import litellm
from litellm import completion

litellm.huggingface_key = "..."

model = "meta-llama/Llama-2-7b-hf"
messages = [{"role": "user", "content": "Hey, how's it going?"}] # LiteLLM follows the OpenAI format
custom_api_base = "https://ag3dkq4zui5nu8g3.us-east-1.aws.endpoints.huggingface.cloud"

### CALLING ENDPOINT
completion(model=model, messages=messages, custom_llm_provider="huggingface", custom_api_base=custom_api_base)
```

**Passed in during completion call**

```
completion(..., api_key="...")
```

Here's the full code:

```
from litellm import completion

model = "meta-llama/Llama-2-7b-hf"
messages = [{"role": "user", "content": "Hey, how's it going?"}] # LiteLLM follows the OpenAI format
custom_api_base = "https://ag3dkq4zui5nu8g3.us-east-1.aws.endpoints.huggingface.cloud"

### CALLING ENDPOINT
completion(model=model, messages=messages, custom_llm_provider="huggingface", custom_api_base=custom_api_base, api_key="...")
```
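
The doc doesn't say which source wins when several are set at once. A common convention, and purely an assumption here rather than confirmed LiteLLM behavior, is that the explicit argument beats the package variable, which beats the environment variable:

```
import os

# Hypothetical precedence (assumption, not confirmed LiteLLM behavior):
# explicit api_key arg > litellm.huggingface_key > HF_TOKEN env var.
def resolve_hf_key(api_key=None, package_key=None):
    return api_key or package_key or os.environ.get("HF_TOKEN")

print(resolve_hf_key(api_key="hf_abc"))  # -> "hf_abc"
```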