# Llama2 - Huggingface Tutorial

[Huggingface](https://huggingface.co/) is an open-source platform for deploying machine learning models.

## Call Llama2 with Huggingface Inference Endpoints

LiteLLM makes it easy to call your public, private, or the default Huggingface endpoints.

In this tutorial, let's try calling 3 models:

- `deepset/deberta-v3-large-squad2`: calls the default Huggingface endpoint
- `meta-llama/Llama-2-7b-hf`: calls a public endpoint
- `meta-llama/Llama-2-7b-chat-hf`: calls your private endpoint

### Case 1: Call default Huggingface endpoint

Here's the complete example:

```
from litellm import completion

model = "deepset/deberta-v3-large-squad2"
messages = [{"role": "user", "content": "Hey, how's it going?"}]  # LiteLLM follows the OpenAI format

### CALLING ENDPOINT
completion(model=model, messages=messages, custom_llm_provider="huggingface")
```

What's happening?

- `model`: the name of the deployed model on Huggingface.
- `messages`: the input. We accept the OpenAI chat format. For Huggingface, by default we iterate through the list and add each `message["content"]` to the prompt.

### Case 2: Call Llama2 public endpoint

We've deployed `meta-llama/Llama-2-7b-hf` behind a public endpoint: `https://ag3dkq4zui5nu8g3.us-east-1.aws.endpoints.huggingface.cloud`.

Let's try it out:

```
from litellm import completion

model = "meta-llama/Llama-2-7b-hf"
messages = [{"role": "user", "content": "Hey, how's it going?"}]  # LiteLLM follows the OpenAI format
custom_api_base = "https://ag3dkq4zui5nu8g3.us-east-1.aws.endpoints.huggingface.cloud"

### CALLING ENDPOINT
completion(model=model, messages=messages, custom_llm_provider="huggingface", custom_api_base=custom_api_base)
```
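
To make the default prompt handling from Case 1 more concrete, here is a minimal sketch of how an OpenAI-style message list could be flattened into a single prompt string. The helper name `build_prompt` is hypothetical and is not part of LiteLLM; plain concatenation with no separator is an assumption based on the description above, and the real implementation may differ.

```python
from typing import Dict, List


def build_prompt(messages: List[Dict[str, str]]) -> str:
    """Hypothetical helper: concatenate each message's content into one prompt."""
    prompt = ""
    for message in messages:
        # Only the "content" field is used; "role" is ignored in this sketch.
        prompt += message["content"]
    return prompt


messages = [
    {"role": "system", "content": "You are a helpful assistant. "},
    {"role": "user", "content": "Hey, how's it going?"},
]

prompt = build_prompt(messages)
print(prompt)
```

Because the roles are dropped in this sketch, a multi-turn conversation reaches the model as one continuous block of text, which is worth keeping in mind when sending several messages to a model that was not trained on a chat template.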