Add 3.3 70B to Ollama inference provider (#681)
# What does this PR do?

Adds Llama 3.3 70B support to the Ollama inference provider.

## Test Plan

<details>
<summary>Manual</summary>

```bash
# 42GB to download
ollama pull llama3.3:70b

ollama run llama3.3:70b --keepalive 60m

export LLAMA_STACK_PORT=5000
pip install -e . \
  && llama stack build --template ollama --image-type conda \
  && llama stack run ./distributions/ollama/run.yaml \
    --port $LLAMA_STACK_PORT \
    --env INFERENCE_MODEL=Llama3.3-70B-Instruct \
    --env OLLAMA_URL=http://localhost:11434

# in a second terminal
export LLAMA_STACK_PORT=5000
llama-stack-client --endpoint http://localhost:$LLAMA_STACK_PORT \
  inference chat-completion \
  --model-id Llama3.3-70B-Instruct \
  --message "hello, what model are you?"
```

<img width="1221" alt="image" src="https://github.com/user-attachments/assets/dcffbdd9-94c8-4d47-9f95-4ef6c3756294" />
</details>

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
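For reference, the same request can also be issued from Python. This is a minimal sketch using the `llama-stack-client` package against the server started above; the exact response shape may differ across client versions:

```python
# Minimal sketch: assumes a llama-stack server is already running on
# localhost:5000 (as set up in the Test Plan above) and that the
# llama-stack-client Python package is installed.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")

response = client.inference.chat_completion(
    model_id="Llama3.3-70B-Instruct",
    messages=[{"role": "user", "content": "hello, what model are you?"}],
)
print(response.completion_message.content)
```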
parent fa371fdc9e
commit 21fb92d7cf
1 changed file with 4 additions and 0 deletions

```diff
@@ -100,6 +100,10 @@ model_aliases = [
         "llama3.2-vision:90b",
         CoreModelId.llama3_2_90b_vision_instruct.value,
     ),
+    build_model_alias(
+        "llama3.3:70b",
+        CoreModelId.llama3_3_70b_instruct.value,
+    ),
     # The Llama Guard models don't have their full fp16 versions
     # so we are going to alias their default version to the canonical SKU
     build_model_alias(
```
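Conceptually, each `build_model_alias` entry maps the canonical core model ID to the provider-specific Ollama tag, so a request for `Llama3.3-70B-Instruct` is served by the locally pulled `llama3.3:70b`. The sketch below illustrates that lookup with hypothetical field and function names; it is not the actual llama-stack implementation:

```python
# Hypothetical sketch of alias resolution; names are illustrative only,
# not the real llama-stack API.
from dataclasses import dataclass

@dataclass
class ModelAlias:
    provider_model_id: str  # the Ollama tag, e.g. "llama3.3:70b"
    core_model_id: str      # the canonical SKU, e.g. "Llama3.3-70B-Instruct"

ALIASES = [
    ModelAlias("llama3.3:70b", "Llama3.3-70B-Instruct"),
]

def resolve(model_id: str) -> str:
    """Map a requested model ID (canonical or provider-specific)
    to the tag the provider actually serves."""
    for alias in ALIASES:
        if model_id in (alias.core_model_id, alias.provider_model_id):
            return alias.provider_model_id
    raise ValueError(f"unknown model: {model_id}")

assert resolve("Llama3.3-70B-Instruct") == "llama3.3:70b"
```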