llama-stack-mirror/requirements.txt
IAN MILLER b57db11bed
feat: create dynamic model registration for OpenAI and Llama compat remote inference providers (#2745)
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The purpose of this task is to create a solution that can automatically
detect when new models are added, deprecated, or removed by the OpenAI
and Llama API providers, and automatically update the list of supported
models in Llama Stack.

This feature is vitally important to avoid missing new models and
having to edit the entries manually, so I created automation that
allows users to dynamically register (see the sketch after this list):
- any models from the OpenAI provider available at
[https://api.openai.com/v1/models](https://api.openai.com/v1/models)
that are not in
[https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/openai/models.py](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/openai/models.py)

- any models from the Llama API provider available at
[https://api.llama.com/v1/models](https://api.llama.com/v1/models) that
are not in
[https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/llama_openai_compat/models.py](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/llama_openai_compat/models.py)
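
To illustrate the idea, here is a minimal sketch of how the combined
list can be produced for an OpenAI-compatible provider. The names
`STATIC_MODEL_IDS` and `list_available_model_ids` are illustrative
only, not the actual provider code in this PR:

```python
# Hypothetical sketch: merge the static models.py entries with the
# provider's live model list fetched from GET /v1/models.
from openai import AsyncOpenAI

# Subset of the static entries checked into models.py (illustrative).
STATIC_MODEL_IDS = {"gpt-4o", "gpt-4o-mini"}

async def list_available_model_ids(api_key: str) -> set[str]:
    """Return the static config entries plus the provider's live models."""
    client = AsyncOpenAI(api_key=api_key)
    # The openai SDK paginates GET /v1/models; async iteration walks all pages.
    live_ids = {model.id async for model in client.models.list()}
    return STATIC_MODEL_IDS | live_ids
```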

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #2504

This PR is dependent on #2710.

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

1. Create a venv at the root of the Llama Stack directory:
`uv venv .venv --python 3.12 --seed`
2. Activate venv:
`source .venv/bin/activate`   
3. `uv pip install -e .`
4. Create an OpenAI distro by modifying run.yaml
5. Build distro:
`llama stack build --template starter --image-type venv`
6. Then run Llama Stack (first navigate to the templates/starter folder):
`OPENAI_API_KEY=<YOUR_KEY> ENABLE_OPENAI=openai llama stack run run.yaml
--image-type venv`
7. Then try to register a dummy LLM that doesn't exist in the OpenAI provider:
`llama-stack-client models register ianm/ianllm
--provider-model-id=ianllm --provider-id=openai`
 
You should receive the following output, combining the static config
with the available models fetched from OpenAI:
 
<img width="1380" height="474" alt="Screenshot 2025-07-14 at 12 48 50"
src="https://github.com/user-attachments/assets/d26aad18-6b15-49ee-9c49-b01b2d33f883"
/>

8. Then register a real LLM from OpenAI:
`llama-stack-client models register openai/gpt-4-turbo-preview
--provider-model-id=gpt-4-turbo-preview --provider-id=openai`

<img width="1253" height="613" alt="Screenshot 2025-07-14 at 13 43 02"
src="https://github.com/user-attachments/assets/60a5c9b1-3468-4eb9-9e92-cd7d21de3ca0"
/>
<img width="1288" height="655" alt="Screenshot 2025-07-14 at 13 43 11"
src="https://github.com/user-attachments/assets/c1e48871-0e24-4bd9-a0b8-8c95552a51ee"
/>

We correctly fetched all available models from OpenAI.
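
As an additional check, the registered models can be listed from the
llama-stack-client Python SDK as well; a minimal sketch, assuming a
locally running stack on the default port:

```python
# Verification sketch using the llama-stack-client SDK; the base_url
# assumes the server is running locally on the default port.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Dynamically registered OpenAI models should appear alongside the
# static entries from models.py.
for model in client.models.list():
    print(model.identifier, model.provider_id)
```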

As for the Llama API: as a non-US person I don't have access to a Llama
API key, but I have joined the waitlist. The implementation for Llama is
the same as for OpenAI, since the Llama API is OpenAI-compatible, so the
response from its GET models endpoint has the same structure as
OpenAI's: https://llama.developer.meta.com/docs/api/models
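
This is why a single code path covers both providers: each exposes
`GET /v1/models` returning the same list shape, so fetching the ids is
identical apart from the base URL. A hedged sketch (key names assumed
from the OpenAI API schema):

```python
# Sketch: fetch model ids from any OpenAI-compatible /v1/models endpoint.
# The response shape {"data": [{"id": ...}, ...]} follows the OpenAI schema.
import httpx

def fetch_model_ids(base_url: str, api_key: str) -> list[str]:
    resp = httpx.get(
        f"{base_url}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    resp.raise_for_status()
    return [entry["id"] for entry in resp.json()["data"]]

# The same call works for both providers:
# fetch_model_ids("https://api.openai.com", openai_key)
# fetch_model_ids("https://api.llama.com", llama_key)
```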
2025-07-16 12:49:38 -04:00


# This file was autogenerated by uv via the following command:
# uv export --frozen --no-hashes --no-emit-project --no-default-groups --output-file=requirements.txt
aiohappyeyeballs==2.5.0
# via aiohttp
aiohttp==3.12.13
# via llama-stack
aiosignal==1.3.2
# via aiohttp
aiosqlite==0.21.0
# via llama-stack
annotated-types==0.7.0
# via pydantic
anyio==4.8.0
# via
# httpx
# llama-api-client
# llama-stack-client
# openai
# starlette
asyncpg==0.30.0
# via llama-stack
attrs==25.1.0
# via
# aiohttp
# jsonschema
# referencing
certifi==2025.1.31
# via
# httpcore
# httpx
# requests
cffi==1.17.1 ; platform_python_implementation != 'PyPy'
# via cryptography
charset-normalizer==3.4.1
# via requests
click==8.1.8
# via
# llama-stack-client
# uvicorn
colorama==0.4.6 ; sys_platform == 'win32'
# via
# click
# tqdm
cryptography==45.0.5
# via python-jose
deprecated==1.2.18
# via
# opentelemetry-api
# opentelemetry-exporter-otlp-proto-http
# opentelemetry-semantic-conventions
distro==1.9.0
# via
# llama-api-client
# llama-stack-client
# openai
ecdsa==0.19.1
# via python-jose
fastapi==0.115.8
# via llama-stack
filelock==3.17.0
# via huggingface-hub
fire==0.7.0
# via
# llama-stack
# llama-stack-client
frozenlist==1.5.0
# via
# aiohttp
# aiosignal
fsspec==2024.12.0
# via huggingface-hub
googleapis-common-protos==1.67.0
# via opentelemetry-exporter-otlp-proto-http
h11==0.16.0
# via
# httpcore
# llama-stack
# uvicorn
hf-xet==1.1.5 ; platform_machine == 'aarch64' or platform_machine == 'amd64' or platform_machine == 'arm64' or platform_machine == 'x86_64'
# via huggingface-hub
httpcore==1.0.9
# via httpx
httpx==0.28.1
# via
# llama-api-client
# llama-stack
# llama-stack-client
# openai
huggingface-hub==0.33.0
# via llama-stack
idna==3.10
# via
# anyio
# httpx
# requests
# yarl
importlib-metadata==8.5.0
# via opentelemetry-api
jinja2==3.1.6
# via llama-stack
jiter==0.8.2
# via openai
jsonschema==4.23.0
# via llama-stack
jsonschema-specifications==2024.10.1
# via jsonschema
llama-api-client==0.1.2
# via llama-stack
llama-stack-client==0.2.15
# via llama-stack
markdown-it-py==3.0.0
# via rich
markupsafe==3.0.2
# via jinja2
mdurl==0.1.2
# via markdown-it-py
multidict==6.1.0
# via
# aiohttp
# yarl
numpy==2.2.3
# via pandas
openai==1.71.0
# via llama-stack
opentelemetry-api==1.30.0
# via
# opentelemetry-exporter-otlp-proto-http
# opentelemetry-sdk
# opentelemetry-semantic-conventions
opentelemetry-exporter-otlp-proto-common==1.30.0
# via opentelemetry-exporter-otlp-proto-http
opentelemetry-exporter-otlp-proto-http==1.30.0
# via llama-stack
opentelemetry-proto==1.30.0
# via
# opentelemetry-exporter-otlp-proto-common
# opentelemetry-exporter-otlp-proto-http
opentelemetry-sdk==1.30.0
# via
# llama-stack
# opentelemetry-exporter-otlp-proto-http
opentelemetry-semantic-conventions==0.51b0
# via opentelemetry-sdk
packaging==24.2
# via huggingface-hub
pandas==2.2.3
# via llama-stack-client
pillow==11.1.0
# via llama-stack
prompt-toolkit==3.0.50
# via
# llama-stack
# llama-stack-client
propcache==0.3.0
# via
# aiohttp
# yarl
protobuf==5.29.5
# via
# googleapis-common-protos
# opentelemetry-proto
pyaml==25.1.0
# via llama-stack-client
pyasn1==0.4.8
# via
# python-jose
# rsa
pycparser==2.22 ; platform_python_implementation != 'PyPy'
# via cffi
pydantic==2.10.6
# via
# fastapi
# llama-api-client
# llama-stack
# llama-stack-client
# openai
pydantic-core==2.27.2
# via pydantic
pygments==2.19.1
# via rich
python-dateutil==2.9.0.post0
# via pandas
python-dotenv==1.0.1
# via llama-stack
python-jose==3.4.0
# via llama-stack
python-multipart==0.0.20
# via llama-stack
pytz==2025.1
# via pandas
pyyaml==6.0.2
# via
# huggingface-hub
# pyaml
referencing==0.36.2
# via
# jsonschema
# jsonschema-specifications
regex==2024.11.6
# via tiktoken
requests==2.32.4
# via
# huggingface-hub
# llama-stack-client
# opentelemetry-exporter-otlp-proto-http
# tiktoken
rich==13.9.4
# via
# llama-stack
# llama-stack-client
rpds-py==0.22.3
# via
# jsonschema
# referencing
rsa==4.9
# via python-jose
six==1.17.0
# via
# ecdsa
# python-dateutil
sniffio==1.3.1
# via
# anyio
# llama-api-client
# llama-stack-client
# openai
starlette==0.45.3
# via
# fastapi
# llama-stack
termcolor==2.5.0
# via
# fire
# llama-stack
# llama-stack-client
tiktoken==0.9.0
# via llama-stack
tqdm==4.67.1
# via
# huggingface-hub
# llama-stack-client
# openai
typing-extensions==4.12.2
# via
# aiosqlite
# anyio
# fastapi
# huggingface-hub
# llama-api-client
# llama-stack-client
# openai
# opentelemetry-sdk
# pydantic
# pydantic-core
# referencing
tzdata==2025.1
# via pandas
urllib3==2.5.0
# via requests
uvicorn==0.34.0
# via llama-stack
wcwidth==0.2.13
# via prompt-toolkit
wrapt==1.17.2
# via deprecated
yarl==1.18.3
# via aiohttp
zipp==3.21.0
# via importlib-metadata