llama-stack

forked from phoenix-oss/llama-stack-mirror

Ashwin Bharambe d9d271a684 Allow specifying resources in StackRunConfig (#425)	2024-11-12 10:58:49 -08:00

# What does this PR do?

This PR brings back the ability to not force registration of resources onto the user. Forced registration is not just annoying; it is sometimes not feasible. For example, you may have a Stack which boots up with private providers for inference for models A and B. There is currently no way for the user to know which model is being served by these providers, and so no way to register it.

How will this avoid the users needing to do registration? In a follow-up diff, I will update the sample run.yaml files so they explicitly list the models served by the distributions. So when users do `llama stack build --template <...>` and run it, their distributions come up with the set of models they expect.

For self-hosted distributions, it also gives us a place to explicitly list the models that need to be served to make up the "complete" stack (including safety, for example).

## Test Plan

Started ollama locally with two lightweight models: Llama3.2-3B-Instruct and Llama-Guard-3-1B.

Updated all the tests, including agents. Here are the tests I ran so far:

```bash
pytest -s -v -m "fireworks and llama_3b" test_text_inference.py::TestInference \
  --env FIREWORKS_API_KEY=...
pytest -s -v -m "ollama and llama_3b" test_text_inference.py::TestInference
pytest -s -v -m ollama test_safety.py
pytest -s -v -m faiss test_memory.py
pytest -s -v -m ollama test_agents.py \
  --inference-model=Llama3.2-3B-Instruct --safety-model=Llama-Guard-3-1B
```

Found and fixed a few pre-existing bugs here and there while running these tests.
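The idea described in the commit message, declaring resources in the run configuration so they are registered when the stack boots, can be sketched roughly as follows. This is a minimal illustration only, not the project's implementation: `ModelInput`, `RunConfig`, `ModelRegistry`, and `boot` are hypothetical names, and the `ollama` provider id is an assumption based on the test plan.

```python
from dataclasses import dataclass, field


@dataclass
class ModelInput:
    # Hypothetical resource entry: a model and the provider that serves it.
    model_id: str
    provider_id: str


@dataclass
class RunConfig:
    # Hypothetical stand-in for StackRunConfig; only the field relevant here.
    models: list[ModelInput] = field(default_factory=list)


class ModelRegistry:
    """Toy in-memory registry (hypothetical) mapping model ids to providers."""

    def __init__(self) -> None:
        self._models: dict[str, ModelInput] = {}

    def register(self, model: ModelInput) -> None:
        self._models[model.model_id] = model

    def provider_for(self, model_id: str) -> str:
        return self._models[model_id].provider_id


def boot(config: RunConfig) -> ModelRegistry:
    # Register every resource declared in the config at startup,
    # so users do not have to register models themselves.
    registry = ModelRegistry()
    for model in config.models:
        registry.register(model)
    return registry


config = RunConfig(
    models=[
        ModelInput("Llama3.2-3B-Instruct", "ollama"),
        ModelInput("Llama-Guard-3-1B", "ollama"),
    ]
)
registry = boot(config)
print(registry.provider_for("Llama3.2-3B-Instruct"))
```

With this shape, the set of served models lives in one declarative place, which is what would let a sample run.yaml list them explicitly.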
agents	migrate memory banks to Resource and new registration (#411)	2024-11-11 17:10:44 -08:00
batch_inference	Remove "routing_table" and "routing_key" concepts for the user (#201)	2024-10-10 10:24:13 -07:00
common	[Evals API][4/n] evals with generation meta-reference impl (#303)	2024-10-25 13:12:39 -07:00
datasetio	migrate dataset to resource (#420)	2024-11-11 17:14:41 -08:00
datasets	migrate dataset to resource (#420)	2024-11-11 17:14:41 -08:00
eval	[Evals API][11/n] huggingface dataset provider + mmlu scoring fn (#392)	2024-11-11 14:49:50 -05:00
eval_tasks	migrate evals to resource (#421)	2024-11-11 17:24:03 -08:00
inference	migrate model to Resource and new registration signature (#410)	2024-11-08 16:12:57 -08:00
inspect	Remove "routing_table" and "routing_key" concepts for the user (#201)	2024-10-10 10:24:13 -07:00
memory	migrate memory banks to Resource and new registration (#411)	2024-11-11 17:10:44 -08:00
memory_banks	migrate memory banks to Resource and new registration (#411)	2024-11-11 17:10:44 -08:00
models	migrate model to Resource and new registration signature (#410)	2024-11-08 16:12:57 -08:00
post_training	[Evals API][4/n] evals with generation meta-reference impl (#303)	2024-10-25 13:12:39 -07:00
safety	Resource oriented design for shields (#399)	2024-11-08 12:16:11 -08:00
scoring	migrate scoring fns to resource (#422)	2024-11-11 17:28:48 -08:00
scoring_functions	migrate scoring fns to resource (#422)	2024-11-11 17:28:48 -08:00
shields	Resource oriented design for shields (#399)	2024-11-08 12:16:11 -08:00
synthetic_data_generation	[Evals API][4/n] evals with generation meta-reference impl (#303)	2024-10-25 13:12:39 -07:00
telemetry	Remove "routing_table" and "routing_key" concepts for the user (#201)	2024-10-10 10:24:13 -07:00
__init__.py	API Updates (#73)	2024-09-17 19:51:35 -07:00
resource.py	Allow specifying resources in StackRunConfig (#425)	2024-11-12 10:58:49 -08:00