llama-stack

forked from phoenix-oss/llama-stack-mirror

Author	SHA1	Message	Date
Dinesh Yeduguru	57a9b4d57f	Allow models to be registered as long as llama model is provided (#472) This PR allows models to be registered with a provider as long as the user specifies a llama model, even if the model does not match our prebuilt provider-specific mapping. Test: pytest -v -s llama_stack/providers/tests/inference/test_model_registration.py -m "together" --env TOGETHER_API_KEY=<KEY> --------- Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-18 15:05:29 -08:00
Dinesh Yeduguru	0850ad656a	unregister for memory banks and remove update API (#458) The semantics of an update on resources are very tricky to reason about, especially for memory banks and models. The best way forward is for the user to unregister the resource and register a new one. We don't have a compelling reason to support update APIs. Tests: pytest -v -s llama_stack/providers/tests/memory/test_memory.py -m "chroma" --env CHROMA_HOST=localhost --env CHROMA_PORT=8000 pytest -v -s llama_stack/providers/tests/memory/test_memory.py -m "pgvector" --env PGVECTOR_DB=postgres --env PGVECTOR_USER=postgres --env PGVECTOR_PASSWORD=mysecretpassword --env PGVECTOR_HOST=0.0.0.0 $CONDA_PREFIX/bin/pytest -v -s -m "ollama" llama_stack/providers/tests/inference/test_model_registration.py --------- Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-14 17:12:11 -08:00
Dinesh Yeduguru	efe791bab7	Support model resource updates and deletes (#452)

# What does this PR do?

* Changes the registry to store only one RoutableObject per identifier. Before it was a list, which is not really required.
* Adds impl for updates and deletes
* Updates routing table to handle updates correctly

## Test Plan

```
❯ llama-stack-client models list
+------------------------+---------------+------------------------------------+------------+
| identifier             | provider_id   | provider_resource_id               | metadata   |
+========================+===============+====================================+============+
| Llama3.1-405B-Instruct | fireworks-0   | fireworks/llama-v3p1-405b-instruct | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.1-8B-Instruct   | fireworks-0   | fireworks/llama-v3p1-8b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.2-3B-Instruct   | fireworks-0   | fireworks/llama-v3p2-1b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+

❯ llama-stack-client models register dineshyv-model --provider-model-id=fireworks/llama-v3p1-70b-instruct
Successfully registered model dineshyv-model

❯ llama-stack-client models list
+------------------------+---------------+------------------------------------+------------+
| identifier             | provider_id   | provider_resource_id               | metadata   |
+========================+===============+====================================+============+
| Llama3.1-405B-Instruct | fireworks-0   | fireworks/llama-v3p1-405b-instruct | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.1-8B-Instruct   | fireworks-0   | fireworks/llama-v3p1-8b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.2-3B-Instruct   | fireworks-0   | fireworks/llama-v3p2-1b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
| dineshyv-model         | fireworks-0   | fireworks/llama-v3p1-70b-instruct  | {}         |
+------------------------+---------------+------------------------------------+------------+

❯ llama-stack-client models update dineshyv-model --provider-model-id=fireworks/llama-v3p1-405b-instruct
Successfully updated model dineshyv-model

❯ llama-stack-client models list
+------------------------+---------------+------------------------------------+------------+
| identifier             | provider_id   | provider_resource_id               | metadata   |
+========================+===============+====================================+============+
| Llama3.1-405B-Instruct | fireworks-0   | fireworks/llama-v3p1-405b-instruct | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.1-8B-Instruct   | fireworks-0   | fireworks/llama-v3p1-8b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.2-3B-Instruct   | fireworks-0   | fireworks/llama-v3p2-1b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
| dineshyv-model         | fireworks-0   | fireworks/llama-v3p1-405b-instruct | {}         |
+------------------------+---------------+------------------------------------+------------+

llama-stack-client models delete dineshyv-model

❯ llama-stack-client models list
+------------------------+---------------+------------------------------------+------------+
| identifier             | provider_id   | provider_resource_id               | metadata   |
+========================+===============+====================================+============+
| Llama3.1-405B-Instruct | fireworks-0   | fireworks/llama-v3p1-405b-instruct | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.1-8B-Instruct   | fireworks-0   | fireworks/llama-v3p1-8b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.2-3B-Instruct   | fireworks-0   | fireworks/llama-v3p2-1b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
```

---------

Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-13 21:55:41 -08:00
Dinesh Yeduguru	787e2034b7	model registration in ollama and vllm check against the available models in the provider (#446) tests: pytest -v -s -m "ollama" llama_stack/providers/tests/inference/test_text_inference.py pytest -v -s -m vllm_remote llama_stack/providers/tests/inference/test_text_inference.py --env VLLM_URL="http://localhost:9798/v1" ---------	2024-11-13 13:04:06 -08:00

4 commits
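
The registration behaviour these commits describe can be illustrated with a small, self-contained sketch. It is not llama-stack's code: `Model`, `ModelRegistry`, and all of their methods and fields are hypothetical names chosen for this example, and combining the provider availability check (#446) with the "llama model provided" escape hatch (#472) into a single rule is an assumption made here. The sketch keeps one object per identifier (#452) and replaces a model by unregistering and registering again rather than updating in place (#458).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for a registered model resource."""
    identifier: str
    provider_id: str
    provider_resource_id: str
    llama_model: Optional[str] = None


class ModelRegistry:
    """Illustrative registry: exactly one object per identifier, no update API."""

    def __init__(self, available: Dict[str, Set[str]]) -> None:
        # Assumed input: provider_id -> model ids that provider reports as available.
        self._available = available
        self._models: Dict[str, Model] = {}

    def register(self, model: Model) -> Model:
        if model.identifier in self._models:
            raise ValueError(f"{model.identifier} is already registered")
        if model.provider_id not in self._available:
            raise ValueError(f"unknown provider {model.provider_id}")
        known = model.provider_resource_id in self._available[model.provider_id]
        # Assumption: an unknown provider model is accepted only if a llama model is named.
        if not known and model.llama_model is None:
            raise ValueError(f"{model.provider_resource_id} is not available")
        self._models[model.identifier] = model
        return model

    def unregister(self, identifier: str) -> Model:
        if identifier not in self._models:
            raise ValueError(f"{identifier} is not registered")
        return self._models.pop(identifier)

    def get(self, identifier: str) -> Model:
        return self._models[identifier]

    def list(self) -> List[Model]:
        return sorted(self._models.values(), key=lambda m: m.identifier)


# Usage: replace a model by unregistering it and registering a new one.
registry = ModelRegistry({
    "fireworks-0": {
        "fireworks/llama-v3p1-70b-instruct",
        "fireworks/llama-v3p1-405b-instruct",
    }
})
registry.register(
    Model("dineshyv-model", "fireworks-0", "fireworks/llama-v3p1-70b-instruct")
)
registry.unregister("dineshyv-model")
registry.register(
    Model("dineshyv-model", "fireworks-0", "fireworks/llama-v3p1-405b-instruct")
)
```

Rejecting a second `register` call for an existing identifier, instead of silently overwriting, is one way to make the unregister-then-register flow explicit; the actual project may handle duplicates differently.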