Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-06-27 18:50:41 +00:00
# What does this PR do?

* Provide a SQLite implementation of the APIs introduced in https://github.com/meta-llama/llama-stack/pull/2145.
* Introduce a `SqlStore` API (`llama_stack/providers/utils/sqlstore/api.py`) and its first SQLite implementation.
* Pagination support will be added in a future PR.

## Test Plan

Unit test on the SQL store:

<img width="1005" alt="image" src="https://github.com/user-attachments/assets/9b8b7ec8-632b-4667-8127-5583426b2e29" />

Integration test:

```
INFERENCE_MODEL="llama3.2:3b-instruct-fp16" llama stack build --template ollama --image-type conda --run
```

```
LLAMA_STACK_CONFIG=http://localhost:5001 INFERENCE_MODEL="llama3.2:3b-instruct-fp16" python -m pytest -v tests/integration/inference/test_openai_completion.py --text-model "llama3.2:3b-instruct-fp16" -k 'inference_store and openai'
```