mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-06-30 03:44:20 +00:00
This gets the file_search verification test working against ollama, fireworks, and api.openai.com. We don't have the entirety of the vector store API implemented in Llama Stack yet, so this still has a bit of a hack to swap between using only OpenAI-compatible APIs versus using the LlamaStackClient to insert content into our vector stores. Outside of actually inserting file contents, the rest of the test works the same and uses only the OpenAI client for all of these providers. How to run the tests: Ollama (sometimes flakes with small model): ``` ollama run llama3.2:3b INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \ llama stack run ./llama_stack/templates/ollama/run.yaml \ --image-type venv \ --env OLLAMA_URL="http://0.0.0.0:11434" pytest -sv \ 'tests/verifications/openai_api/test_responses.py::test_response_non_streaming_file_search' \ --base-url=http://localhost:8321/v1/openai/v1 \ --model meta-llama/Llama-3.2-3B-Instruct ``` Fireworks via Llama Stack: ``` llama stack run llama_stack/templates/fireworks/run.yaml pytest -sv \ 'tests/verifications/openai_api/test_responses.py::test_response_non_streaming_file_search' \ --base-url=http://localhost:8321/v1/openai/v1 \ --model meta-llama/Llama-3.3-70B-Instruct ``` OpenAI directly: ``` pytest -sv \ 'tests/verifications/openai_api/test_responses.py::test_response_non_streaming_file_search' \ --base-url=https://api.openai.com/v1 \ --model gpt-4o ``` Signed-off-by: Ben Browning <bbrownin@redhat.com> |
||
---|---|---|
.. | ||
_static | ||
notebooks | ||
openapi_generator | ||
resources | ||
source | ||
zero_to_hero_guide | ||
conftest.py | ||
contbuild.sh | ||
dog.jpg | ||
getting_started.ipynb | ||
getting_started_llama4.ipynb | ||
getting_started_llama_api.ipynb | ||
license_header.txt | ||
make.bat | ||
Makefile | ||
readme.md |
Llama Stack Documentation
Here's a collection of comprehensive guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our ReadTheDocs page.
Render locally
From the llama-stack root directory, run the following command to render the docs locally:
uv run --group docs sphinx-autobuild docs/source docs/build/html --write-all
You can open up the docs in your browser at http://localhost:8000
Content
Try out Llama Stack's capabilities through our detailed Jupyter notebooks:
- Building AI Applications Notebook - A comprehensive guide to building production-ready AI applications using Llama Stack
- Benchmark Evaluations Notebook - Detailed performance evaluations and benchmarking results
- Zero-to-Hero Guide - Step-by-step guide for getting started with Llama Stack