llama-stack-mirror/docs
IAN MILLER 007efa6eb5
refactor: replace default all-MiniLM-L6-v2 embedding model by nomic-embed-text-v1.5 in Llama Stack (#3183)
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The purpose of this PR is to replace the Llama Stack's default embedding
model by nomic-embed-text-v1.5.

These are the key reasons why Llama Stack community decided to switch
from all-MiniLM-L6-v2 to nomic-embed-text-v1.5:
1. The training data for
[all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2#training-data)
includes a lot of data sets with various licensing terms, so it is
tricky to know when/whether it is appropriate to use this model for
commercial applications.
2. The model is not particularly competitive on major benchmarks. For
example, if you look at the [MTEB
Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) and click
on Miscellaneous/BEIR to see English information retrieval accuracy, you
see that the top of the leaderboard is dominated by enormous models but
also that there are many, many models of relatively modest size whith
much higher Retrieval scores. If you want to look closely at the data, I
recommend clicking "Download Table" because it is easier to browse that
way.

More discussion info can be founded
[here](https://github.com/llamastack/llama-stack/issues/2418)

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #2418 

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
1. Run `./scripts/unit-tests.sh`
2. Integration tests via CI wokrflow

---------

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
2025-10-14 10:44:20 -04:00
..
docs refactor: replace default all-MiniLM-L6-v2 embedding model by nomic-embed-text-v1.5 in Llama Stack (#3183) 2025-10-14 10:44:20 -04:00
notebooks refactor: replace default all-MiniLM-L6-v2 embedding model by nomic-embed-text-v1.5 in Llama Stack (#3183) 2025-10-14 10:44:20 -04:00
openapi_generator chore: refactor (chat)completions endpoints to use shared params struct (#3761) 2025-10-10 15:46:34 -07:00
src docs: api separation (#3630) 2025-10-01 10:13:31 -07:00
static feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) 2025-10-12 19:01:52 -07:00
supplementary docs: adding supplementary markdown content to API specs (#3632) 2025-10-01 10:15:30 -07:00
zero_to_hero_guide refactor: replace default all-MiniLM-L6-v2 embedding model by nomic-embed-text-v1.5 in Llama Stack (#3183) 2025-10-14 10:44:20 -04:00
docusaurus.config.ts docs: add favicon and mobile styling (#3650) 2025-10-02 10:42:54 +02:00
dog.jpg Support for Llama3.2 models and Swift SDK (#98) 2024-09-25 10:29:58 -07:00
getting_started.ipynb refactor: replace default all-MiniLM-L6-v2 embedding model by nomic-embed-text-v1.5 in Llama Stack (#3183) 2025-10-14 10:44:20 -04:00
getting_started_llama4.ipynb chore!: remove model mgmt from CLI for Hugging Face CLI (#3700) 2025-10-09 16:50:33 -07:00
getting_started_llama_api.ipynb chore!: remove --env from llama stack run (#3711) 2025-10-07 20:58:15 -07:00
license_header.txt Initial commit 2024-07-23 08:32:33 -07:00
original_rfc.md chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
package-lock.json docs: docusaurus setup (#3541) 2025-09-24 14:11:30 -07:00
package.json docs: docusaurus setup (#3541) 2025-09-24 14:11:30 -07:00
quick_start.ipynb chore!: remove --env from llama stack run (#3711) 2025-10-07 20:58:15 -07:00
README.md docs: docusaurus setup (#3541) 2025-09-24 14:11:30 -07:00
sidebars.ts docs: Update docs navbar config (#3653) 2025-10-02 16:48:38 +02:00
tsconfig.json docs: docusaurus setup (#3541) 2025-09-24 14:11:30 -07:00

Llama Stack Documentation

Here's a collection of comprehensive guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our Github page.

Render locally

From the llama-stack docs/ directory, run the following commands to render the docs locally:

npm install
npm run gen-api-docs all
npm run build
npm run serve

You can open up the docs in your browser at http://localhost:3000

Content

Try out Llama Stack's capabilities through our detailed Jupyter notebooks: