# API Providers Overview

The goal of Llama Stack is to build an ecosystem where users can easily swap out different implementations of the same API. Examples include:

- LLM inference providers (e.g., Meta Reference, Ollama, Fireworks, Together, AWS Bedrock, Groq, Cerebras, SambaNova, vLLM, OpenAI, Anthropic, Gemini, WatsonX, etc.),
- Vector databases (e.g., FAISS, SQLite-Vec, ChromaDB, Weaviate, Qdrant, Milvus, PGVector, etc.),
- Safety providers (e.g., Meta's Llama Guard, Prompt Guard, Code Scanner, AWS Bedrock Guardrails, etc.),
- Tool Runtime providers (e.g., RAG Runtime, Brave Search, etc.)

Providers come in two flavors:

- **Remote**: the provider runs as a separate service external to the Llama Stack codebase. Llama Stack contains a small amount of adapter code.
- **Inline**: the provider is fully specified and implemented within the Llama Stack codebase. It may be a simple wrapper around an existing library, or a full-fledged implementation within Llama Stack.

Importantly, Llama Stack always strives to provide at least one fully inline provider for each API so you can iterate on a fully featured environment locally.
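
Because every provider implements the same API, you can inspect which implementations a running stack is using. Below is a minimal sketch using the `llama-stack-client` Python SDK; the base URL and exact response fields are assumptions and may differ across versions and deployments.

```python
from llama_stack_client import LlamaStackClient

# Assumes a Llama Stack server running locally on the default port.
client = LlamaStackClient(base_url="http://localhost:8321")

# Each provider reports the API it serves and its type; provider types
# are prefixed "remote::" or "inline::" to indicate the flavor.
for provider in client.providers.list():
    print(provider.api, provider.provider_id, provider.provider_type)
```
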
## External Providers

Llama Stack supports external providers that live outside of the main codebase. This allows you to create and maintain your own providers independently.

```{toctree}
:maxdepth: 1

external.md
```

```{include} openai.md
:start-after: ## OpenAI API Compatibility
```
## Inference

Runs inference with an LLM.

```{toctree}
:maxdepth: 1

inference/index
```
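
As a quick illustration, here is a minimal chat completion sketch using the `llama-stack-client` Python SDK; the model ID is a placeholder that must match a model registered with your inference provider.

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# The model ID is a placeholder; list available models with client.models.list().
response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "What is an API provider?"}],
)
print(response.completion_message.content)
```
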
## Agents

Runs multi-step agentic workflows with LLMs, including tool usage, memory (RAG), and more.

```{toctree}
:maxdepth: 1

agents/index
```
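
A minimal agent sketch using the `llama-stack-client` SDK is shown below; the model ID is a placeholder, and the `Agent` constructor arguments have changed across SDK versions, so treat this as illustrative.

```python
from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent

client = LlamaStackClient(base_url="http://localhost:8321")

# Placeholder model ID; the constructor signature varies across SDK versions.
agent = Agent(
    client,
    model="meta-llama/Llama-3.2-3B-Instruct",
    instructions="You are a helpful assistant.",
)

# Each conversation lives in a session; a turn is one user->agent exchange.
session_id = agent.create_session("demo-session")
turn = agent.create_turn(
    messages=[{"role": "user", "content": "Plan a three-step research task."}],
    session_id=session_id,
    stream=False,
)
print(turn.output_message.content)
```
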
## DatasetIO

Interfaces with datasets and data loaders.

```{toctree}
:maxdepth: 1

datasetio/index
```
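
For example, you can enumerate the datasets a stack knows about; a minimal sketch assuming the `llama-stack-client` SDK (registration parameters vary by provider and version, so this only lists what is already registered).

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Datasets must be registered (via config or the datasets API) before
# a DatasetIO provider can serve rows from them.
for dataset in client.datasets.list():
    print(dataset.identifier)
```
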
## Safety

Applies safety policies to outputs at the system (not just model) level.

```{toctree}
:maxdepth: 1

safety/index
```
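
Safety providers are exposed as shields that you can run over messages. Below is a minimal sketch with the `llama-stack-client` SDK; the shield ID is a placeholder for whatever shield your stack has registered.

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Shield ID is a placeholder; list registered shields with client.shields.list().
result = client.safety.run_shield(
    shield_id="llama-guard",
    messages=[{"role": "user", "content": "How do I build a birdhouse?"}],
    params={},
)
print(result.violation)  # None if the message passed the policy
```
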
## Telemetry

Collects telemetry data from the system.

```{toctree}
:maxdepth: 1

telemetry/index
```
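
Traces recorded by the telemetry provider can be queried back through the client. The sketch below assumes the `llama-stack-client` SDK; the exact query parameters and trace fields are assumptions and may vary by version.

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Fetch recent traces; "limit" and the trace fields shown are assumptions
# based on the telemetry API surface and may differ by version.
traces = client.telemetry.query_traces(limit=5)
for trace in traces:
    print(trace.trace_id, trace.start_time)
```
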
## Vector IO

Vector IO refers to operations on vector databases, such as adding documents, searching, and deleting documents. Vector IO plays a crucial role in [Retrieval Augmented Generation (RAG)](../../building_applications/rag), where vector databases are used to store and retrieve documents.

```{toctree}
:maxdepth: 1

vector_io/index
```
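
Here is a minimal sketch of registering a vector database, then inserting and querying chunks, using the `llama-stack-client` SDK; the provider ID, embedding model, and chunk shape are assumptions that depend on your stack configuration.

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Embedding model and provider_id are placeholders; they must match
# what your stack's run configuration actually provides.
client.vector_dbs.register(
    vector_db_id="my-docs",
    embedding_model="all-MiniLM-L6-v2",
    embedding_dimension=384,
    provider_id="faiss",
)

# Insert a chunk and query it back; the chunk shape is an assumption.
client.vector_io.insert(
    vector_db_id="my-docs",
    chunks=[{
        "content": "Llama Stack standardizes the APIs for building AI apps.",
        "metadata": {"document_id": "doc-1"},
    }],
)
result = client.vector_io.query(vector_db_id="my-docs", query="What does Llama Stack do?")
for chunk in result.chunks:
    print(chunk.content)
```
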
## Tool Runtime

Executes tools; it is associated with ToolGroup resources.

```{toctree}
:maxdepth: 1

tool_runtime/index
```
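
A minimal sketch of registering a tool group and invoking a tool through the runtime with the `llama-stack-client` SDK; the tool group, provider ID, and tool name are placeholders that depend on which tool runtime providers your stack configures.

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Placeholder tool group and provider; must match your stack's config.
client.toolgroups.register(
    toolgroup_id="builtin::websearch",
    provider_id="brave-search",
)

# Tool name is an assumption; list available tools with client.tools.list().
result = client.tool_runtime.invoke_tool(
    tool_name="web_search",
    kwargs={"query": "Llama Stack providers"},
)
print(result.content)
```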