Providers Overview
The goal of Llama Stack is to build an ecosystem where users can easily swap out different implementations for the same API. Examples of these include:
- LLM inference providers (e.g., Ollama, Fireworks, Together, AWS Bedrock, Groq, Cerebras, SambaNova, vLLM, etc.),
- Vector databases (e.g., ChromaDB, Weaviate, Qdrant, Milvus, FAISS, PGVector, SQLite-Vec, etc.),
- Safety providers (e.g., Meta's Llama Guard, AWS Bedrock Guardrails, etc.)
Providers come in two flavors:
- Remote: the provider runs as a separate service external to the Llama Stack codebase. Llama Stack contains a small amount of adapter code.
- Inline: the provider is fully specified and implemented within the Llama Stack codebase. It may be a simple wrapper around an existing library, or a full-fledged implementation within Llama Stack.
Importantly, Llama Stack always strives to provide at least one fully inline provider for each API so you can iterate on a fully featured environment locally.
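The distinction between the two flavors can be illustrated with a minimal sketch. Everything below is hypothetical and for illustration only: `InferenceProvider`, `InlineEchoProvider`, `RemoteAdapterProvider`, and `fake_service` are made-up names, not the actual Llama Stack interfaces. The sketch only shows the idea of one shared API with an inline implementation and a thin remote adapter behind it.

```python
from typing import Callable, Protocol


class InferenceProvider(Protocol):
    """Hypothetical shared API; not the real Llama Stack interface."""

    def complete(self, prompt: str) -> str: ...


class InlineEchoProvider:
    """Inline flavor: the implementation lives entirely in this codebase."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class RemoteAdapterProvider:
    """Remote flavor: a thin adapter that forwards to an external service."""

    def __init__(self, send: Callable[[dict], dict]) -> None:
        # `send` stands in for a network call to the separately running service.
        self._send = send

    def complete(self, prompt: str) -> str:
        return self._send({"prompt": prompt})["text"]


def run(provider: InferenceProvider, prompt: str) -> str:
    # The caller depends only on the shared API, so providers are swappable.
    return provider.complete(prompt)


def fake_service(payload: dict) -> dict:
    # Assumed stand-in for an external service; a real one would be reached over the network.
    return {"text": payload["prompt"].upper()}


inline_result = run(InlineEchoProvider(), "hi")
remote_result = run(RemoteAdapterProvider(fake_service), "hi")
```

Because `run` is written against the shared interface only, switching from the inline provider to the remote adapter requires no change in the calling code.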
External Providers
Llama Stack supports external providers that live outside of the main codebase. This allows you to create and maintain your own providers independently.
:maxdepth: 1
external
Agents
Runs multi-step agentic workflows with LLMs, including tool usage, memory (RAG), etc.
:maxdepth: 1
agents/index
DatasetIO
Interfaces with datasets and data loaders.
:maxdepth: 1
datasetio/index
Eval
Generates outputs (via Inference or Agents) and performs scoring.
:maxdepth: 1
eval/index
Inference
Runs inference with an LLM.
:maxdepth: 1
inference/index
Post Training
Fine-tunes a model.
:maxdepth: 1
post_training/index
Safety
Applies safety policies to the output at a system (not only model) level.
:maxdepth: 1
safety/index
Scoring
Evaluates the outputs of the system.
:maxdepth: 1
scoring/index
Telemetry
Collects telemetry data from the system.
:maxdepth: 1
telemetry/index
Tool Runtime
Is associated with the ToolGroup resources.
:maxdepth: 1
tool_runtime/index
Vector IO
Vector IO refers to operations on vector databases, such as adding, searching, and deleting documents. Vector IO plays a crucial role in Retrieval Augmented Generation (RAG), where Vector IO and the underlying database are used to store documents and retrieve the ones relevant to a query.
:maxdepth: 1
vector_io/index
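As a rough illustration of the three Vector IO operations named above (adding, searching, and deleting documents), here is a toy in-memory store. It is a sketch under stated assumptions: `ToyVectorStore` and its method names are hypothetical, embeddings are supplied by hand rather than by a model, and ranking uses plain cosine similarity. It is not the Llama Stack Vector IO API.

```python
import math


def _cosine(a, b):
    # Cosine similarity between two equal-length vectors; 0.0 if either is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store; real providers wrap an actual vector database."""

    def __init__(self):
        self._docs = {}  # doc_id -> (embedding, text)

    def add(self, doc_id, embedding, text):
        self._docs[doc_id] = (list(embedding), text)

    def delete(self, doc_id):
        # Deleting an unknown id is a no-op.
        self._docs.pop(doc_id, None)

    def search(self, query, k=1):
        # Rank all stored documents by similarity to the query, highest first.
        scored = [
            (_cosine(query, emb), doc_id, text)
            for doc_id, (emb, text) in self._docs.items()
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(doc_id, text) for _, doc_id, text in scored[:k]]


store = ToyVectorStore()
store.add("a", [1.0, 0.0], "about cats")
store.add("b", [0.0, 1.0], "about dogs")
top = store.search([0.9, 0.1], k=1)
store.delete("a")
after_delete = store.search([0.9, 0.1], k=1)
```

In a RAG flow, the text returned by `search` would be passed to the LLM as context alongside the user's question.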