mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-06-28 02:53:30 +00:00
# What does this PR do?

The providers list is missing post_training. Add that column and `HuggingFace`, `TorchTune`, and `NVIDIA NEMO` as supported providers.

Also point to these providers in docs/source/providers/index.md, and describe basic functionality.

There are other missing provider types here as well, but starting with this.

Signed-off-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# Providers Overview

The goal of Llama Stack is to build an ecosystem where users can easily swap out different implementations for the same API. Examples include:
- LLM inference providers (e.g., Ollama, Fireworks, Together, AWS Bedrock, Groq, Cerebras, SambaNova, vLLM, etc.)
- Vector databases (e.g., ChromaDB, Weaviate, Qdrant, Milvus, FAISS, PGVector, SQLite-Vec, etc.)
- Safety providers (e.g., Meta's Llama Guard, AWS Bedrock Guardrails, etc.)

Providers come in two flavors:
- **Remote**: the provider runs as a separate service external to the Llama Stack codebase. Llama Stack contains a small amount of adapter code.
- **Inline**: the provider is fully specified and implemented within the Llama Stack codebase. It may be a simple wrapper around an existing library, or a full-fledged implementation within Llama Stack.

Importantly, Llama Stack always strives to provide at least one fully inline provider for each API so you can iterate on a fully featured environment locally.
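
The idea of swapping implementations behind one API can be illustrated with a minimal Python sketch. Every name here (`InferenceProvider`, `InlineEchoProvider`, `RemoteAdapterProvider`, `answer`) is hypothetical and chosen for illustration only; none of it is the actual Llama Stack interface. The remote flavor takes a plain callable as a stand-in for a network client.

```python
from typing import Callable, Protocol


class InferenceProvider(Protocol):
    """Hypothetical API contract shared by every provider in this sketch."""

    def complete(self, prompt: str) -> str: ...


class InlineEchoProvider:
    """Inline flavor: the whole implementation runs in-process."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class RemoteAdapterProvider:
    """Remote flavor: a thin adapter that forwards to an external service."""

    def __init__(self, send: Callable[[dict], dict]) -> None:
        # `send` stands in for an HTTP client call (an assumption of this sketch).
        self._send = send

    def complete(self, prompt: str) -> str:
        response = self._send({"prompt": prompt})
        return response["text"]


def answer(provider: InferenceProvider, prompt: str) -> str:
    """Caller code depends only on the shared contract, not on the flavor."""
    return provider.complete(prompt)


def fake_service(payload: dict) -> dict:
    """Pretend remote service used to exercise the adapter locally."""
    return {"text": payload["prompt"].upper()}


inline_result = answer(InlineEchoProvider(), "hello")
remote_result = answer(RemoteAdapterProvider(fake_service), "hello")
```

Because `answer` only relies on the shared `complete` method, either provider can be passed in without changing the calling code.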

## External Providers

Llama Stack supports external providers that live outside of the main codebase. This allows you to create and maintain your own providers independently. See the [External Providers Guide](external) for details.

## Agents
Runs multi-step agentic workflows with LLMs, including tool usage, memory (RAG), etc.

## DatasetIO
Interfaces with datasets and data loaders.

## Eval
Generates outputs (via Inference or Agents) and performs scoring.

## Inference
Runs inference with an LLM.

## Post Training
Fine-tunes a model.

#### Post Training Providers
The following providers are available for Post Training:

```{toctree}
:maxdepth: 1

external
post_training/huggingface
post_training/torchtune
post_training/nvidia_nemo
```

## Safety
Applies safety policies to the output at a system (not only model) level.

## Scoring
Evaluates the outputs of the system.

## Telemetry
Collects telemetry data from the system.

## Tool Runtime
Is associated with the ToolGroup resources.

## Vector IO

Vector IO refers to operations on vector databases, such as adding documents, searching, and deleting documents.
Vector IO plays a crucial role in [Retrieval Augmented Generation (RAG)](../..//building_applications/rag), where Vector IO
and the underlying database are used to store documents and retrieve them at query time.
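
The three operations named above (adding, searching, deleting) can be sketched with a toy in-memory store. This is an illustrative assumption, not the Llama Stack Vector IO API: `ToyVectorStore` and its methods are hypothetical names, embeddings are supplied by hand, and ranking uses plain cosine similarity rather than whatever index a real provider uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store; real providers persist and index the data."""

    def __init__(self):
        self._docs = {}  # doc_id -> (embedding, text)

    def add(self, doc_id, embedding, text):
        self._docs[doc_id] = (list(embedding), text)

    def delete(self, doc_id):
        # Deleting an unknown id is a no-op in this sketch.
        self._docs.pop(doc_id, None)

    def search(self, query, k=1):
        """Return up to k (doc_id, text) pairs, most similar first."""
        scored = [
            (cosine(query, embedding), doc_id, text)
            for doc_id, (embedding, text) in self._docs.items()
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(doc_id, text) for _, doc_id, text in scored[:k]]


store = ToyVectorStore()
store.add("a", [1.0, 0.0], "document about inference")
store.add("b", [0.0, 1.0], "document about safety")

first_hit = store.search([0.9, 0.1], k=1)
store.delete("a")
hit_after_delete = store.search([0.9, 0.1], k=1)
```

In a RAG flow, the texts returned by `search` would be passed to the model as context; the store itself does not generate anything.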

#### Vector IO Providers
The following providers (i.e., databases) are available for Vector IO:

```{toctree}
:maxdepth: 1

external
vector_io/faiss
vector_io/sqlite-vec
vector_io/chromadb
vector_io/pgvector
vector_io/qdrant
vector_io/milvus
vector_io/weaviate
```