docs: fix broken links (#3540)

# What does this PR do?


- Fixes broken links and Docusaurus search

Closes #3518

## Test Plan

The following should produce a clean build with no warnings and search enabled:

```bash
npm install
npm run gen-api-docs all
npm run build
npm run serve
```

Alexey Rybak 2025-09-24 14:16:31 -07:00 committed by GitHub
parent 8537ada11b
commit 6101c8e015
52 changed files with 188 additions and 981 deletions


@ -302,4 +302,4 @@ customizer_url: ${env.NVIDIA_CUSTOMIZER_URL:=http://nemo.test}
- Check out the [Building Applications - Fine-tuning](../building_applications/index.mdx) guide for application-level examples
- See the [Providers](../providers/post_training/index.mdx) section for detailed provider documentation
- Review the [API Reference](../api_reference/post_training.mdx) for complete API documentation
- Review the [API Reference](../advanced_apis/post_training.mdx) for complete API documentation


@ -189,5 +189,5 @@ The Scoring API works closely with the [Evaluation](./evaluation.mdx) API to pro
- Check out the [Evaluation](./evaluation.mdx) guide for running complete evaluations
- See the [Building Applications - Evaluation](../building_applications/evals.mdx) guide for application examples
- Review the [Evaluation Reference](../references/evals_reference.mdx) for comprehensive scoring function usage
- Explore the [Evaluation Concepts](../concepts/evaluation_concepts.mdx) for detailed conceptual information
- Review the [Evaluation Reference](../references/evals_reference/) for comprehensive scoring function usage
- Explore the [Evaluation Concepts](../concepts/evaluation_concepts) for detailed conceptual information


@ -8,7 +8,7 @@ sidebar_position: 7
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
This guide walks you through the process of evaluating an LLM application built using Llama Stack. For detailed API reference, check out the [Evaluation Reference](/docs/references/evals-reference) guide that covers the complete set of APIs and developer experience flow.
This guide walks you through the process of evaluating an LLM application built using Llama Stack. For detailed API reference, check out the [Evaluation Reference](../references/evals_reference/) guide that covers the complete set of APIs and developer experience flow.
:::tip[Interactive Examples]
Check out our [Colab notebook](https://colab.research.google.com/drive/10CHyykee9j2OigaIcRv47BKG9mrNm0tJ?usp=sharing) for working examples with evaluations, or try the [Getting Started notebook](https://colab.research.google.com/github/meta-llama/llama-stack/blob/main/docs/getting_started.ipynb).
@ -251,6 +251,6 @@ results = client.scoring.score(
- **[Agents](./agent)** - Building agents for evaluation
- **[Tools Integration](./tools)** - Using tools in evaluated agents
- **[Evaluation Reference](/docs/references/evals-reference)** - Complete API reference for evaluations
- **[Evaluation Reference](../references/evals_reference/)** - Complete API reference for evaluations
- **[Getting Started Notebook](https://colab.research.google.com/github/meta-llama/llama-stack/blob/main/docs/getting_started.ipynb)** - Interactive examples
- **[Evaluation Examples](https://colab.research.google.com/drive/10CHyykee9j2OigaIcRv47BKG9mrNm0tJ?usp=sharing)** - Additional evaluation scenarios


@ -20,23 +20,23 @@ The best way to get started is to look at this comprehensive notebook which walk
Here are the key topics that will help you build effective AI applications:
### 🤖 **Agent Development**
- **[Agent Framework](./agent)** - Understand the components and design patterns of the Llama Stack agent framework
- **[Agent Execution Loop](./agent_execution_loop)** - How agents process information, make decisions, and execute actions
- **[Agents vs Responses API](./responses_vs_agents)** - Learn when to use each API for different use cases
- **[Agent Framework](./agent.mdx)** - Understand the components and design patterns of the Llama Stack agent framework
- **[Agent Execution Loop](./agent_execution_loop.mdx)** - How agents process information, make decisions, and execute actions
- **[Agents vs Responses API](./responses_vs_agents.mdx)** - Learn when to use each API for different use cases
### 📚 **Knowledge Integration**
- **[RAG (Retrieval-Augmented Generation)](./rag)** - Enhance your agents with external knowledge through retrieval mechanisms
- **[RAG (Retrieval-Augmented Generation)](./rag.mdx)** - Enhance your agents with external knowledge through retrieval mechanisms
### 🛠️ **Capabilities & Extensions**
- **[Tools](./tools)** - Extend your agents' capabilities by integrating with external tools and APIs
- **[Tools](./tools.mdx)** - Extend your agents' capabilities by integrating with external tools and APIs
### 📊 **Quality & Monitoring**
- **[Evaluations](./evals)** - Evaluate your agents' effectiveness and identify areas for improvement
- **[Telemetry](./telemetry)** - Monitor and analyze your agents' performance and behavior
- **[Safety](./safety)** - Implement guardrails and safety measures to ensure responsible AI behavior
- **[Evaluations](./evals.mdx)** - Evaluate your agents' effectiveness and identify areas for improvement
- **[Telemetry](./telemetry.mdx)** - Monitor and analyze your agents' performance and behavior
- **[Safety](./safety.mdx)** - Implement guardrails and safety measures to ensure responsible AI behavior
### 🎮 **Interactive Development**
- **[Playground](./playground)** - Interactive environment for testing and developing applications
- **[Playground](./playground.mdx)** - Interactive environment for testing and developing applications
## Application Patterns
@ -77,7 +77,7 @@ Build production-ready systems with:
## Related Resources
- **[Getting Started](/docs/getting-started/)** - Basic setup and concepts
- **[Getting Started](/docs/getting_started/quickstart)** - Basic setup and concepts
- **[Providers](/docs/providers/)** - Available AI service providers
- **[Distributions](/docs/distributions/)** - Pre-configured deployment packages
- **[API Reference](/docs/api/)** - Complete API documentation
- **[API Reference](/docs/api/llama-stack-specification)** - Complete API documentation


@ -291,9 +291,9 @@ llama stack run meta-reference
## Related Resources
- **[Getting Started Guide](/docs/getting-started)** - Complete setup and introduction
- **[Getting Started Guide](../getting_started/quickstart)** - Complete setup and introduction
- **[Core Concepts](/docs/concepts)** - Understanding Llama Stack fundamentals
- **[Agents](./agent)** - Building intelligent agents
- **[RAG (Retrieval Augmented Generation)](./rag)** - Knowledge-enhanced applications
- **[Evaluations](./evals)** - Comprehensive evaluation framework
- **[API Reference](/docs/api-reference)** - Complete API documentation
- **[API Reference](/docs/api/llama-stack-specification)** - Complete API documentation


@ -13,7 +13,7 @@ import TabItem from '@theme/TabItem';
Llama Stack (LLS) provides two different APIs for building AI applications with tool calling capabilities: the **Agents API** and the **OpenAI Responses API**. While both enable AI systems to use tools and maintain full conversation history, they serve different use cases and have distinct characteristics.
:::note
**Note:** For simple and basic inferencing, you may want to use the [Chat Completions API](/docs/providers/openai-compatibility#chat-completions) directly, before progressing to Agents or Responses API.
**Note:** For simple and basic inferencing, you may want to use the [Chat Completions API](../providers/openai#chat-completions) directly, before progressing to Agents or Responses API.
:::
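For instance, a minimal sketch of such a direct call with the Python client (the server URL, model name, and exact parameter names are assumptions; they have shifted across SDK releases):

```python
from llama_stack_client import LlamaStackClient

# Assumption: a Llama Stack server is running locally on the default port
client = LlamaStackClient(base_url="http://localhost:8321")

# A one-shot chat completion: no agent loop, no tool calling
response = client.inference.chat_completion(
    model_id="Llama3.2-3B-Instruct",  # any model registered with your server
    messages=[{"role": "user", "content": "What is machine learning?"}],
)
print(response)
```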
## Overview
@ -217,5 +217,5 @@ Use this framework to choose the right API for your use case:
- **[Agents](./agent)** - Understanding the Agents API fundamentals
- **[Agent Execution Loop](./agent_execution_loop)** - How agents process turns and steps
- **[Tools Integration](./tools)** - Adding capabilities to both APIs
- **[OpenAI Compatibility](/docs/providers/openai-compatibility)** - Using OpenAI-compatible endpoints
- **[OpenAI Compatibility](../providers/openai)** - Using OpenAI-compatible endpoints
- **[Safety Guardrails](./safety)** - Implementing safety measures in agents


@ -2,7 +2,7 @@
title: External APIs
description: Understanding external APIs in Llama Stack
sidebar_label: External APIs
sidebar_position: 4
sidebar_position: 3
---
# External APIs


@ -2,7 +2,7 @@
title: Distributions
description: Pre-packaged provider configurations for different deployment scenarios
sidebar_label: Distributions
sidebar_position: 5
sidebar_position: 3
---
# Distributions


@ -0,0 +1,78 @@
---
title: Evaluation Concepts
description: Running evaluations on Llama Stack
sidebar_label: Evaluation Concepts
sidebar_position: 5
---
# Evaluation Concepts
The Llama Stack Evaluation flow allows you to run evaluations on your GenAI application datasets or pre-registered benchmarks.
We introduce a set of APIs in Llama Stack for supporting running evaluations of LLM applications:
- `/datasetio` + `/datasets` API
- `/scoring` + `/scoring_functions` API
- `/eval` + `/benchmarks` API
This guide goes over the sets of APIs and the developer experience flow of using Llama Stack to run evaluations for different use cases. Check out our Colab notebook with working evaluation examples [here](https://colab.research.google.com/drive/10CHyykee9j2OigaIcRv47BKG9mrNm0tJ?usp=sharing).
The Evaluation APIs are associated with a set of Resources. Please visit the Resources section in our [Core Concepts](./index.mdx) guide for a better high-level understanding.
- **DatasetIO**: defines the interface for datasets and data loaders.
  - Associated with the `Dataset` resource.
- **Scoring**: evaluates the outputs of the system.
  - Associated with the `ScoringFunction` resource. We provide a suite of out-of-the-box scoring functions and the ability to add custom evaluators. These scoring functions are the core part of defining an evaluation task to output evaluation metrics.
- **Eval**: generates outputs (via Inference or Agents) and performs scoring.
  - Associated with the `Benchmark` resource.
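As a minimal sketch of how these pieces fit together with the Python client (the scoring function id and the row field names are illustrative assumptions):

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Rows produced by your application; the field names are assumptions
rows = [
    {
        "input_query": "What is the capital of France?",
        "generated_answer": "Paris",
        "expected_answer": "Paris",
    }
]

# Score the rows with a built-in scoring function (the id is an assumption)
results = client.scoring.score(
    input_rows=rows,
    scoring_functions={"basic::subset_of": None},
)
print(results)
```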
## Open-benchmark Eval
### List of open-benchmarks Llama Stack supports
Llama Stack pre-registers several popular open-benchmarks so you can easily evaluate model performance via the CLI.
The list of open-benchmarks we currently support:
- [MMLU-COT](https://arxiv.org/abs/2009.03300) (Measuring Massive Multitask Language Understanding): Benchmark designed to comprehensively evaluate the breadth and depth of a model's academic and professional understanding
- [GPQA-COT](https://arxiv.org/abs/2311.12022) (A Graduate-Level Google-Proof Q&A Benchmark): A challenging benchmark of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry.
- [SimpleQA](https://openai.com/index/introducing-simpleqa/): Benchmark designed to assess a model's ability to answer short, fact-seeking questions.
- [MMMU](https://arxiv.org/abs/2311.16502) (A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI): Benchmark designed to evaluate multimodal models.
You can follow this [contributing guide](../references/evals_reference/#open-benchmark-contributing-guide) to add more open-benchmarks to Llama Stack.
### Run evaluation on open-benchmarks via CLI
We have built-in functionality to run the supported open-benchmarks using the llama-stack-client CLI.
#### Spin up Llama Stack server
Spin up the Llama Stack server with the 'open-benchmark' template:
```bash
llama stack run llama_stack/distributions/open-benchmark/run.yaml
```
#### Run eval CLI
There are three required inputs to run a benchmark eval:
- `benchmark_ids`: The list of benchmark IDs to run evaluation on
- `model_id`: The ID of the model to evaluate
- `output_dir`: Path to store the evaluation results
```bash
llama-stack-client eval run-benchmark <benchmark_id_1> <benchmark_id_2> ... \
--model_id <model id to evaluate on> \
--output_dir <directory to store the evaluation results>
```
You can run
```bash
llama-stack-client eval run-benchmark help
```
to see descriptions of all the flags that `eval run-benchmark` supports.
In the output log, you can find the path to the file that contains your evaluation results. Open that file to see your aggregate evaluation results.
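For example, a sketch of loading those results, assuming the CLI writes per-benchmark JSON files under the `--output_dir` you passed (the filename and format are assumptions):

```python
import json

# Assumption: one JSON results file per benchmark under --output_dir
with open("results/mmlu_cot.json") as f:
    results = json.load(f)

print(results)
```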
## What's Next?
- Check out our Colab notebook on working examples with running benchmark evaluations [here](https://colab.research.google.com/github/meta-llama/llama-stack/blob/main/docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb#scrollTo=mxLCsP4MvFqP).
- Check out our [Building Applications - Evaluation](../building_applications/evals.mdx) guide for more details on how to use the Evaluation APIs to evaluate your applications.
- Check out our [Evaluation Reference](../references/evals_reference/) for more details on the APIs.


@ -1,4 +1,9 @@
# Core Concepts
---
title: Core Concepts
description: Understanding Llama Stack's service-oriented philosophy and key concepts
sidebar_label: Overview
sidebar_position: 1
---
Given Llama Stack's service-oriented philosophy, a few concepts and workflows arise which may not feel completely natural in the LLM landscape, especially if you are coming with a background in other frameworks.
@ -6,38 +11,21 @@ Given Llama Stack's service-oriented philosophy, a few concepts and workflows ar
This section covers the fundamental concepts of Llama Stack:
- **[Architecture](./architecture.md)** - Learn about Llama Stack's architectural design and principles
- **[APIs](./apis/index.mdx)** - Understanding the core APIs and their stability levels
- [API Overview](./apis/index.mdx) - Core APIs available in Llama Stack
- [API Providers](./apis/api_providers.mdx) - How providers implement APIs
- [API Stability Leveling](./apis/api_leveling.mdx) - API stability and versioning
- **[Distributions](./distributions.md)** - Pre-configured deployment packages
- **[Resources](./resources.md)** - Understanding Llama Stack resources and their lifecycle
- **[External Integration](./external.md)** - Integrating with external services and providers
- **[Architecture](architecture.mdx)** - Learn about Llama Stack's architectural design and principles
- **[APIs](/docs/concepts/apis/)** - Understanding the core APIs and their stability levels
- [API Overview](apis/index.mdx) - Core APIs available in Llama Stack
- [API Providers](apis/api_providers.mdx) - How providers implement APIs
- [External APIs](apis/external.mdx) - External APIs available in Llama Stack
- [API Stability Leveling](apis/api_leveling.mdx) - API stability and versioning
- **[Distributions](distributions.mdx)** - Pre-configured deployment packages
- **[Resources](resources.mdx)** - Understanding Llama Stack resources and their lifecycle
## Getting Started
If you're new to Llama Stack, we recommend starting with:
1. **[Architecture](./architecture.md)** - Understand the overall system design
2. **[APIs](./apis/index.mdx)** - Learn about the available APIs and their purpose
3. **[Distributions](./distributions.md)** - Choose a pre-configured setup for your use case
1. **[Architecture](architecture.mdx)** - Understand the overall system design
2. **[APIs](apis/index.mdx)** - Learn about the available APIs and their purpose
3. **[Distributions](distributions.mdx)** - Choose a pre-configured setup for your use case
Each concept builds upon the previous ones to give you a comprehensive understanding of how Llama Stack works and how to use it effectively.

---
title: Core Concepts
description: Understanding Llama Stack's service-oriented philosophy and key concepts
sidebar_label: Overview
sidebar_position: 1
---
# Core Concepts
Given Llama Stack's service-oriented philosophy, a few concepts and workflows arise which may not feel completely natural in the LLM landscape, especially if you are coming with a background in other frameworks.
This section covers the key concepts you need to understand to work effectively with Llama Stack:
- **[Architecture](./architecture)** - Llama Stack's service-oriented design and benefits
- **[APIs](./apis)** - Available REST APIs and planned capabilities
- **[API Providers](./api_providers)** - Remote vs inline provider implementations
- **[Distributions](./distributions)** - Pre-packaged provider configurations
- **[Resources](./resources)** - Resource federation and registration
Each concept builds upon the previous ones to give you a comprehensive understanding of how Llama Stack works and how to use it effectively.


@ -2,7 +2,7 @@
title: Resources
description: Resource federation and registration in Llama Stack
sidebar_label: Resources
sidebar_position: 6
sidebar_position: 4
---
# Resources


@ -148,7 +148,7 @@ As a general guideline:
that describes the configuration. These descriptions will be used to generate the provider
documentation.
* When possible, use keyword arguments only when calling functions (a short sketch follows this list).
* Llama Stack utilizes [custom Exception classes](llama_stack/apis/common/errors.py) for certain Resources that should be used where applicable.
* Llama Stack utilizes custom Exception classes for certain Resources that should be used where applicable.
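As a minimal sketch of the keyword-argument guideline above (`register_model` is a hypothetical function, not a real Llama Stack API):

```python
# Hypothetical function used only to illustrate call style; the bare `*`
# makes every argument keyword-only.
def register_model(*, model_id: str, provider_id: str) -> None:
    print(f"registering {model_id} via provider {provider_id}")

# Keyword arguments keep call sites self-documenting
register_model(model_id="llama3.2:3b", provider_id="ollama")
```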
### License
By contributing to Llama, you agree that your contributions will be licensed
@ -212,35 +212,22 @@ The generated API schema will be available in `docs/static/`. Make sure to revie
## Adding a New Provider
See:
- [Adding a New API Provider Page](new_api_provider.md) which describes how to add new API providers to the Stack.
- [Vector Database Page](new_vector_database.md) which describes how to add a new vector database with Llama Stack.
- [External Provider Page](../providers/external/index.md) which describes how to add external providers to the Stack.
- [Adding a New API Provider Page](./new_api_provider.mdx) which describes how to add new API providers to the Stack.
- [Vector Database Page](./new_vector_database.mdx) which describes how to add a new vector database with Llama Stack.
- [External Provider Page](/docs/providers/external/) which describes how to add external providers to the Stack.
```{toctree}
:maxdepth: 1
:hidden:
new_api_provider
new_vector_database
```
## Testing
```{include} ../../../tests/README.md
```
See the [Testing README](https://github.com/meta-llama/llama-stack/blob/main/tests/README.md) for detailed testing information.
## Advanced Topics
For developers who need deeper understanding of the testing system internals:
```{toctree}
:maxdepth: 1
testing/record-replay
```
- [Record-Replay Testing](./testing/record-replay.mdx)
### Benchmarking
```{include} ../../../benchmarking/k8s-benchmark/README.md
```
See the [Benchmarking README](https://github.com/meta-llama/llama-stack/blob/main/benchmarking/k8s-benchmark/README.md) for benchmarking information.

View file

@ -11,7 +11,7 @@ import TabItem from '@theme/TabItem';
This guide will walk you through the process of adding a new API provider to Llama Stack.
- Begin by reviewing the [core concepts](../concepts/index.md) of Llama Stack and choose the API your provider belongs to (Inference, Safety, VectorIO, etc.)
- Begin by reviewing the [core concepts](../concepts/) of Llama Stack and choose the API your provider belongs to (Inference, Safety, VectorIO, etc.)
- Determine the provider type ([Remote](https://github.com/meta-llama/llama-stack/tree/main/llama_stack/providers/remote) or [Inline](https://github.com/meta-llama/llama-stack/tree/main/llama_stack/providers/inline)). Remote providers make requests to external services, while inline providers execute their implementation locally.
- Add your provider to the appropriate [Registry](https://github.com/meta-llama/llama-stack/tree/main/llama_stack/providers/registry/). Specify pip dependencies necessary.
- Update any distribution [Templates](https://github.com/meta-llama/llama-stack/tree/main/llama_stack/distributions/) `build.yaml` and `run.yaml` files if they should include your provider by default. Run [./scripts/distro_codegen.py](https://github.com/meta-llama/llama-stack/blob/main/scripts/distro_codegen.py) if necessary. Note that `distro_codegen.py` will fail if the new provider causes any distribution template to attempt to import provider-specific dependencies. This usually means the distribution's `get_distribution_template()` code path should only import any necessary Config or model alias definitions from each provider and not the provider's actual implementation.


@ -219,6 +219,6 @@ kubectl run -it --rm debug --image=curlimages/curl --restart=Never -- curl http:
## Related Resources
- **[Deployment Overview](./index)** - Overview of deployment options
- **[Deployment Overview](/docs/deploying/)** - Overview of deployment options
- **[Distributions](/docs/distributions)** - Understanding Llama Stack distributions
- **[Configuration](/docs/distributions/configuration)** - Detailed configuration options


@ -251,7 +251,7 @@ directory or a git repository (git must be installed on the build environment).
llama stack build --config my-external-stack.yaml
```
For more information on external providers, including directory structure, provider types, and implementation requirements, see the [External Providers documentation](../providers/external.md).
For more information on external providers, including directory structure, provider types, and implementation requirements, see the [External Providers documentation](../providers/external/).
</TabItem>
<TabItem value="container" label="Building Container">


@ -206,7 +206,7 @@ models:
provider_model_id: null
model_type: llm
```
A Model is an instance of a "Resource" (see [Concepts](../concepts/index)) and is associated with a specific inference provider (in this case, the provider with identifier `ollama`). This is an instance of a "pre-registered" model. While we always encourage clients to register models before using them, some Stack servers may come up with a list of "already known and available" models.
A Model is an instance of a "Resource" (see [Concepts](../concepts/)) and is associated with a specific inference provider (in this case, the provider with identifier `ollama`). This is an instance of a "pre-registered" model. While we always encourage clients to register models before using them, some Stack servers may come up with a list of "already known and available" models.
What's with the `provider_model_id` field? This is an identifier for the model inside the provider's model catalog. Contrast it with `model_id` which is the identifier for the same model for Llama Stack's purposes. For example, you may want to name "llama3.2:vision-11b" as "image_captioning_model" when you use it in your Stack interactions. When omitted, the server will set `provider_model_id` to be the same as `model_id`.
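A sketch of registering such an alias with the Python client follows; the method and argument names reflect the llama-stack-client SDK, but treat the exact signature as an assumption:

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Register the provider's "llama3.2:vision-11b" under your own alias
client.models.register(
    model_id="image_captioning_model",        # the name your app will use
    provider_model_id="llama3.2:vision-11b",  # the id in the provider's catalog
    provider_id="ollama",
)
```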


@ -33,7 +33,7 @@ Then, you can access the APIs like `models` and `inference` on the client and ca
response = client.models.list()
```
If you've created a [custom distribution](building_distro.md), you can also use the run.yaml configuration file directly:
If you've created a [custom distribution](./building_distro), you can also use the run.yaml configuration file directly:
```python
client = LlamaStackAsLibraryClient(config_path)
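# A fuller, self-contained sketch. Assumptions: the import path and the
# explicit initialize() call have varied across releases.
#
#   from llama_stack.distribution.library_client import LlamaStackAsLibraryClient
#   client = LlamaStackAsLibraryClient("path/to/run.yaml")
#   client.initialize()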


@ -13,9 +13,9 @@ This section provides an overview of the distributions available in Llama Stack.
## Distribution Guides
- **[Available Distributions](./list_of_distributions)** - Complete list and comparison of all distributions
- **[Building Custom Distributions](./building_distro)** - Create your own distribution from scratch
- **[Customizing Configuration](./customizing_run_yaml)** - Customize run.yaml for your needs
- **[Starting Llama Stack Server](./starting_llama_stack_server)** - How to run distributions
- **[Importing as Library](./importing_as_library)** - Use distributions in your code
- **[Configuration Reference](./configuration)** - Configuration file format details
- **[Available Distributions](./list_of_distributions.mdx)** - Complete list and comparison of all distributions
- **[Building Custom Distributions](./building_distro.mdx)** - Create your own distribution from scratch
- **[Customizing Configuration](./customizing_run_yaml.mdx)** - Customize run.yaml for your needs
- **[Starting Llama Stack Server](./starting_llama_stack_server.mdx)** - How to run distributions
- **[Importing as Library](./importing_as_library.mdx)** - Use distributions in your code
- **[Configuration Reference](./configuration.mdx)** - Configuration file format details


@ -62,7 +62,7 @@ docker pull llama-stack/distribution-meta-reference-gpu
**Partners:** [Fireworks.ai](https://fireworks.ai) and [Together.xyz](https://together.xyz)
**Guides:** [Remote-Hosted Endpoints](remote_hosted_distro/index)
**Guides:** [Remote-Hosted Endpoints](./remote_hosted_distro/)
### 📱 Mobile Development
@ -81,7 +81,7 @@ docker pull llama-stack/distribution-meta-reference-gpu
- You need custom configurations
- You want to optimize for your specific use case
**Guides:** [Building Custom Distributions](building_distro.md)
**Guides:** [Building Custom Distributions](./building_distro)
## Detailed Documentation
@ -131,4 +131,4 @@ graph TD
3. **Configure your providers** with API keys or local models
4. **Start building** with Llama Stack!
For help choosing or troubleshooting, check our [Getting Started Guide](../getting_started/index.md) or [Community Support](https://github.com/llama-stack/llama-stack/discussions).
For help choosing or troubleshooting, check our [Getting Started Guide](/docs/getting_started/quickstart) or [Community Support](https://github.com/llama-stack/llama-stack/discussions).


@ -66,7 +66,7 @@ llama stack run starter --port 5050
Ensure the Llama Stack server version is the same as the Kotlin SDK Library for maximum compatibility.
Other inference providers: [Table](../../index.md#supported-llama-stack-implementations)
Other inference providers: [Table](/docs/)
How to set remote localhost in Demo App: [Settings](https://github.com/meta-llama/llama-stack-client-kotlin/tree/latest-release/examples/android_app#settings)


@ -36,25 +36,25 @@ The starter distribution includes a comprehensive set of inference providers:
### Hosted Providers
- **[OpenAI](https://openai.com/api/)**: GPT-4, GPT-3.5, O1, O3, O4 models and text embeddings -
provider ID: `openai` - reference documentation: [openai](../../providers/inference/remote_openai.md)
provider ID: `openai` - reference documentation: [openai](../../providers/inference/remote_openai)
- **[Fireworks](https://fireworks.ai/)**: Llama 3.1, 3.2, 3.3, 4 Scout, 4 Maverick models and
embeddings - provider ID: `fireworks` - reference documentation: [fireworks](../../providers/inference/remote_fireworks.md)
embeddings - provider ID: `fireworks` - reference documentation: [fireworks](../../providers/inference/remote_fireworks)
- **[Together](https://together.ai/)**: Llama 3.1, 3.2, 3.3, 4 Scout, 4 Maverick models and
embeddings - provider ID: `together` - reference documentation: [together](../../providers/inference/remote_together.md)
- **[Anthropic](https://www.anthropic.com/)**: Claude 3.5 Sonnet, Claude 3.7 Sonnet, Claude 3.5 Haiku, and Voyage embeddings - provider ID: `anthropic` - reference documentation: [anthropic](../../providers/inference/remote_anthropic.md)
- **[Gemini](https://gemini.google.com/)**: Gemini 1.5, 2.0, 2.5 models and text embeddings - provider ID: `gemini` - reference documentation: [gemini](../../providers/inference/remote_gemini.md)
- **[Groq](https://groq.com/)**: Fast Llama models (3.1, 3.2, 3.3, 4 Scout, 4 Maverick) - provider ID: `groq` - reference documentation: [groq](../../providers/inference/remote_groq.md)
- **[SambaNova](https://www.sambanova.ai/)**: Llama 3.1, 3.2, 3.3, 4 Scout, 4 Maverick models - provider ID: `sambanova` - reference documentation: [sambanova](../../providers/inference/remote_sambanova.md)
- **[Cerebras](https://www.cerebras.ai/)**: Cerebras AI models - provider ID: `cerebras` - reference documentation: [cerebras](../../providers/inference/remote_cerebras.md)
- **[NVIDIA](https://www.nvidia.com/)**: NVIDIA NIM - provider ID: `nvidia` - reference documentation: [nvidia](../../providers/inference/remote_nvidia.md)
- **[HuggingFace](https://huggingface.co/)**: Serverless and endpoint models - provider ID: `hf::serverless` and `hf::endpoint` - reference documentation: [huggingface-serverless](../../providers/inference/remote_hf_serverless.md) and [huggingface-endpoint](../../providers/inference/remote_hf_endpoint.md)
- **[Bedrock](https://aws.amazon.com/bedrock/)**: AWS Bedrock models - provider ID: `bedrock` - reference documentation: [bedrock](../../providers/inference/remote_bedrock.md)
embeddings - provider ID: `together` - reference documentation: [together](../../providers/inference/remote_together)
- **[Anthropic](https://www.anthropic.com/)**: Claude 3.5 Sonnet, Claude 3.7 Sonnet, Claude 3.5 Haiku, and Voyage embeddings - provider ID: `anthropic` - reference documentation: [anthropic](../../providers/inference/remote_anthropic)
- **[Gemini](https://gemini.google.com/)**: Gemini 1.5, 2.0, 2.5 models and text embeddings - provider ID: `gemini` - reference documentation: [gemini](../../providers/inference/remote_gemini)
- **[Groq](https://groq.com/)**: Fast Llama models (3.1, 3.2, 3.3, 4 Scout, 4 Maverick) - provider ID: `groq` - reference documentation: [groq](../../providers/inference/remote_groq)
- **[SambaNova](https://www.sambanova.ai/)**: Llama 3.1, 3.2, 3.3, 4 Scout, 4 Maverick models - provider ID: `sambanova` - reference documentation: [sambanova](../../providers/inference/remote_sambanova)
- **[Cerebras](https://www.cerebras.ai/)**: Cerebras AI models - provider ID: `cerebras` - reference documentation: [cerebras](../../providers/inference/remote_cerebras)
- **[NVIDIA](https://www.nvidia.com/)**: NVIDIA NIM - provider ID: `nvidia` - reference documentation: [nvidia](../../providers/inference/remote_nvidia)
- **[HuggingFace](https://huggingface.co/)**: Serverless and endpoint models - provider ID: `hf::serverless` and `hf::endpoint` - reference documentation: [huggingface-serverless](../../providers/inference/remote_hf_serverless) and [huggingface-endpoint](../../providers/inference/remote_hf_endpoint)
- **[Bedrock](https://aws.amazon.com/bedrock/)**: AWS Bedrock models - provider ID: `bedrock` - reference documentation: [bedrock](../../providers/inference/remote_bedrock)
### Local/Remote Providers
- **[Ollama](https://ollama.ai/)**: Local Ollama models - provider ID: `ollama` - reference documentation: [ollama](../../providers/inference/remote_ollama.md)
- **[vLLM](https://docs.vllm.ai/en/latest/)**: Local or remote vLLM server - provider ID: `vllm` - reference documentation: [vllm](../../providers/inference/remote_vllm.md)
- **[TGI](https://github.com/huggingface/text-generation-inference)**: Text Generation Inference server, including Dell Enterprise Hub's custom TGI container (use `DEH_URL`) - provider ID: `tgi` - reference documentation: [tgi](../../providers/inference/remote_tgi.md)
- **[Sentence Transformers](https://www.sbert.net/)**: Local embedding models - provider ID: `sentence-transformers` - reference documentation: [sentence-transformers](../../providers/inference/inline_sentence-transformers.md)
- **[Ollama](https://ollama.ai/)**: Local Ollama models - provider ID: `ollama` - reference documentation: [ollama](../../providers/inference/remote_ollama)
- **[vLLM](https://docs.vllm.ai/en/latest/)**: Local or remote vLLM server - provider ID: `vllm` - reference documentation: [vllm](../../providers/inference/remote_vllm)
- **[TGI](https://github.com/huggingface/text-generation-inference)**: Text Generation Inference server, including Dell Enterprise Hub's custom TGI container (use `DEH_URL`) - provider ID: `tgi` - reference documentation: [tgi](../../providers/inference/remote_tgi)
- **[Sentence Transformers](https://www.sbert.net/)**: Local embedding models - provider ID: `sentence-transformers` - reference documentation: [sentence-transformers](../../providers/inference/inline_sentence-transformers)
All providers are disabled by default, so you need to enable the ones you want by setting the corresponding environment variables.


@ -16,11 +16,11 @@ This is the simplest way to get started. Using Llama Stack as a library means yo
## Container:
Another simple way to start interacting with Llama Stack is to just spin up a container (via Docker or Podman) which is pre-built with all the providers you need. We provide a number of pre-built images so you can start a Llama Stack server instantly. You can also build your own custom container. Which distribution to choose depends on the hardware you have. See [Selection of a Distribution](selection) for more details.
Another simple way to start interacting with Llama Stack is to just spin up a container (via Docker or Podman) which is pre-built with all the providers you need. We provide a number of pre-built images so you can start a Llama Stack server instantly. You can also build your own custom container. Which distribution to choose depends on the hardware you have. See [Selection of a Distribution](./list_of_distributions) for more details.
## Kubernetes:
If you have built a container image and want to deploy it in a Kubernetes cluster instead of starting the Llama Stack server locally, see the [Kubernetes Deployment Guide](kubernetes_deployment) for more details.
If you have built a container image and want to deploy it in a Kubernetes cluster instead of starting the Llama Stack server locally, see the [Kubernetes Deployment Guide](../deploying/kubernetes_deployment) for more details.
```{toctree}


@ -18,7 +18,7 @@ In Llama Stack, we provide a server exposing multiple APIs. These APIs are backe
Llama Stack is a stateful service with REST APIs to support seamless transition of AI applications across different environments. The server can be run in a variety of ways, including as a standalone binary, Docker container, or hosted service. You can build and test using a local server first and deploy to a hosted endpoint for production.
In this guide, we'll walk through how to build a RAG agent locally using Llama Stack with [Ollama](https://ollama.com/)
as the inference [provider](../providers/index.md#inference) for a Llama Model.
as the inference [provider](/docs/providers/inference/) for a Llama Model.
### Step 1: Installation and Setup
@ -60,8 +60,8 @@ Llama Stack is a server that exposes multiple APIs, you connect with it using th
<TabItem value="venv" label="Using venv">
You can use Python to build and run the Llama Stack server, which is useful for testing and development.
Llama Stack uses a [YAML configuration file](../distributions/configuration.md) to specify the stack setup,
which defines the providers and their settings. The generated configuration serves as a starting point that you can [customize for your specific needs](../distributions/customizing_run_yaml.md).
Llama Stack uses a [YAML configuration file](../distributions/configuration) to specify the stack setup,
which defines the providers and their settings. The generated configuration serves as a starting point that you can [customize for your specific needs](../distributions/customizing_run_yaml).
Now let's build and run the Llama Stack config for Ollama.
We use `starter` as the template. By default all providers are disabled, so you need to enable Ollama by passing environment variables.
@ -73,7 +73,7 @@ llama stack build --distro starter --image-type venv --run
You can use a container image to run the Llama Stack server. We provide several container images for the server
component that works with different inference providers out of the box. For this guide, we will use
`llamastack/distribution-starter` as the container image. If you'd like to build your own image or customize the
configurations, please check out [this guide](../distributions/building_distro.md).
configurations, please check out [this guide](../distributions/building_distro).
First, let's set up some environment variables and create a local directory to mount into the container's file system.
```bash
export LLAMA_STACK_PORT=8321
@ -145,7 +145,7 @@ pip install llama-stack-client
</TabItem>
</Tabs>
Now let's use the `llama-stack-client` [CLI](../references/llama_stack_client_cli_reference.md) to check the
Now let's use the `llama-stack-client` [CLI](../references/llama_stack_client_cli_reference) to check the
connectivity to the server.
```bash
@ -216,8 +216,8 @@ OpenAIChatCompletion(
### Step 4: Run the Demos
Note that these demos show the [Python Client SDK](../references/python_sdk_reference/index.md).
Other SDKs are also available, please refer to the [Client SDK](../index.md#client-sdks) list for the complete options.
Note that these demos show the [Python Client SDK](../references/python_sdk_reference/).
Other SDKs are also available, please refer to the [Client SDK](/docs/) list for the complete options.
<Tabs>
<TabItem value="inference" label="Basic Inference">
@ -538,4 +538,4 @@ uv run python rag_agent.py
**You're Ready to Build Your Own Apps!**
Congrats! 🥳 Now you're ready to [build your own Llama Stack applications](../building_applications/index)! 🚀
Congrats! 🥳 Now you're ready to [build your own Llama Stack applications](../building_applications/)! 🚀


@ -140,7 +140,7 @@ If you are getting a **401 Client Error** from HuggingFace for the **all-MiniLM-
### Next Steps
Now you're ready to dive deeper into Llama Stack!
- Explore the [Detailed Tutorial](/docs/detailed_tutorial).
- Explore the [Detailed Tutorial](./detailed_tutorial).
- Try the [Getting Started Notebook](https://github.com/meta-llama/llama-stack/blob/main/docs/getting_started.ipynb).
- Browse more [Notebooks on GitHub](https://github.com/meta-llama/llama-stack/tree/main/docs/notebooks).
- Learn about Llama Stack [Concepts](/docs/concepts).


@ -25,7 +25,3 @@ Agents API for creating and interacting with agentic systems.
- Agents can also use Memory to retrieve information from knowledge bases. See the RAG Tool and Vector IO APIs for more details.
This section contains documentation for all available providers for the **agents** API.
## Providers
- [Meta-Reference](./inline_meta-reference)


@ -29,7 +29,3 @@ The Batches API enables efficient processing of multiple requests in a single op
Note: This API is currently under active development and may undergo changes.
This section contains documentation for all available providers for the **batches** API.
## Providers
- [Reference](./inline_reference)


@ -1,16 +0,0 @@
---
sidebar_label: Datasetio
title: Datasetio
---
# Datasetio
## Overview
This section contains documentation for all available providers for the **datasetio** API.
## Providers
- [Localfs](./inline_localfs)
- [Remote - Huggingface](./remote_huggingface)
- [Remote - Nvidia](./remote_nvidia)


@ -8,9 +8,3 @@ title: Datasetio
## Overview
This section contains documentation for all available providers for the **datasetio** API.
## Providers
- [Localfs](./inline_localfs)
- [Remote - Huggingface](./remote_huggingface)
- [Remote - Nvidia](./remote_nvidia)


@ -11,8 +11,3 @@ title: Eval
Llama Stack Evaluation API for running evaluations on model and agent candidates.
This section contains documentation for all available providers for the **eval** API.
## Providers
- [Meta-Reference](./inline_meta-reference)
- [Remote - Nvidia](./remote_nvidia)


@ -7,5 +7,5 @@ Llama Stack supports external providers that live outside of the main codebase.
## External Provider Documentation
- [Known External Providers](external-providers-list)
- [Creating External Providers](external-providers-guide)
- [Known External Providers](./external-providers-list.mdx)
- [Creating External Providers](./external-providers-guide.mdx)


@ -8,8 +8,3 @@ title: Files
## Overview
This section contains documentation for all available providers for the **files** API.
## Providers
- [Localfs](./inline_localfs)
- [Remote - S3](./remote_s3)


@ -21,13 +21,13 @@ Importantly, Llama Stack always strives to provide at least one fully inline pro
## Provider Categories
- **[External Providers](./external/)** - Guide for building and using external providers
- **[OpenAI Compatibility](./openai)** - OpenAI API compatibility layer
- **[Inference](./inference/)** - LLM and embedding model providers
- **[Agents](./agents/)** - Agentic system providers
- **[DatasetIO](./datasetio/)** - Dataset and data loader providers
- **[Safety](./safety/)** - Content moderation and safety providers
- **[Telemetry](./telemetry/)** - Monitoring and observability providers
- **[Vector IO](./vector-io/)** - Vector database providers
- **[Tool Runtime](./tool-runtime/)** - Tool and protocol providers
- **[Files](./files/)** - File system and storage providers
- **[External Providers](external/index.mdx)** - Guide for building and using external providers
- **[OpenAI Compatibility](./openai.mdx)** - OpenAI API compatibility layer
- **[Inference](inference/index.mdx)** - LLM and embedding model providers
- **[Agents](agents/index.mdx)** - Agentic system providers
- **[DatasetIO](datasetio/index.mdx)** - Dataset and data loader providers
- **[Safety](safety/index.mdx)** - Content moderation and safety providers
- **[Telemetry](telemetry/index.mdx)** - Monitoring and observability providers
- **[Vector IO](vector_io/index.mdx)** - Vector database providers
- **[Tool Runtime](tool_runtime/index.mdx)** - Tool and protocol providers
- **[Files](files/index.mdx)** - File system and storage providers


@ -19,30 +19,3 @@ Llama Stack Inference API for generating completions, chat completions, and embe
- Embedding models: these models generate embeddings to be used for semantic search.
This section contains documentation for all available providers for the **inference** API.
## Providers
- [Meta-Reference](./inline_meta-reference)
- [Sentence-Transformers](./inline_sentence-transformers)
- [Remote - Anthropic](./remote_anthropic)
- [Remote - Azure](./remote_azure)
- [Remote - Bedrock](./remote_bedrock)
- [Remote - Cerebras](./remote_cerebras)
- [Remote - Databricks](./remote_databricks)
- [Remote - Fireworks](./remote_fireworks)
- [Remote - Gemini](./remote_gemini)
- [Remote - Groq](./remote_groq)
- [Remote - Hf - Endpoint](./remote_hf_endpoint)
- [Remote - Hf - Serverless](./remote_hf_serverless)
- [Remote - Llama-Openai-Compat](./remote_llama-openai-compat)
- [Remote - Nvidia](./remote_nvidia)
- [Remote - Ollama](./remote_ollama)
- [Remote - Openai](./remote_openai)
- [Remote - Passthrough](./remote_passthrough)
- [Remote - Runpod](./remote_runpod)
- [Remote - Sambanova](./remote_sambanova)
- [Remote - Tgi](./remote_tgi)
- [Remote - Together](./remote_together)
- [Remote - Vertexai](./remote_vertexai)
- [Remote - Vllm](./remote_vllm)
- [Remote - Watsonx](./remote_watsonx)


@ -1,3 +1,8 @@
---
title: OpenAI Compatibility
description: OpenAI API Compatibility
sidebar_label: OpenAI Compatibility
sidebar_position: 1
---
## OpenAI API Compatibility
### Server path


@ -8,10 +8,3 @@ title: Post_Training
## Overview
This section contains documentation for all available providers for the **post_training** API.
## Providers
- [Huggingface-Gpu](./inline_huggingface-gpu)
- [Torchtune-Cpu](./inline_torchtune-cpu)
- [Torchtune-Gpu](./inline_torchtune-gpu)
- [Remote - Nvidia](./remote_nvidia)


@ -8,12 +8,3 @@ title: Safety
## Overview
This section contains documentation for all available providers for the **safety** API.
## Providers
- [Code-Scanner](./inline_code-scanner)
- [Llama-Guard](./inline_llama-guard)
- [Prompt-Guard](./inline_prompt-guard)
- [Remote - Bedrock](./remote_bedrock)
- [Remote - Nvidia](./remote_nvidia)
- [Remote - Sambanova](./remote_sambanova)


@ -8,9 +8,3 @@ title: Scoring
## Overview
This section contains documentation for all available providers for the **scoring** API.
## Providers
- [Basic](./inline_basic)
- [Braintrust](./inline_braintrust)
- [Llm-As-Judge](./inline_llm-as-judge)


@ -8,7 +8,3 @@ title: Telemetry
## Overview
This section contains documentation for all available providers for the **telemetry** API.
## Providers
- [Meta-Reference](./inline_meta-reference)


@ -8,12 +8,3 @@ title: Tool_Runtime
## Overview
This section contains documentation for all available providers for the **tool_runtime** API.
## Providers
- [Rag-Runtime](./inline_rag-runtime)
- [Remote - Bing-Search](./remote_bing-search)
- [Remote - Brave-Search](./remote_brave-search)
- [Remote - Model-Context-Protocol](./remote_model-context-protocol)
- [Remote - Tavily-Search](./remote_tavily-search)
- [Remote - Wolfram-Alpha](./remote_wolfram-alpha)


@ -8,18 +8,3 @@ title: Vector_Io
## Overview
This section contains documentation for all available providers for the **vector_io** API.
## Providers
- [Chromadb](./inline_chromadb)
- [Faiss](./inline_faiss)
- [Meta-Reference](./inline_meta-reference)
- [Milvus](./inline_milvus)
- [Qdrant](./inline_qdrant)
- [Sqlite-Vec](./inline_sqlite-vec)
- [Sqlite Vec](./inline_sqlite_vec)
- [Remote - Chromadb](./remote_chromadb)
- [Remote - Milvus](./remote_milvus)
- [Remote - Pgvector](./remote_pgvector)
- [Remote - Qdrant](./remote_qdrant)
- [Remote - Weaviate](./remote_weaviate)


@ -7,6 +7,6 @@ sidebar_position: 1
# References
- [Python SDK Reference](python_sdk_reference/index)
- [Llama CLI](llama_cli_reference/index) for building and running your Llama Stack server
- [Llama Stack Client CLI](llama_stack_client_cli_reference) for interacting with your Llama Stack server
- [Python SDK Reference](/docs/references/python_sdk_reference/)
- [Llama CLI](/docs/references/llama_cli_reference/) for building and running your Llama Stack server
- [Llama Stack Client CLI](./llama_stack_client_cli_reference.md) for interacting with your Llama Stack server


@ -29,7 +29,7 @@ You have two ways to install Llama Stack:
## `llama` subcommands
1. `download`: Supports downloading models from Meta or Hugging Face. [Downloading models](#downloading-models)
2. `model`: Lists available models and their properties. [Understanding models](#understand-the-models)
3. `stack`: Allows you to build a stack using the `llama stack` distribution and run a Llama Stack server. You can read more about how to build a Llama Stack distribution in the [Build your own Distribution](../../distributions/building_distro) documentation.
3. `stack`: Allows you to build a stack using the `llama stack` distribution and run a Llama Stack server. You can read more about how to build a Llama Stack distribution in the [Build your own Distribution](../distributions/building_distro) documentation.
### Sample Usage


@ -217,11 +217,6 @@ const config: Config = {
ignoreFiles: [
"node_modules/**/*",
],
// Exclude OpenAPI generated docs from search to avoid duplicates
searchContextByPaths: [
"docs",
],
},
],
],

BIN docs/static/img/llama-stack-logo.png vendored (new file; binary, 18 KiB, not shown)
BIN (removed image; binary, 70 KiB, not shown)

@ -1,64 +0,0 @@
import React from 'react';
import clsx from 'clsx';
import styles from './styles.module.css';
const FeatureList = [
{
title: 'Easy to Use',
Svg: require('@site/static/img/undraw_docusaurus_mountain.svg').default,
description: (
<>
Docusaurus was designed from the ground up to be easily installed and
used to get your website up and running quickly.
</>
),
},
{
title: 'Focus on What Matters',
Svg: require('@site/static/img/undraw_docusaurus_tree.svg').default,
description: (
<>
Docusaurus lets you focus on your docs, and we&apos;ll do the chores. Go
ahead and move your docs into the <code>docs</code> directory.
</>
),
},
{
title: 'Powered by React',
Svg: require('@site/static/img/undraw_docusaurus_react.svg').default,
description: (
<>
Extend or customize your website layout by reusing React. Docusaurus can
be extended while reusing the same header and footer.
</>
),
},
];
function Feature({Svg, title, description}) {
return (
<div className={clsx('col col--4')}>
<div className="text--center">
<Svg className={styles.featureSvg} role="img" />
</div>
<div className="text--center padding-horiz--md">
<h3>{title}</h3>
<p>{description}</p>
</div>
</div>
);
}
export default function HomepageFeatures() {
return (
<section className={styles.features}>
<div className="container">
<div className="row">
{FeatureList.map((props, idx) => (
<Feature key={idx} {...props} />
))}
</div>
</div>
</section>
);
}


@ -1,11 +0,0 @@
.features {
display: flex;
align-items: center;
padding: 2rem 0;
width: 100%;
}
.featureSvg {
height: 200px;
width: 200px;
}


@ -1,191 +0,0 @@
/**
* Any CSS included here will be global. The classic template
* bundles Infima by default. Infima is a CSS framework designed to
* work well for content-centric websites.
*/
/* You can override the default Infima variables here. */
:root {
/* Llama Stack Original Theme - Based on llamastack.github.io */
--ifm-color-primary: #4a4a68;
--ifm-color-primary-dark: #3a3a52;
--ifm-color-primary-darker: #332735;
--ifm-color-primary-darkest: #2b2129;
--ifm-color-primary-light: #5a5a7e;
--ifm-color-primary-lighter: #6a6a94;
--ifm-color-primary-lightest: #8080aa;
/* Additional theme colors */
--ifm-color-secondary: #1b263c;
--ifm-color-info: #2980b9;
--ifm-color-success: #16a085;
--ifm-color-warning: #f39c12;
--ifm-color-danger: #e74c3c;
/* Background colors */
--ifm-background-color: #ffffff;
--ifm-background-surface-color: #f8f9fa;
/* Code and syntax highlighting */
--ifm-code-font-size: 95%;
--ifm-pre-background: #1b263c;
--ifm-pre-color: #e1e5e9;
--docusaurus-highlighted-code-line-bg: rgba(51, 39, 53, 0.1);
/* Link colors */
--ifm-link-color: var(--ifm-color-primary);
--ifm-link-hover-color: var(--ifm-color-primary-darker);
/* Navbar */
--ifm-navbar-background-color: rgba(255, 255, 255, 0.95);
--ifm-navbar-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
/* Hero section gradient - matching original theme */
--hero-gradient: linear-gradient(90deg, #332735 0%, #1b263c 100%);
/* OpenAPI method colors */
--openapi-code-blue: #2980b9;
--openapi-code-green: #16a085;
--openapi-code-orange: #f39c12;
--openapi-code-red: #e74c3c;
--openapi-code-purple: #332735;
}
/* For readability concerns, you should choose a lighter palette in dark mode. */
[data-theme='dark'] {
/* Dark theme primary colors - lighter versions of original theme */
--ifm-color-primary: #8080aa;
--ifm-color-primary-dark: #6a6a94;
--ifm-color-primary-darker: #5a5a7e;
--ifm-color-primary-darkest: #4a4a68;
--ifm-color-primary-light: #9090ba;
--ifm-color-primary-lighter: #a0a0ca;
--ifm-color-primary-lightest: #b0b0da;
/* Dark theme background colors */
--ifm-background-color: #1a1a1a;
--ifm-background-surface-color: #2a2a2a;
/* Dark theme navbar */
--ifm-navbar-background-color: rgba(26, 26, 26, 0.95);
/* Dark theme code highlighting */
--docusaurus-highlighted-code-line-bg: rgba(51, 39, 53, 0.3);
/* Dark theme text colors */
--ifm-font-color-base: #e1e5e9;
--ifm-font-color-secondary: #a0a6ac;
}
/* Sidebar Method labels */
.api-method>.menu__link {
align-items: center;
justify-content: start;
}
.api-method>.menu__link::before {
width: 50px;
height: 20px;
font-size: 12px;
line-height: 20px;
text-transform: uppercase;
font-weight: 600;
border-radius: 0.25rem;
border: 1px solid;
margin-right: var(--ifm-spacing-horizontal);
text-align: center;
flex-shrink: 0;
border-color: transparent;
color: white;
}
.get>.menu__link::before {
content: "get";
background-color: var(--ifm-color-primary);
}
.put>.menu__link::before {
content: "put";
background-color: var(--openapi-code-blue);
}
.post>.menu__link::before {
content: "post";
background-color: var(--openapi-code-green);
}
.delete>.menu__link::before {
content: "del";
background-color: var(--openapi-code-red);
}
.patch>.menu__link::before {
content: "patch";
background-color: var(--openapi-code-orange);
}
.footer--dark {
--ifm-footer-link-color: #ffffff;
--ifm-footer-title-color: #ffffff;
}
.footer--dark .footer__link-item {
color: #ffffff;
}
.footer--dark .footer__title {
color: #ffffff;
}
/* OpenAPI theme fixes for light mode readability */
/* Version badge fixes */
.openapi__version-badge,
.theme-doc-version-badge,
[class*="version-badge"],
[class*="versionBadge"] {
background-color: #ffffff !important;
color: #333333 !important;
border: 1px solid #d1d5db !important;
}
/* OpenAPI method badges in light mode */
.openapi__method-badge,
[class*="method-badge"] {
color: #ffffff !important;
}
/* Button fixes for light mode */
.openapi__button,
.theme-api-docs-demo-panel button,
[class*="api-docs"] button,
button[class*="button"],
.openapi-explorer__response-schema button,
.openapi-tabs__operation button {
color: #ffffff !important;
}
.openapi__button:hover,
.theme-api-docs-demo-panel button:hover,
[class*="api-docs"] button:hover,
button[class*="button"]:hover,
.openapi-explorer__response-schema button:hover,
.openapi-tabs__operation button:hover {
color: #ffffff !important;
}
/* Navigation buttons (Next/Previous) */
.pagination-nav__link,
.pagination-nav__label {
color: #333333 !important;
}
.pagination-nav__link--next,
.pagination-nav__link--prev {
background-color: #ffffff !important;
border: 1px solid #d1d5db !important;
}
.pagination-nav__link--next:hover,
.pagination-nav__link--prev:hover {
background-color: #f3f4f6 !important;
}


@ -1,163 +0,0 @@
import React from 'react';
import clsx from 'clsx';
import Layout from '@theme/Layout';
import Link from '@docusaurus/Link';
import useDocusaurusContext from '@docusaurus/useDocusaurusContext';
import styles from './index.module.css';
function HomepageHeader() {
const {siteConfig} = useDocusaurusContext();
return (
<header className={clsx('hero hero--primary', styles.heroBanner)}>
<div className="container">
<div className={styles.heroContent}>
<h1 className={styles.heroTitle}>Build AI Applications with Llama Stack</h1>
<p className={styles.heroSubtitle}>
Unified APIs for Inference, RAG, Agents, Tools, Safety, and Telemetry
</p>
<div className={styles.buttons}>
<Link
className={clsx('button button--primary button--lg', styles.getStartedButton)}
to="/docs/getting-started">
🚀 Get Started
</Link>
<Link
className={clsx('button button--primary button--lg', styles.apiButton)}
to="/docs/category/llama-stack-api">
📚 API Reference
</Link>
</div>
</div>
</div>
</header>
);
}
function QuickStart() {
return (
<section className={styles.quickStart}>
<div className="container">
<div className="row">
<div className="col col--6">
<h2 className={styles.sectionTitle}>Quick Start</h2>
<p className={styles.sectionDescription}>
Get up and running with Llama Stack in just a few commands. Build your first RAG application locally.
</p>
<div className={styles.codeBlock}>
<pre><code>{`# Install uv and start Ollama
ollama run llama3.2:3b --keepalive 60m
# Run Llama Stack server
OLLAMA_URL=http://localhost:11434 \\
uv run --with llama-stack \\
llama stack build --distro starter \\
--image-type venv --run
# Try the Python SDK
from llama_stack_client import LlamaStackClient
client = LlamaStackClient(
base_url="http://localhost:8321"
)
response = client.inference.chat_completion(
model="Llama3.2-3B-Instruct",
messages=[{
"role": "user",
"content": "What is machine learning?"
}]
)`}</code></pre>
</div>
</div>
<div className="col col--6">
<h2 className={styles.sectionTitle}>Why Llama Stack?</h2>
<div className={styles.features}>
<div className={styles.feature}>
<div className={styles.featureIcon}>🔗</div>
<div>
<h4>Unified APIs</h4>
<p>One consistent interface for all your AI needs - inference, safety, agents, and more.</p>
</div>
</div>
<div className={styles.feature}>
<div className={styles.featureIcon}>🔄</div>
<div>
<h4>Provider Flexibility</h4>
<p>Swap between providers without code changes. Start local, deploy anywhere.</p>
</div>
</div>
<div className={styles.feature}>
<div className={styles.featureIcon}>🛡</div>
<div>
<h4>Production Ready</h4>
<p>Built-in safety, monitoring, and evaluation tools for enterprise applications.</p>
</div>
</div>
<div className={styles.feature}>
<div className={styles.featureIcon}>📱</div>
<div>
<h4>Multi-Platform</h4>
<p>SDKs for Python, Node.js, iOS, Android, and REST APIs for any language.</p>
</div>
</div>
</div>
</div>
</div>
</div>
</section>
);
}
function CommunityLinks() {
return (
<section className={styles.community}>
<div className="container">
<div className={styles.communityContent}>
<h2 className={styles.sectionTitle}>Join the Community</h2>
<p className={styles.sectionDescription}>
Connect with developers building the future of AI applications
</p>
<div className={styles.communityLinks}>
<a
href="https://github.com/llamastack/llama-stack"
className={clsx('button button--outline button--lg', styles.communityButton)}
target="_blank"
rel="noopener noreferrer">
<span className={styles.communityIcon}></span>
Star on GitHub
</a>
<a
href="https://discord.gg/llama-stack"
className={clsx('button button--outline button--lg', styles.communityButton)}
target="_blank"
rel="noopener noreferrer">
<span className={styles.communityIcon}>💬</span>
Join Discord
</a>
<Link
to="/docs/intro"
className={clsx('button button--outline button--lg', styles.communityButton)}>
<span className={styles.communityIcon}>📚</span>
Read Docs
</Link>
</div>
</div>
</div>
</section>
);
}
export default function Home() {
const {siteConfig} = useDocusaurusContext();
return (
<Layout
title="Build AI Applications"
description="The open-source framework for building generative AI applications with unified APIs for Inference, RAG, Agents, Tools, Safety, and Telemetry.">
<HomepageHeader />
<main>
<QuickStart />
<CommunityLinks />
</main>
</Layout>
);
}


@ -1,283 +0,0 @@
/**
* CSS files with the .module.css suffix will be treated as CSS modules
* and scoped locally.
*/
.heroBanner {
padding: 4rem 0;
text-align: center;
position: relative;
overflow: hidden;
background: var(--hero-gradient);
color: white;
display: flex;
align-items: center;
}
.heroBanner::before {
content: '';
position: absolute;
top: 0;
left: 0;
right: 0;
bottom: 0;
background: radial-gradient(circle at 30% 20%, rgba(255, 255, 255, 0.1) 0%, transparent 50%),
radial-gradient(circle at 70% 80%, rgba(255, 255, 255, 0.05) 0%, transparent 50%);
pointer-events: none;
}
.heroContent {
max-width: 800px;
margin: 0 auto;
}
.heroLogo {
height: 48px;
width: auto;
margin-bottom: 1.5rem;
}
.heroTitle {
font-size: 2.8rem;
font-weight: 700;
margin-bottom: 1rem;
line-height: 1.2;
}
.heroSubtitle {
font-size: 1.1rem;
font-weight: 400;
margin-bottom: 2rem;
opacity: 0.9;
line-height: 1.5;
max-width: 600px;
margin-left: auto;
margin-right: auto;
}
.buttons {
display: flex;
align-items: center;
justify-content: center;
gap: 1rem;
}
.heroBanner .getStartedButton {
background: white;
color: #332735;
border: 2px solid white;
font-weight: 600;
transition: all 0.3s ease;
}
.heroBanner .getStartedButton:hover {
background: rgba(255, 255, 255, 0.9);
color: #2b2129;
border-color: rgba(255, 255, 255, 0.9);
transform: translateY(-2px);
box-shadow: 0 8px 25px rgba(0, 0, 0, 0.15);
}
.heroBanner .apiButton {
background: transparent;
color: white;
border: 2px solid white;
font-weight: 600;
transition: all 0.3s ease;
}
.heroBanner .apiButton:hover {
background: white;
border-color: white;
color: #332735;
transform: translateY(-2px);
}
/* Quick Start Section */
.quickStart {
padding: 4rem 0;
background: var(--ifm-background-color);
}
.sectionTitle {
font-size: 2rem;
font-weight: 600;
margin-bottom: 0.75rem;
color: var(--ifm-color-emphasis-800);
}
.sectionDescription {
font-size: 1rem;
color: var(--ifm-color-emphasis-600);
margin-bottom: 1.5rem;
line-height: 1.5;
}
.codeBlock {
background: var(--ifm-color-gray-900);
border-radius: 8px;
padding: 1.5rem;
margin-top: 1.5rem;
box-shadow: 0 2px 10px rgba(0, 0, 0, 0.1);
}
.codeBlock pre {
margin: 0;
padding: 0;
background: none;
border: none;
}
.codeBlock code {
color: var(--ifm-color-gray-100);
font-family: 'Fira Code', 'Consolas', 'Monaco', monospace;
font-size: 0.9rem;
line-height: 1.6;
}
/* Features */
.features {
display: flex;
flex-direction: column;
gap: 1rem;
margin-top: 1.5rem;
}
.feature {
display: flex;
align-items: flex-start;
gap: 1rem;
padding: 1rem;
border-radius: 8px;
background: var(--ifm-color-gray-50);
border: 1px solid var(--ifm-color-gray-200);
transition: all 0.2s ease;
}
.feature:hover {
transform: translateY(-2px);
box-shadow: 0 8px 25px rgba(0, 0, 0, 0.1);
border-color: var(--ifm-color-primary-lighter);
}
.featureIcon {
font-size: 2rem;
width: 3rem;
height: 3rem;
display: flex;
align-items: center;
justify-content: center;
background: var(--ifm-color-primary-lightest);
border-radius: 50%;
flex-shrink: 0;
}
.feature h4 {
margin: 0 0 0.5rem 0;
font-size: 1.1rem;
font-weight: 600;
color: var(--ifm-color-emphasis-800);
}
.feature p {
margin: 0;
color: var(--ifm-color-emphasis-600);
line-height: 1.5;
}
/* Community Section */
.community {
padding: 3rem 0;
background: var(--ifm-color-gray-50);
border-top: 1px solid var(--ifm-color-gray-200);
}
.communityContent {
text-align: center;
max-width: 600px;
margin: 0 auto;
}
.communityLinks {
display: flex;
justify-content: center;
gap: 1rem;
margin-top: 2rem;
}
.communityButton {
display: flex;
align-items: center;
gap: 0.5rem;
font-weight: 600;
transition: all 0.3s ease;
}
.communityButton:hover {
transform: translateY(-2px);
box-shadow: 0 8px 25px rgba(0, 0, 0, 0.1);
}
.communityIcon {
font-size: 1.2rem;
}
/* Responsive Design */
@media screen and (max-width: 996px) {
.heroBanner {
padding: 3rem 2rem;
}
.heroTitle {
font-size: 2.2rem;
}
.heroSubtitle {
font-size: 1rem;
}
.buttons {
flex-direction: column;
gap: 1rem;
}
.quickStart {
padding: 3rem 0;
}
.sectionTitle {
font-size: 1.75rem;
}
.communityLinks {
flex-direction: column;
align-items: center;
}
.communityButton {
width: 200px;
justify-content: center;
}
}
@media screen and (max-width: 768px) {
.heroLogo {
height: 40px;
}
.heroTitle {
font-size: 1.8rem;
}
.codeBlock {
padding: 1rem;
}
.codeBlock code {
font-size: 0.8rem;
}
.feature {
padding: 0.75rem;
}
}


@ -1,7 +0,0 @@
---
title: Markdown page example
---
# Markdown page example
You don't need React to write simple standalone pages.


@ -358,16 +358,6 @@ def generate_index_docs(api_name: str, api_docstring: str | None, provider_entri
md_lines.append("")
md_lines.append(f"This section contains documentation for all available providers for the **{api_name}** API.")
md_lines.append("")
md_lines.append("## Providers")
md_lines.append("")
# For Docusaurus, create a simple list of links instead of toctree
for entry in provider_entries:
provider_name = entry["display_name"]
filename = entry["filename"]
md_lines.append(f"- [{provider_name}](./{filename})")
return "\n".join(md_lines) + "\n"