mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-04 02:03:44 +00:00
Merge remote-tracking branch 'origin/main' into kill_client
This commit is contained in:
commit
7d115d3b5c
642 changed files with 92929 additions and 1793 deletions
|
|
@ -35,9 +35,6 @@ Here are the key topics that will help you build effective AI applications:
|
|||
- **[Telemetry](./telemetry.mdx)** - Monitor and analyze your agents' performance and behavior
|
||||
- **[Safety](./safety.mdx)** - Implement guardrails and safety measures to ensure responsible AI behavior
|
||||
|
||||
### 🎮 **Interactive Development**
|
||||
- **[Playground](./playground.mdx)** - Interactive environment for testing and developing applications
|
||||
|
||||
## Application Patterns
|
||||
|
||||
### 🤖 **Conversational Agents**
|
||||
|
|
|
|||
|
|
@ -1,298 +0,0 @@
|
|||
---
|
||||
title: Llama Stack Playground
|
||||
description: Interactive interface to explore and experiment with Llama Stack capabilities
|
||||
sidebar_label: Playground
|
||||
sidebar_position: 10
|
||||
---
|
||||
|
||||
import Tabs from '@theme/Tabs';
|
||||
import TabItem from '@theme/TabItem';
|
||||
|
||||
# Llama Stack Playground
|
||||
|
||||
:::note[Experimental Feature]
|
||||
The Llama Stack Playground is currently experimental and subject to change. We welcome feedback and contributions to help improve it.
|
||||
:::
|
||||
|
||||
The Llama Stack Playground is a simple interface that aims to:
|
||||
- **Showcase capabilities and concepts** of Llama Stack in an interactive environment
|
||||
- **Demo end-to-end application code** to help users get started building their own applications
|
||||
- **Provide a UI** to help users inspect and understand Llama Stack API providers and resources
|
||||
|
||||
## Key Features
|
||||
|
||||
### Interactive Playground Pages
|
||||
|
||||
The playground provides interactive pages for users to explore Llama Stack API capabilities:
|
||||
|
||||
#### Chatbot Interface
|
||||
|
||||
<video
|
||||
controls
|
||||
autoPlay
|
||||
playsInline
|
||||
muted
|
||||
loop
|
||||
style={{width: '100%'}}
|
||||
>
|
||||
<source src="https://github.com/user-attachments/assets/8d2ef802-5812-4a28-96e1-316038c84cbf" type="video/mp4" />
|
||||
Your browser does not support the video tag.
|
||||
</video>
|
||||
|
||||
<Tabs>
|
||||
<TabItem value="chat" label="Chat">
|
||||
|
||||
**Simple Chat Interface**
|
||||
- Chat directly with Llama models through an intuitive interface
|
||||
- Uses the `/chat/completions` streaming API under the hood
|
||||
- Real-time message streaming for responsive interactions
|
||||
- Perfect for testing model capabilities and prompt engineering
|
||||
|
||||
</TabItem>
|
||||
<TabItem value="rag" label="RAG Chat">
|
||||
|
||||
**Document-Aware Conversations**
|
||||
- Upload documents to create memory banks
|
||||
- Chat with a RAG-enabled agent that can query your documents
|
||||
- Uses Llama Stack's `/agents` API to create and manage RAG sessions
|
||||
- Ideal for exploring knowledge-enhanced AI applications
|
||||
|
||||
</TabItem>
|
||||
</Tabs>
|
||||
|
||||
#### Evaluation Interface
|
||||
|
||||
<video
|
||||
controls
|
||||
autoPlay
|
||||
playsInline
|
||||
muted
|
||||
loop
|
||||
style={{width: '100%'}}
|
||||
>
|
||||
<source src="https://github.com/user-attachments/assets/6cc1659f-eba4-49ca-a0a5-7c243557b4f5" type="video/mp4" />
|
||||
Your browser does not support the video tag.
|
||||
</video>
|
||||
|
||||
<Tabs>
|
||||
<TabItem value="scoring" label="Scoring Evaluations">
|
||||
|
||||
**Custom Dataset Evaluation**
|
||||
- Upload your own evaluation datasets
|
||||
- Run evaluations using available scoring functions
|
||||
- Uses Llama Stack's `/scoring` API for flexible evaluation workflows
|
||||
- Great for testing application performance on custom metrics
|
||||
|
||||
</TabItem>
|
||||
<TabItem value="benchmarks" label="Benchmark Evaluations">
|
||||
|
||||
<video
|
||||
controls
|
||||
autoPlay
|
||||
playsInline
|
||||
muted
|
||||
loop
|
||||
style={{width: '100%', marginBottom: '1rem'}}
|
||||
>
|
||||
<source src="https://github.com/user-attachments/assets/345845c7-2a2b-4095-960a-9ae40f6a93cf" type="video/mp4" />
|
||||
Your browser does not support the video tag.
|
||||
</video>
|
||||
|
||||
**Pre-registered Evaluation Tasks**
|
||||
- Evaluate models or agents on pre-defined tasks
|
||||
- Uses Llama Stack's `/eval` API for comprehensive evaluation
|
||||
- Combines datasets and scoring functions for standardized testing
|
||||
|
||||
**Setup Requirements:**
|
||||
Register evaluation datasets and benchmarks first:
|
||||
|
||||
```bash
|
||||
# Register evaluation dataset
|
||||
llama-stack-client datasets register \
|
||||
--dataset-id "mmlu" \
|
||||
--provider-id "huggingface" \
|
||||
--url "https://huggingface.co/datasets/llamastack/evals" \
|
||||
--metadata '{"path": "llamastack/evals", "name": "evals__mmlu__details", "split": "train"}' \
|
||||
--schema '{"input_query": {"type": "string"}, "expected_answer": {"type": "string"}, "chat_completion_input": {"type": "string"}}'
|
||||
|
||||
# Register benchmark task
|
||||
llama-stack-client benchmarks register \
|
||||
--eval-task-id meta-reference-mmlu \
|
||||
--provider-id meta-reference \
|
||||
--dataset-id mmlu \
|
||||
--scoring-functions basic::regex_parser_multiple_choice_answer
|
||||
```
|
||||
|
||||
</TabItem>
|
||||
</Tabs>
|
||||
|
||||
#### Inspection Interface
|
||||
|
||||
<video
|
||||
controls
|
||||
autoPlay
|
||||
playsInline
|
||||
muted
|
||||
loop
|
||||
style={{width: '100%'}}
|
||||
>
|
||||
<source src="https://github.com/user-attachments/assets/01d52b2d-92af-4e3a-b623-a9b8ba22ba99" type="video/mp4" />
|
||||
Your browser does not support the video tag.
|
||||
</video>
|
||||
|
||||
<Tabs>
|
||||
<TabItem value="providers" label="API Providers">
|
||||
|
||||
**Provider Management**
|
||||
- Inspect available Llama Stack API providers
|
||||
- View provider configurations and capabilities
|
||||
- Uses the `/providers` API for real-time provider information
|
||||
- Essential for understanding your deployment's capabilities
|
||||
|
||||
</TabItem>
|
||||
<TabItem value="resources" label="API Resources">
|
||||
|
||||
**Resource Exploration**
|
||||
- Inspect Llama Stack API resources including:
|
||||
- **Models**: Available language models
|
||||
- **Datasets**: Registered evaluation datasets
|
||||
- **Memory Banks**: Vector databases and knowledge stores
|
||||
- **Benchmarks**: Evaluation tasks and scoring functions
|
||||
- **Shields**: Safety and content moderation tools
|
||||
- Uses `/<resources>/list` APIs for comprehensive resource visibility
|
||||
- For detailed information about resources, see [Core Concepts](/docs/concepts)
|
||||
|
||||
</TabItem>
|
||||
</Tabs>
|
||||
|
||||
## Getting Started
|
||||
|
||||
### Quick Start Guide
|
||||
|
||||
<Tabs>
|
||||
<TabItem value="setup" label="Setup">
|
||||
|
||||
**1. Start the Llama Stack API Server**
|
||||
|
||||
```bash
|
||||
llama stack list-deps together | xargs -L1 uv pip install
|
||||
llama stack run together
|
||||
```
|
||||
|
||||
**2. Start the Streamlit UI**
|
||||
|
||||
```bash
|
||||
# Launch the playground interface
|
||||
uv run --with ".[ui]" streamlit run llama_stack.core/ui/app.py
|
||||
```
|
||||
|
||||
</TabItem>
|
||||
<TabItem value="usage" label="Usage Tips">
|
||||
|
||||
**Making the Most of the Playground:**
|
||||
|
||||
- **Start with Chat**: Test basic model interactions and prompt engineering
|
||||
- **Explore RAG**: Upload sample documents to see knowledge-enhanced responses
|
||||
- **Try Evaluations**: Use the scoring interface to understand evaluation metrics
|
||||
- **Inspect Resources**: Check what providers and resources are available
|
||||
- **Experiment with Settings**: Adjust parameters to see how they affect results
|
||||
|
||||
</TabItem>
|
||||
</Tabs>
|
||||
|
||||
### Available Distributions
|
||||
|
||||
The playground works with any Llama Stack distribution. Popular options include:
|
||||
|
||||
<Tabs>
|
||||
<TabItem value="together" label="Together AI">
|
||||
|
||||
```bash
|
||||
llama stack list-deps together | xargs -L1 uv pip install
|
||||
llama stack run together
|
||||
```
|
||||
|
||||
**Features:**
|
||||
- Cloud-hosted models
|
||||
- Fast inference
|
||||
- Multiple model options
|
||||
|
||||
</TabItem>
|
||||
<TabItem value="ollama" label="Ollama (Local)">
|
||||
|
||||
```bash
|
||||
llama stack list-deps ollama | xargs -L1 uv pip install
|
||||
llama stack run ollama
|
||||
```
|
||||
|
||||
**Features:**
|
||||
- Local model execution
|
||||
- Privacy-focused
|
||||
- No internet required
|
||||
|
||||
</TabItem>
|
||||
<TabItem value="meta-reference" label="Meta Reference">
|
||||
|
||||
```bash
|
||||
llama stack list-deps meta-reference | xargs -L1 uv pip install
|
||||
llama stack run meta-reference
|
||||
```
|
||||
|
||||
**Features:**
|
||||
- Reference implementation
|
||||
- All API features available
|
||||
- Best for development
|
||||
|
||||
</TabItem>
|
||||
</Tabs>
|
||||
|
||||
## Use Cases & Examples
|
||||
|
||||
### Educational Use Cases
|
||||
- **Learning Llama Stack**: Hands-on exploration of API capabilities
|
||||
- **Prompt Engineering**: Interactive testing of different prompting strategies
|
||||
- **RAG Experimentation**: Understanding how document retrieval affects responses
|
||||
- **Evaluation Understanding**: See how different metrics evaluate model performance
|
||||
|
||||
### Development Use Cases
|
||||
- **Prototype Testing**: Quick validation of application concepts
|
||||
- **API Exploration**: Understanding available endpoints and parameters
|
||||
- **Integration Planning**: Seeing how different components work together
|
||||
- **Demo Creation**: Showcasing Llama Stack capabilities to stakeholders
|
||||
|
||||
### Research Use Cases
|
||||
- **Model Comparison**: Side-by-side testing of different models
|
||||
- **Evaluation Design**: Understanding how scoring functions work
|
||||
- **Safety Testing**: Exploring shield effectiveness with different inputs
|
||||
- **Performance Analysis**: Measuring model behavior across different scenarios
|
||||
|
||||
## Best Practices
|
||||
|
||||
### 🚀 **Getting Started**
|
||||
- Begin with simple chat interactions to understand basic functionality
|
||||
- Gradually explore more advanced features like RAG and evaluations
|
||||
- Use the inspection tools to understand your deployment's capabilities
|
||||
|
||||
### 🔧 **Development Workflow**
|
||||
- Use the playground to prototype before writing application code
|
||||
- Test different parameter settings interactively
|
||||
- Validate evaluation approaches before implementing them programmatically
|
||||
|
||||
### 📊 **Evaluation & Testing**
|
||||
- Start with simple scoring functions before trying complex evaluations
|
||||
- Use the playground to understand evaluation results before automation
|
||||
- Test safety features with various input types
|
||||
|
||||
### 🎯 **Production Preparation**
|
||||
- Use playground insights to inform your production API usage
|
||||
- Test edge cases and error conditions interactively
|
||||
- Validate resource configurations before deployment
|
||||
|
||||
## Related Resources
|
||||
|
||||
- **[Getting Started Guide](../getting_started/quickstart)** - Complete setup and introduction
|
||||
- **[Core Concepts](/docs/concepts)** - Understanding Llama Stack fundamentals
|
||||
- **[Agents](./agent)** - Building intelligent agents
|
||||
- **[RAG (Retrieval Augmented Generation)](./rag)** - Knowledge-enhanced applications
|
||||
- **[Evaluations](./evals)** - Comprehensive evaluation framework
|
||||
- **[API Reference](/docs/api/llama-stack-specification)** - Complete API documentation
|
||||
|
|
@ -1,5 +1,5 @@
|
|||
---
|
||||
description: "AWS Bedrock inference provider for accessing various AI models through AWS's managed service."
|
||||
description: "AWS Bedrock inference provider using OpenAI compatible endpoint."
|
||||
sidebar_label: Remote - Bedrock
|
||||
title: remote::bedrock
|
||||
---
|
||||
|
|
@ -8,7 +8,7 @@ title: remote::bedrock
|
|||
|
||||
## Description
|
||||
|
||||
AWS Bedrock inference provider for accessing various AI models through AWS's managed service.
|
||||
AWS Bedrock inference provider using OpenAI compatible endpoint.
|
||||
|
||||
## Configuration
|
||||
|
||||
|
|
@ -16,19 +16,12 @@ AWS Bedrock inference provider for accessing various AI models through AWS's man
|
|||
|-------|------|----------|---------|-------------|
|
||||
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
|
||||
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
|
||||
| `aws_access_key_id` | `str \| None` | No | | The AWS access key to use. Default use environment variable: AWS_ACCESS_KEY_ID |
|
||||
| `aws_secret_access_key` | `str \| None` | No | | The AWS secret access key to use. Default use environment variable: AWS_SECRET_ACCESS_KEY |
|
||||
| `aws_session_token` | `str \| None` | No | | The AWS session token to use. Default use environment variable: AWS_SESSION_TOKEN |
|
||||
| `region_name` | `str \| None` | No | | The default AWS Region to use, for example, us-west-1 or us-west-2.Default use environment variable: AWS_DEFAULT_REGION |
|
||||
| `profile_name` | `str \| None` | No | | The profile name that contains credentials to use.Default use environment variable: AWS_PROFILE |
|
||||
| `total_max_attempts` | `int \| None` | No | | An integer representing the maximum number of attempts that will be made for a single request, including the initial attempt. Default use environment variable: AWS_MAX_ATTEMPTS |
|
||||
| `retry_mode` | `str \| None` | No | | A string representing the type of retries Boto3 will perform.Default use environment variable: AWS_RETRY_MODE |
|
||||
| `connect_timeout` | `float \| None` | No | 60.0 | The time in seconds till a timeout exception is thrown when attempting to make a connection. The default is 60 seconds. |
|
||||
| `read_timeout` | `float \| None` | No | 60.0 | The time in seconds till a timeout exception is thrown when attempting to read from a connection.The default is 60 seconds. |
|
||||
| `session_ttl` | `int \| None` | No | 3600 | The time in seconds till a session expires. The default is 3600 seconds (1 hour). |
|
||||
| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
|
||||
| `region_name` | `<class 'str'>` | No | us-east-2 | AWS Region for the Bedrock Runtime endpoint |
|
||||
|
||||
## Sample Configuration
|
||||
|
||||
```yaml
|
||||
{}
|
||||
api_key: ${env.AWS_BEDROCK_API_KEY:=}
|
||||
region_name: ${env.AWS_DEFAULT_REGION:=us-east-2}
|
||||
```
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue