From 2a1a8133087017f4d84f0ec22a4d4874ab9cf8da Mon Sep 17 00:00:00 2001
From: ehhuang
Date: Fri, 24 Oct 2025 13:57:28 -0700
Subject: [PATCH] chore: update docs for telemetry api removal (#3900)

# What does this PR do?
Telemetry is no longer an API/provider.

## Test Plan

---
 README.md | 58 +++++++++----------
 docs/docs/building_applications/safety.mdx | 1 -
 docs/docs/concepts/apis/index.mdx | 1 -
 docs/docs/distributions/configuration.mdx | 21 +------
 .../remote_hosted_distro/index.mdx | 6 +-
 .../remote_hosted_distro/watsonx.md | 1 -
 .../self_hosted_distro/dell-tgi.md | 6 +-
 .../distributions/self_hosted_distro/dell.md | 1 -
 .../self_hosted_distro/passthrough.md | 1 -
 .../self_hosted_distro/starter.md | 3 +-
 docs/docs/index.mdx | 2 +-
 docs/docs/providers/index.mdx | 1 -
 docs/docs/providers/telemetry/index.mdx | 10 ----
 .../telemetry/inline_meta-reference.mdx | 27 ---------
 .../llama_stack_client_cli_reference.md | 2 -
 15 files changed, 39 insertions(+), 102 deletions(-)
 delete mode 100644 docs/docs/providers/telemetry/index.mdx
 delete mode 100644 docs/docs/providers/telemetry/inline_meta-reference.mdx

diff --git a/README.md b/README.md
index bb8587855..639e7280d 100644
--- a/README.md
+++ b/README.md
@@ -99,7 +99,7 @@ curl -LsSf https://github.com/llamastack/llama-stack/raw/main/scripts/install.sh
Llama Stack standardizes the core building blocks that simplify AI application development. It codifies best practices across the Llama ecosystem. More specifically, it provides

-- **Unified API layer** for Inference, RAG, Agents, Tools, Safety, Evals, and Telemetry.
+- **Unified API layer** for Inference, RAG, Agents, Tools, Safety, and Evals.
- **Plugin architecture** to support the rich ecosystem of different API implementations in various environments, including local development, on-premises, cloud, and mobile.
- **Prepackaged verified distributions** which offer a one-stop solution for developers to get started quickly and reliably in any environment.
- **Multiple developer interfaces** like CLI and SDKs for Python, Typescript, iOS, and Android.
@@ -125,34 +125,34 @@ By reducing friction and complexity, Llama Stack empowers developers to focus on
Here is a list of the various API providers and available distributions that can help developers get started easily with Llama Stack.
Please check out the [full list](https://llamastack.github.io/docs/providers)
-| API Provider Builder | Environments | Agents | Inference | VectorIO | Safety | Telemetry | Post Training | Eval | DatasetIO |
-|:--------------------:|:------------:|:------:|:---------:|:--------:|:------:|:---------:|:-------------:|:----:|:--------:|
-| Meta Reference | Single Node | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| SambaNova | Hosted | | ✅ | | ✅ | | | | |
-| Cerebras | Hosted | | ✅ | | | | | | |
-| Fireworks | Hosted | ✅ | ✅ | ✅ | | | | | |
-| AWS Bedrock | Hosted | | ✅ | | ✅ | | | | |
-| Together | Hosted | ✅ | ✅ | | ✅ | | | | |
-| Groq | Hosted | | ✅ | | | | | | |
-| Ollama | Single Node | | ✅ | | | | | | |
-| TGI | Hosted/Single Node | | ✅ | | | | | | |
-| NVIDIA NIM | Hosted/Single Node | | ✅ | | ✅ | | | | |
-| ChromaDB | Hosted/Single Node | | | ✅ | | | | | |
-| Milvus | Hosted/Single Node | | | ✅ | | | | | |
-| Qdrant | Hosted/Single Node | | | ✅ | | | | | |
-| Weaviate | Hosted/Single Node | | | ✅ | | | | | |
-| SQLite-vec | Single Node | | | ✅ | | | | | |
-| PG Vector | Single Node | | | ✅ | | | | | |
-| PyTorch ExecuTorch | On-device iOS | ✅ | ✅ | | | | | | |
-| vLLM | Single Node | | ✅ | | | | | | |
-| OpenAI | Hosted | | ✅ | | | | | | |
-| Anthropic | Hosted | | ✅ | | | | | | |
-| Gemini | Hosted | | ✅ | | | | | | |
-| WatsonX | Hosted | | ✅ | | | | | | |
-| HuggingFace | Single Node | | | | | | ✅ | | ✅ |
-| TorchTune | Single Node | | | | | | ✅ | | |
-| NVIDIA NEMO | Hosted | | ✅ | ✅ | | | ✅ | ✅ | ✅ |
-| NVIDIA | Hosted | | | | | | ✅ | ✅ | ✅ |
+| API Provider Builder | Environments | Agents | Inference | VectorIO | Safety | Post Training | Eval | DatasetIO |
+|:--------------------:|:------------:|:------:|:---------:|:--------:|:------:|:-------------:|:----:|:--------:|
+| Meta Reference | Single Node | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| SambaNova | Hosted | | ✅ | | ✅ | | | |
+| Cerebras | Hosted | | ✅ | | | | | |
+| Fireworks | Hosted | ✅ | ✅ | ✅ | | | | |
+| AWS Bedrock | Hosted | | ✅ | | ✅ | | | |
+| Together | Hosted | ✅ | ✅ | | ✅ | | | |
+| Groq | Hosted | | ✅ | | | | | |
+| Ollama | Single Node | | ✅ | | | | | |
+| TGI | Hosted/Single Node | | ✅ | | | | | |
+| NVIDIA NIM | Hosted/Single Node | | ✅ | | ✅ | | | |
+| ChromaDB | Hosted/Single Node | | | ✅ | | | | |
+| Milvus | Hosted/Single Node | | | ✅ | | | | |
+| Qdrant | Hosted/Single Node | | | ✅ | | | | |
+| Weaviate | Hosted/Single Node | | | ✅ | | | | |
+| SQLite-vec | Single Node | | | ✅ | | | | |
+| PG Vector | Single Node | | | ✅ | | | | |
+| PyTorch ExecuTorch | On-device iOS | ✅ | ✅ | | | | | |
+| vLLM | Single Node | | ✅ | | | | | |
+| OpenAI | Hosted | | ✅ | | | | | |
+| Anthropic | Hosted | | ✅ | | | | | |
+| Gemini | Hosted | | ✅ | | | | | |
+| WatsonX | Hosted | | ✅ | | | | | |
+| HuggingFace | Single Node | | | | | ✅ | | ✅ |
+| TorchTune | Single Node | | | | | ✅ | | |
+| NVIDIA NEMO | Hosted | | ✅ | ✅ | | ✅ | ✅ | ✅ |
+| NVIDIA | Hosted | | | | | ✅ | ✅ | ✅ |

> **Note**: Additional providers are available through external packages. See [External Providers](https://llamastack.github.io/docs/providers/external) documentation.
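As a minimal sketch of what the unified API layer means in practice, assuming a distribution is already running on the default local port (8321), the same `llama-stack-client` commands work regardless of which providers from the table above back each API:

```bash
# Sketch: point the CLI at a locally running distribution and list the
# models its configured providers expose. The endpoint (default port 8321)
# is an assumption; adjust it to wherever your server is listening.
llama-stack-client configure --endpoint http://localhost:8321
llama-stack-client models list
```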
diff --git a/docs/docs/building_applications/safety.mdx b/docs/docs/building_applications/safety.mdx index 16fe5f6f8..998c02b20 100644 --- a/docs/docs/building_applications/safety.mdx +++ b/docs/docs/building_applications/safety.mdx @@ -391,5 +391,4 @@ client.shields.register( - **[Agents](./agent)** - Integrating safety shields with intelligent agents - **[Agent Execution Loop](./agent_execution_loop)** - Understanding safety in the execution flow - **[Evaluations](./evals)** - Evaluating safety shield effectiveness -- **[Telemetry](./telemetry)** - Monitoring safety violations and metrics - **[Llama Guard Documentation](https://github.com/meta-llama/PurpleLlama/tree/main/Llama-Guard3)** - Advanced safety model details diff --git a/docs/docs/concepts/apis/index.mdx b/docs/docs/concepts/apis/index.mdx index 6e699d137..11b8b2e08 100644 --- a/docs/docs/concepts/apis/index.mdx +++ b/docs/docs/concepts/apis/index.mdx @@ -16,7 +16,6 @@ A Llama Stack API is described as a collection of REST endpoints. We currently s - **Scoring**: evaluate outputs of the system - **Eval**: generate outputs (via Inference or Agents) and perform scoring - **VectorIO**: perform operations on vector stores, such as adding documents, searching, and deleting documents -- **Telemetry**: collect telemetry data from the system - **Post Training**: fine-tune a model - **Tool Runtime**: interact with various tools and protocols - **Responses**: generate responses from an LLM using this OpenAI compatible API. diff --git a/docs/docs/distributions/configuration.mdx b/docs/docs/distributions/configuration.mdx index bf3156865..910a0ed05 100644 --- a/docs/docs/distributions/configuration.mdx +++ b/docs/docs/distributions/configuration.mdx @@ -21,7 +21,6 @@ apis: - inference - vector_io - safety -- telemetry providers: inference: - provider_id: ollama @@ -51,10 +50,6 @@ providers: responses: backend: sql_default table_name: responses - telemetry: - - provider_id: meta-reference - provider_type: inline::meta-reference - config: {} storage: backends: kv_default: @@ -92,7 +87,6 @@ apis: - inference - vector_io - safety -- telemetry ``` ## Providers @@ -589,24 +583,13 @@ created by users sharing a team with them: In addition to resource-based access control, Llama Stack supports endpoint-level authorization using OAuth 2.0 style scopes. When authentication is enabled, specific API endpoints require users to have particular scopes in their authentication token. 
-**Scope-Gated APIs:**
-The following APIs are currently gated by scopes:
-
-- **Telemetry API** (scope: `telemetry.read`):
-  - `POST /telemetry/traces` - Query traces
-  - `GET /telemetry/traces/{trace_id}` - Get trace by ID
-  - `GET /telemetry/traces/{trace_id}/spans/{span_id}` - Get span by ID
-  - `POST /telemetry/spans/{span_id}/tree` - Get span tree
-  - `POST /telemetry/spans` - Query spans
-  - `POST /telemetry/metrics/{metric_name}` - Query metrics
-
**Authentication Configuration:**

For **JWT/OAuth2 providers**, scopes should be included in the JWT's claims:
```json
{
  "sub": "user123",
-  "scope": "telemetry.read",
+  "scope": "",
  "aud": "llama-stack"
}
```
@@ -616,7 +599,7 @@ For **custom authentication providers**, the endpoint must return user attribute
{
  "principal": "user123",
  "attributes": {
-    "scopes": ["telemetry.read"]
+    "scopes": [""]
  }
}
```
diff --git a/docs/docs/distributions/remote_hosted_distro/index.mdx b/docs/docs/distributions/remote_hosted_distro/index.mdx
index ef5a83d8a..7fa9d1bf6 100644
--- a/docs/docs/distributions/remote_hosted_distro/index.mdx
+++ b/docs/docs/distributions/remote_hosted_distro/index.mdx
@@ -2,10 +2,10 @@

Remote-Hosted distributions are available endpoints serving Llama Stack API that you can directly connect to.

-| Distribution | Endpoint | Inference | Agents | Memory | Safety | Telemetry |
-|-------------|----------|-----------|---------|---------|---------|------------|
-| Together | [https://llama-stack.together.ai](https://llama-stack.together.ai) | remote::together | meta-reference | remote::weaviate | meta-reference | meta-reference |
-| Fireworks | [https://llamastack-preview.fireworks.ai](https://llamastack-preview.fireworks.ai) | remote::fireworks | meta-reference | remote::weaviate | meta-reference | meta-reference |
+| Distribution | Endpoint | Inference | Agents | Memory | Safety |
+|-------------|----------|-----------|---------|---------|---------|
+| Together | [https://llama-stack.together.ai](https://llama-stack.together.ai) | remote::together | meta-reference | remote::weaviate | meta-reference |
+| Fireworks | [https://llamastack-preview.fireworks.ai](https://llamastack-preview.fireworks.ai) | remote::fireworks | meta-reference | remote::weaviate | meta-reference |

## Connecting to Remote-Hosted Distributions
diff --git a/docs/docs/distributions/remote_hosted_distro/watsonx.md b/docs/docs/distributions/remote_hosted_distro/watsonx.md
index 5add678f3..2ec7fe965 100644
--- a/docs/docs/distributions/remote_hosted_distro/watsonx.md
+++ b/docs/docs/distributions/remote_hosted_distro/watsonx.md
@@ -21,7 +21,6 @@ The `llamastack/distribution-watsonx` distribution consists of the following pro
| inference | `remote::watsonx`, `inline::sentence-transformers` |
| safety | `inline::llama-guard` |
| scoring | `inline::basic`, `inline::llm-as-judge`, `inline::braintrust` |
-| telemetry | `inline::meta-reference` |
| tool_runtime | `remote::brave-search`, `remote::tavily-search`, `inline::rag-runtime`, `remote::model-context-protocol` |
| vector_io | `inline::faiss` |

diff --git a/docs/docs/distributions/self_hosted_distro/dell-tgi.md b/docs/docs/distributions/self_hosted_distro/dell-tgi.md
index 5fca297b0..a49bab4e6 100644
--- a/docs/docs/distributions/self_hosted_distro/dell-tgi.md
+++ b/docs/docs/distributions/self_hosted_distro/dell-tgi.md
@@ -13,9 +13,9 @@ self
The `llamastack/distribution-tgi` distribution consists of the following provider configurations.
-| **API** | **Inference** | **Agents** | **Memory** | **Safety** | **Telemetry** |
-|----------------- |--------------- |---------------- |-------------------------------------------------- |---------------- |---------------- |
-| **Provider(s)** | remote::tgi | meta-reference | meta-reference, remote::pgvector, remote::chroma | meta-reference | meta-reference |
+| **API** | **Inference** | **Agents** | **Memory** | **Safety** |
+|----------------- |--------------- |---------------- |-------------------------------------------------- |---------------- |
+| **Provider(s)** | remote::tgi | meta-reference | meta-reference, remote::pgvector, remote::chroma | meta-reference |

The only difference vs. the `tgi` distribution is that it runs the Dell-TGI server for inference.
diff --git a/docs/docs/distributions/self_hosted_distro/dell.md b/docs/docs/distributions/self_hosted_distro/dell.md
index 040eb4a12..e30df5164 100644
--- a/docs/docs/distributions/self_hosted_distro/dell.md
+++ b/docs/docs/distributions/self_hosted_distro/dell.md
@@ -22,7 +22,6 @@ The `llamastack/distribution-dell` distribution consists of the following provid
| inference | `remote::tgi`, `inline::sentence-transformers` |
| safety | `inline::llama-guard` |
| scoring | `inline::basic`, `inline::llm-as-judge`, `inline::braintrust` |
-| telemetry | `inline::meta-reference` |
| tool_runtime | `remote::brave-search`, `remote::tavily-search`, `inline::rag-runtime` |
| vector_io | `inline::faiss`, `remote::chromadb`, `remote::pgvector` |

diff --git a/docs/docs/distributions/self_hosted_distro/passthrough.md b/docs/docs/distributions/self_hosted_distro/passthrough.md
index 39f076be4..13e78a1ee 100644
--- a/docs/docs/distributions/self_hosted_distro/passthrough.md
+++ b/docs/docs/distributions/self_hosted_distro/passthrough.md
@@ -21,7 +21,6 @@ The `llamastack/distribution-passthrough` distribution consists of the following
| inference | `remote::passthrough`, `inline::sentence-transformers` |
| safety | `inline::llama-guard` |
| scoring | `inline::basic`, `inline::llm-as-judge`, `inline::braintrust` |
-| telemetry | `inline::meta-reference` |
| tool_runtime | `remote::brave-search`, `remote::tavily-search`, `remote::wolfram-alpha`, `inline::rag-runtime`, `remote::model-context-protocol` |
| vector_io | `inline::faiss`, `remote::chromadb`, `remote::pgvector` |

diff --git a/docs/docs/distributions/self_hosted_distro/starter.md b/docs/docs/distributions/self_hosted_distro/starter.md
index e04c5874b..f6786a95c 100644
--- a/docs/docs/distributions/self_hosted_distro/starter.md
+++ b/docs/docs/distributions/self_hosted_distro/starter.md
@@ -26,7 +26,6 @@ The starter distribution consists of the following provider configurations:
| inference | `remote::openai`, `remote::fireworks`, `remote::together`, `remote::ollama`, `remote::anthropic`, `remote::gemini`, `remote::groq`, `remote::sambanova`, `remote::vllm`, `remote::tgi`, `remote::cerebras`, `remote::llama-openai-compat`, `remote::nvidia`, `remote::hf::serverless`, `remote::hf::endpoint`, `inline::sentence-transformers` |
| safety | `inline::llama-guard` |
| scoring | `inline::basic`, `inline::llm-as-judge`, `inline::braintrust` |
-| telemetry | `inline::meta-reference` |
| tool_runtime | `remote::brave-search`, `remote::tavily-search`, `inline::rag-runtime`, `remote::model-context-protocol` |
| vector_io | `inline::faiss`, `inline::sqlite-vec`, `inline::milvus`, `remote::chromadb`, `remote::pgvector` |

@@ -119,7 +118,7 @@ The following environment variables can be configured:
### Telemetry Configuration

- `OTEL_SERVICE_NAME`: OpenTelemetry service name
-- `TELEMETRY_SINKS`: Telemetry sinks (default: `[]`)
+- `OTEL_EXPORTER_OTLP_ENDPOINT`: OpenTelemetry collector endpoint URL

## Enabling Providers
diff --git a/docs/docs/index.mdx b/docs/docs/index.mdx
index 80b288872..8c17283f9 100644
--- a/docs/docs/index.mdx
+++ b/docs/docs/index.mdx
@@ -29,7 +29,7 @@ Llama Stack is now available! See the [release notes](https://github.com/llamast

Llama Stack defines and standardizes the core building blocks needed to bring generative AI applications to market. It provides a unified set of APIs with implementations from leading service providers, enabling seamless transitions between development and production environments. More specifically, it provides:

-- **Unified API layer** for Inference, RAG, Agents, Tools, Safety, Evals, and Telemetry.
+- **Unified API layer** for Inference, RAG, Agents, Tools, Safety, and Evals.
- **Plugin architecture** to support the rich ecosystem of implementations of the different APIs in different environments like local development, on-premises, cloud, and mobile.
- **Prepackaged verified distributions** which offer a one-stop solution for developers to get started quickly and reliably in any environment
- **Multiple developer interfaces** like CLI and SDKs for Python, Node, iOS, and Android
diff --git a/docs/docs/providers/index.mdx b/docs/docs/providers/index.mdx
index 2ca2b2697..bfc16b29a 100644
--- a/docs/docs/providers/index.mdx
+++ b/docs/docs/providers/index.mdx
@@ -26,7 +26,6 @@ Importantly, Llama Stack always strives to provide at least one fully inline pro
- **[Agents](agents/index.mdx)** - Agentic system providers
- **[DatasetIO](datasetio/index.mdx)** - Dataset and data loader providers
- **[Safety](safety/index.mdx)** - Content moderation and safety providers
-- **[Telemetry](telemetry/index.mdx)** - Monitoring and observability providers
- **[Vector IO](vector_io/index.mdx)** - Vector database providers
- **[Tool Runtime](tool_runtime/index.mdx)** - Tool and protocol providers
- **[Files](files/index.mdx)** - File system and storage providers
diff --git a/docs/docs/providers/telemetry/index.mdx b/docs/docs/providers/telemetry/index.mdx
deleted file mode 100644
index 07190d625..000000000
--- a/docs/docs/providers/telemetry/index.mdx
+++ /dev/null
@@ -1,10 +0,0 @@
----
-sidebar_label: Telemetry
-title: Telemetry
----
-
-# Telemetry
-
-## Overview
-
-This section contains documentation for all available providers for the **telemetry** API.
diff --git a/docs/docs/providers/telemetry/inline_meta-reference.mdx b/docs/docs/providers/telemetry/inline_meta-reference.mdx
deleted file mode 100644
index d8b3157d1..000000000
--- a/docs/docs/providers/telemetry/inline_meta-reference.mdx
+++ /dev/null
@@ -1,27 +0,0 @@
----
-description: "Meta's reference implementation of telemetry and observability using OpenTelemetry."
-sidebar_label: Meta-Reference
-title: inline::meta-reference
----
-
-# inline::meta-reference
-
-## Description
-
-Meta's reference implementation of telemetry and observability using OpenTelemetry.
-
-## Configuration
-
-| Field | Type | Required | Default | Description |
-|-------|------|----------|---------|-------------|
-| `otel_exporter_otlp_endpoint` | `str \| None` | No | | The OpenTelemetry collector endpoint URL (base URL for traces, metrics, and logs). If not set, the SDK will use OTEL_EXPORTER_OTLP_ENDPOINT environment variable. |
-| `service_name` | `` | No | ​ | The service name to use for telemetry |
-| `sinks` | `list[inline.telemetry.meta_reference.config.TelemetrySink` | No | [] | List of telemetry sinks to enable (possible values: otel_trace, otel_metric, console) |
-
-## Sample Configuration
-
-```yaml
-service_name: "${env.OTEL_SERVICE_NAME:=\u200B}"
-sinks: ${env.TELEMETRY_SINKS:=}
-otel_exporter_otlp_endpoint: ${env.OTEL_EXPORTER_OTLP_ENDPOINT:=}
-```
diff --git a/docs/docs/references/llama_stack_client_cli_reference.md b/docs/docs/references/llama_stack_client_cli_reference.md
index a4321938a..fd87e7dbd 100644
--- a/docs/docs/references/llama_stack_client_cli_reference.md
+++ b/docs/docs/references/llama_stack_client_cli_reference.md
@@ -78,8 +78,6 @@ llama-stack-client providers list
+-----------+----------------+-----------------+
| agents | meta-reference | meta-reference |
+-----------+----------------+-----------------+
-| telemetry | meta-reference | meta-reference |
-+-----------+----------------+-----------------+
| safety | meta-reference | meta-reference |
+-----------+----------------+-----------------+
```
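With the telemetry provider block gone from run.yaml, observability is configured entirely through the standard OpenTelemetry environment variables documented above. A minimal sketch, assuming an OTLP-compatible collector on its default HTTP port (the endpoint value is illustrative):

```bash
# Sketch: enable tracing via standard OTel environment variables instead of
# the removed telemetry provider section. The endpoint below assumes a local
# OTLP collector listening on its default HTTP port.
export OTEL_SERVICE_NAME=llama-stack
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318

llama stack run starter
```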