---
title: APIs
description: Available REST APIs and planned capabilities in Llama Stack
sidebar_label: APIs
sidebar_position: 1
---

# APIs

A Llama Stack API is described as a collection of REST endpoints following OpenAI API standards. We currently support the following APIs:

- **Inference**: run inference with an LLM
- **Safety**: apply safety policies at the system level, not only the model level
- **Agents**: run multi-step agentic workflows with LLMs, with tool usage, memory (RAG), and more
- **DatasetIO**: interface with datasets and data loaders
- **Scoring**: evaluate outputs of the system
- **Eval**: generate outputs (via Inference or Agents) and perform scoring
- **VectorIO**: perform operations on vector stores, such as adding, searching, and deleting documents
- **Files**: manage file uploads, storage, and retrieval
- **Telemetry**: collect telemetry data from the system
- **Post Training**: fine-tune a model
- **Tool Runtime**: interact with various tools and protocols
- **Responses**: generate responses from an LLM

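To make the OpenAI-style surface of these APIs concrete, the sketch below maps a few of them onto the REST routes they resemble. The paths are illustrative assumptions based on OpenAI API conventions, not an authoritative Llama Stack route table; check your deployment's generated API reference for the real paths.

```python
# Illustrative only: a few of the APIs above, mapped onto the OpenAI-style
# REST routes they resemble. Exact paths depend on your deployment; treat
# these as assumptions, not a definitive Llama Stack route table.
OPENAI_STYLE_ROUTES = {
    "Inference": "POST /v1/chat/completions",
    "Files": "POST /v1/files",
    "VectorIO": "POST /v1/vector_stores",
    "Responses": "POST /v1/responses",
}

for api, route in OPENAI_STYLE_ROUTES.items():
    print(f"{api:10} -> {route}")
```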
We are working on adding a few more APIs to complete the application lifecycle. These will include:

- **Batch Inference**: run inference on a dataset of inputs
- **Batch Agents**: run agents on a dataset of inputs
- **Batches**: OpenAI-compatible batch management for inference

## OpenAI API Compatibility

We are working on adding OpenAI API compatibility to Llama Stack. This will allow you to use Llama Stack with OpenAI API clients and tools.

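As a minimal sketch of what OpenAI compatibility means in practice, the snippet below composes (but does not send) a chat completion request in the standard OpenAI shape. The base URL and model name are placeholders for illustration, not values taken from this document; substitute your own deployment's values.

```python
# A minimal sketch of an OpenAI-compatible chat completion request against a
# Llama Stack server. The base URL and model name are placeholders; the
# request is composed with the standard library only and is NOT sent here.
import json
import urllib.request

BASE_URL = "http://localhost:8321/v1"  # assumed local Llama Stack endpoint


def build_chat_request(model: str, user_message: str) -> urllib.request.Request:
    """Compose (but do not send) a chat completion request in the OpenAI shape."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("meta-llama/Llama-3.1-8B-Instruct", "Say hello.")
print(req.get_method(), req.full_url)
```

Because the request body follows the OpenAI schema, any OpenAI client library pointed at the server's base URL should be able to produce an equivalent request.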
|
### File Operations and Vector Store Integration
The Files API and Vector Store APIs work together through file operations, enabling automatic document processing and search. This integration implements the [OpenAI Vector Store Files API specification](https://platform.openai.com/docs/api-reference/vector-stores-files) and allows you to:

- Upload documents through the Files API
- Automatically process and chunk documents into searchable vectors
- Store processed content in vector databases, depending on which of [our providers](../../providers/index.mdx) are available
- Search through documents using natural language queries

For detailed information about this integration, see [File Operations and Vector Store Integration](../file_operations_vector_stores.md).
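The attach-and-search steps of this flow can be sketched as composed `(path, payload)` pairs in the style of the OpenAI Vector Store Files API. The base URL and IDs below are placeholders for illustration; in a real deployment these requests are sent over HTTP to a running Llama Stack server after the file has been uploaded via the Files API.

```python
# A sketch of the attach -> search portion of the flow described above,
# composed as (path, payload) pairs in the OpenAI Vector Store Files style.
# The base URL and the vs_/file- IDs are placeholders, not real values.
import json

BASE_URL = "http://localhost:8321/v1"  # assumed local Llama Stack endpoint


def attach_file(vector_store_id: str, file_id: str) -> tuple[str, dict]:
    """Attach an already-uploaded file to a vector store for chunking/indexing."""
    return (
        f"{BASE_URL}/vector_stores/{vector_store_id}/files",
        {"file_id": file_id},
    )


def search(vector_store_id: str, query: str) -> tuple[str, dict]:
    """Natural-language search over the processed document chunks."""
    return (
        f"{BASE_URL}/vector_stores/{vector_store_id}/search",
        {"query": query},
    )


attach_url, attach_body = attach_file("vs_demo", "file-demo")
search_url, search_body = search("vs_demo", "What does the report conclude?")
print(attach_url, json.dumps(attach_body))
```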