
llama-stack-api

API and provider specifications for Llama Stack: a lightweight package with protocol definitions and provider specs.

Overview

llama-stack-api is a minimal-dependency package that contains (see the import sketch after this list):

  • API Protocol Definitions: Type-safe protocol definitions for all Llama Stack APIs (inference, agents, safety, etc.)
  • Provider Specifications: Provider spec definitions for building custom providers
  • Data Types: Shared data types and models used across the Llama Stack ecosystem
  • Type Utilities: Strong typing utilities and schema validation
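
As a quick orientation, the imports below sketch how those pieces map onto modules. The first three paths match the usage example later in this README; the schema_utils import is an assumption based on the package layout.

# Illustrative import sketch; see the usage example below for a fuller picture.
from llama_stack_api.inference import Inference                     # API protocol definition
from llama_stack_api.providers.datatypes import InlineProviderSpec   # provider specification
from llama_stack_api.datatypes import Api                            # shared data types
from llama_stack_api import schema_utils                             # typing/schema utilities (assumed path)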

What This Package Does NOT Include

The following are provided by the main llama-stack package instead:

  • Server implementation
  • Provider implementations
  • CLI tools
  • Runtime orchestration

Use Cases

This package is designed for:

  1. Third-party Provider Developers: Build custom providers without depending on the full Llama Stack server
  2. Client Library Authors: Use type definitions without server dependencies
  3. Documentation Generation: Generate API docs from protocol definitions
  4. Type Checking: Validate implementations against the official specs (see the sketch after this list)
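
For the type-checking use case, the snippet below is a minimal sketch of a runtime conformance check. The check_conforms helper is hypothetical; in practice a static checker such as mypy does this job at lint time.

from llama_stack_api.inference import Inference


def check_conforms(impl: type) -> None:
    # Hypothetical helper: flag any public attribute declared on the
    # Inference protocol that the implementation class does not define.
    missing = [
        name
        for name in dir(Inference)
        if not name.startswith("_") and not hasattr(impl, name)
    ]
    if missing:
        raise TypeError(f"{impl.__name__} is missing: {missing}")

Calling check_conforms(MyInferenceProvider) on the class from the usage example below would then raise if the implementation drifted out of sync with the protocol.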

Installation

pip install llama-stack-api

Or with uv:

uv pip install llama-stack-api

Dependencies

Minimal dependencies:

  • pydantic>=2.11.9 - For data validation and serialization
  • jsonschema - For JSON schema utilities

Versioning

This package follows semantic versioning independently from the main llama-stack package:

  • Patch versions (0.1.x): Documentation, internal improvements
  • Minor versions (0.x.0): New APIs, backward-compatible changes
  • Major versions (x.0.0): Breaking changes to existing APIs

Current version: 0.4.0.dev0
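
Given this policy, and since the package is pre-1.0, downstream projects may want to pin to a single minor series. The constraint below is illustrative, not a maintainer recommendation:

pip install "llama-stack-api>=0.4,<0.5"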

Usage Example

from llama_stack_api.inference import Inference, ChatCompletionRequest
from llama_stack_api.providers.datatypes import ProviderSpec, InlineProviderSpec
from llama_stack_api.datatypes import Api


# Use protocol definitions for type checking
class MyInferenceProvider(Inference):
    async def chat_completion(self, request: ChatCompletionRequest):
        # Your implementation
        pass


# Define provider specifications
my_provider_spec = InlineProviderSpec(
    api=Api.inference,
    provider_type="inline::my-provider",
    pip_packages=["my-dependencies"],
    module="my_package.providers.inference",
    config_class="my_package.providers.inference.MyConfig",
)
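
Note that the protocol classes carry no behavior: the snippet above type-checks but performs no inference, and my_provider_spec only takes effect once the main llama-stack package resolves it at runtime (see the next section).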

Relationship to llama-stack

The main llama-stack package depends on llama-stack-api and provides:

  • Full server implementation
  • Built-in provider implementations
  • CLI tools for running and managing stacks
  • Runtime provider resolution and orchestration

Contributing

See the main Llama Stack repository for contribution guidelines.

License

MIT License - see LICENSE file for details.