# llama-stack-api

API and Provider specifications for Llama Stack: a lightweight package with protocol definitions and provider specs.
## Overview
`llama-stack-api` is a minimal-dependency package that contains:
- API Protocol Definitions: Type-safe protocol definitions for all Llama Stack APIs (inference, agents, safety, etc.)
- Provider Specifications: Provider spec definitions for building custom providers
- Data Types: Shared data types and models used across the Llama Stack ecosystem
- Type Utilities: Strong typing utilities and schema validation
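
Because these are plain Python definitions, they can be imported and inspected without a running server. A minimal sketch, assuming `Api` (used in the usage example below) is a standard `Enum`:

```python
from llama_stack_api.datatypes import Api

# Enumerate the API surfaces the package defines (inference, agents, safety, ...).
# Assumption: Api is a standard Enum; members may vary by version.
for api in Api:
    print(api.value)
```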
## What This Package Does NOT Include

- Server implementation (see the `llama-stack` package)
- Provider implementations (see the `llama-stack` package)
- CLI tools (see the `llama-stack` package)
- Runtime orchestration (see the `llama-stack` package)
## Use Cases
This package is designed for:
- Third-party Provider Developers: Build custom providers without depending on the full Llama Stack server
- Client Library Authors: Use type definitions without server dependencies
- Documentation Generation: Generate API docs from protocol definitions
- Type Checking: Validate implementations against the official specs
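
For the type-checking use case, a minimal sketch: if the API classes are `typing.Protocol` definitions (as "protocol definitions" above suggests), a static checker such as mypy can verify a provider implementation without importing any server code. `register_provider` here is illustrative, not part of this package:

```python
from llama_stack_api.inference import Inference


def register_provider(impl: Inference) -> None:
    """Accept any object that structurally satisfies the Inference protocol."""
    ...
```

Running `mypy` over code like this checks every call site against the official spec without pulling in the `llama-stack` runtime.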
## Installation

```bash
pip install llama-stack-api
```

Or with uv:

```bash
uv pip install llama-stack-api
```
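
To confirm what got installed, standard packaging metadata works without importing the package:

```python
from importlib.metadata import version

# Read the installed version from packaging metadata (no package import needed)
print(version("llama-stack-api"))  # e.g. 0.4.0.dev0
```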
## Dependencies

Minimal dependencies:

- `pydantic>=2.11.9` - for data validation and serialization
- `jsonschema` - for JSON schema utilities
## Versioning

This package follows semantic versioning independently of the main `llama-stack` package:
- Patch versions (0.1.x): Documentation, internal improvements
- Minor versions (0.x.0): New APIs, backward-compatible changes
- Major versions (x.0.0): Breaking changes to existing APIs
Current version: 0.4.0.dev0
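
Under this scheme, a consumer that wants new backward-compatible APIs but no breaking changes can pin below the next major release; an illustrative constraint:

```bash
pip install "llama-stack-api>=0.4,<1.0"
```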
## Usage Example
```python
from llama_stack_api.inference import Inference, ChatCompletionRequest
from llama_stack_api.providers.datatypes import ProviderSpec, InlineProviderSpec
from llama_stack_api.datatypes import Api


# Use protocol definitions for type checking
class MyInferenceProvider(Inference):
    async def chat_completion(self, request: ChatCompletionRequest):
        # Your implementation
        pass


# Define provider specifications
my_provider_spec = InlineProviderSpec(
    api=Api.inference,
    provider_type="inline::my-provider",
    pip_packages=["my-dependencies"],
    module="my_package.providers.inference",
    config_class="my_package.providers.inference.MyConfig",
)
```
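
A note on the design: `module` and `config_class` are import paths given as strings, which suggests the server resolves providers lazily by import path. A provider package therefore needs only `llama-stack-api` at definition time; the `my_package` paths above are placeholders for your own package.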
## Relationship to llama-stack

The main `llama-stack` package depends on `llama-stack-api` and provides:
- Full server implementation
- Built-in provider implementations
- CLI tools for running and managing stacks
- Runtime provider resolution and orchestration
## Contributing

See the main [Llama Stack repository](https://github.com/meta-llama/llama-stack) for contribution guidelines.
## License

MIT License - see the LICENSE file for details.