# Adding a New API Provider

This guide will walk you through the process of adding a new API provider to Llama Stack.

- Begin by reviewing the core concepts of Llama Stack and choose the API your provider belongs to (Inference, Safety, VectorIO, etc.).
- Determine the provider type ({repopath}`Remote::llama_stack/providers/remote` or {repopath}`Inline::llama_stack/providers/inline`). Remote providers make requests to external services, while inline providers execute their implementation locally.
- Add your provider to the appropriate {repopath}`Registry::llama_stack/providers/registry/` and specify the pip dependencies it needs; a registry entry is sketched after this list.
- Update any distribution {repopath}`Templates::llama_stack/templates/` `build.yaml` and `run.yaml` files if they should include your provider by default. Run {repopath}`./scripts/distro_codegen.py` if necessary. Note that `distro_codegen.py` will fail if the new provider causes any distribution template to attempt to import provider-specific dependencies. This usually means the distribution's `get_distribution_template()` code path should only import the necessary `Config` or model alias definitions from each provider, not the provider's actual implementation.
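For example, a remote inference provider's registry entry might look like the sketch below. This is only an illustration: `remote_provider_spec` and `AdapterSpec` follow the pattern of existing registry entries but may differ across versions, and `my_provider` is a hypothetical adapter name.

```python
# Sketch of a registry entry in llama_stack/providers/registry/inference.py;
# exact field names may differ in your version of the codebase.
from llama_stack.providers.datatypes import AdapterSpec, Api, remote_provider_spec

remote_provider_spec(
    api=Api.inference,
    adapter=AdapterSpec(
        adapter_type="my_provider",  # hypothetical adapter name
        pip_packages=["openai"],  # pip dependencies your provider needs
        module="llama_stack.providers.remote.inference.my_provider",
        config_class="llama_stack.providers.remote.inference.my_provider.MyProviderConfig",
    ),
)
```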

Here are some example PRs to help you get started:

## Inference Provider Patterns

When implementing Inference providers for OpenAI-compatible APIs, Llama Stack provides several mixin classes to simplify development and ensure consistent behavior across providers.

### OpenAIMixin

The `OpenAIMixin` class provides direct OpenAI API functionality for providers that work with OpenAI-compatible endpoints. It includes:

#### Direct API Methods

- `openai_completion()`: Legacy text completion API with full parameter support
- `openai_chat_completion()`: Chat completion API supporting streaming, tools, and function calling
- `openai_embeddings()`: Text embeddings generation with customizable encoding and dimensions

#### Model Management

- `check_model_availability()`: Queries the API endpoint to verify whether a model exists and is accessible

#### Client Management

- `client` property: Automatically creates and configures `AsyncOpenAI` client instances using your provider's credentials

#### Required Implementation

To use `OpenAIMixin`, your provider must implement these abstract methods:

```python
from abc import abstractmethod


@abstractmethod
def get_api_key(self) -> str:
    """Return the API key for authentication"""
    pass


@abstractmethod
def get_base_url(self) -> str:
    """Return the OpenAI-compatible API base URL"""
    pass
```
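A minimal adapter sketch follows. The import path for `OpenAIMixin` and the shape of the `config` object are assumptions; check where the mixin actually lives in your checkout and what your config class provides.

```python
# Sketch of an adapter built on OpenAIMixin; the module path below and the
# config fields are assumptions, not a definitive implementation.
from llama_stack.providers.utils.inference.openai_mixin import OpenAIMixin


class MyInferenceAdapter(OpenAIMixin):
    def __init__(self, config) -> None:
        self.config = config  # hypothetical config object holding credentials

    def get_api_key(self) -> str:
        return self.config.api_key  # e.g. read from your provider's config

    def get_base_url(self) -> str:
        return "https://api.example.com/v1"  # your OpenAI-compatible endpoint
```

With these two methods implemented, the mixin supplies the `client` property and the `openai_completion()`, `openai_chat_completion()`, and `openai_embeddings()` methods described above.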

## Testing the Provider

Before running tests, you must have the required dependencies installed. These depend on the providers or distributions you are testing. For example, if you are testing the `together` distribution, install its dependencies via `llama stack build --template together`.

### 1. Integration Testing

Integration tests are located in {repopath}`tests/integration`. These tests use the Python client-SDK APIs (from the `llama_stack_client` package) to test functionality. Since these tests use client APIs, they can be run either by pointing to an instance of the Llama Stack server or "inline" by using `LlamaStackAsLibraryClient`.
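An inline run might look like the following sketch, assuming `LlamaStackAsLibraryClient` is importable from the top-level `llama_stack` package and using `together` as an example distribution name:

```python
# Sketch: exercising a stack "inline", without a running server.
from llama_stack import LlamaStackAsLibraryClient

client = LlamaStackAsLibraryClient("together")  # distribution name is an example
client.initialize()

# The library client exposes the same APIs as the HTTP client.
print([m.identifier for m in client.models.list()])
```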

Consult {repopath}`tests/integration/README.md` for more details on how to run the tests.

Note that each provider's `sample_run_config()` method (in that provider's configuration class) typically references environment variables for specifying API keys and the like. You can set these in the environment or pass them via the `--env` flag to the test command.
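For instance, a config class's `sample_run_config()` might defer the API key to an environment variable, as in this sketch. `MyProviderConfig` and `MY_PROVIDER_API_KEY` are hypothetical names; the `${env.VAR}` substitution syntax follows the pattern used by existing provider configs.

```python
# Sketch of a provider config whose sample_run_config() reads the API key
# from an environment variable; names here are hypothetical.
from typing import Any

from pydantic import BaseModel


class MyProviderConfig(BaseModel):
    url: str = "https://api.example.com/v1"
    api_key: str | None = None

    @classmethod
    def sample_run_config(cls, **kwargs) -> dict[str, Any]:
        return {
            "url": "https://api.example.com/v1",
            "api_key": "${env.MY_PROVIDER_API_KEY}",  # resolved at stack startup
        }
```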

### 2. Unit Testing

Unit tests are located in {repopath}`tests/unit`. Provider-specific unit tests are located in {repopath}`tests/unit/providers`. These tests are all run automatically as part of the CI process.

Consult {repopath}`tests/unit/README.md` for more details on how to run the tests manually.
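A provider unit test can be as small as checking that your adapter reads credentials from its config correctly. This sketch tests the hypothetical classes from the earlier examples:

```python
# Sketch of a test under tests/unit/providers/; MyInferenceAdapter and
# MyProviderConfig are the hypothetical classes sketched above, imported
# from wherever your provider actually lives.
from llama_stack.providers.remote.inference.my_provider import (  # hypothetical
    MyInferenceAdapter,
    MyProviderConfig,
)


def test_adapter_reads_credentials_from_config():
    config = MyProviderConfig(api_key="test-key")
    adapter = MyInferenceAdapter(config)

    assert adapter.get_api_key() == "test-key"
    assert adapter.get_base_url() == "https://api.example.com/v1"
```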

### 3. Additional end-to-end testing

1. Start a Llama Stack server with your new provider
2. Verify compatibility with existing client scripts in the `llama-stack-apps` repository
3. Document which scripts are compatible with your provider

## Submitting Your PR

1. Ensure all tests pass
2. Include a comprehensive test plan in your PR summary
3. Document any known limitations or considerations