MLflow Prompts Provider - Integration Tests

This directory contains integration tests for the MLflow Prompts Provider. These tests require a running MLflow server.

Prerequisites

  1. MLflow installed: pip install 'mlflow>=3.4.0' (or uv pip install 'mlflow>=3.4.0')
  2. MLflow server running: See setup instructions below
  3. Test dependencies: uv sync --group test

Quick Start

1. Start MLflow Server

# Start MLflow server on localhost:5555
mlflow server --host 127.0.0.1 --port 5555

# Keep this terminal open; the server runs in the foreground

2. Run Integration Tests

In a separate terminal:

# Set the MLflow tracking URI (optional; tests default to http://localhost:5555)
export MLFLOW_TRACKING_URI=http://localhost:5555

# Run all integration tests
uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/

# Run specific test
uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/test_end_to_end.py::TestMLflowPromptsEndToEnd::test_create_and_retrieve_prompt

3. Run Manual Test Script (Optional)

For quick validation without pytest:

# Run manual test script
uv run python scripts/test_mlflow_prompts_manual.py

# View output in MLflow UI
open http://localhost:5555

Test Organization

Integration Tests (test_end_to_end.py)

Comprehensive end-to-end tests covering:

  • Create and retrieve prompts
  • Update prompts (version management)
  • List prompts (default versions only)
  • List all versions of a prompt
  • Set default version
  • Variable auto-extraction
  • Variable validation
  • Error handling (not found, wrong version, etc.)
  • Complex templates with multiple variables
  • Edge cases (empty templates, no variables, etc.)

Total: 17 test scenarios
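
As a concrete reference point, the create-and-retrieve scenario boils down to roughly the sketch below. This is illustrative only: the adapter method and attribute names (create_prompt, get_prompt, prompt_id, variables) mirror the fixture snippets later in this README, but the exact signatures live in test_end_to_end.py, and pytest-asyncio (or an equivalent async test setup) is assumed.

import pytest

@pytest.mark.asyncio
async def test_create_and_retrieve_prompt(mlflow_adapter):
    # Variables such as {{ name }} should be auto-extracted from the template.
    created = await mlflow_adapter.create_prompt(
        prompt="Hello {{ name }}, welcome to {{ place }}!",
    )

    fetched = await mlflow_adapter.get_prompt(created.prompt_id)
    assert fetched.prompt_id == created.prompt_id
    assert sorted(fetched.variables) == ["name", "place"]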

Manual Test Script (scripts/test_mlflow_prompts_manual.py)

Interactive test script with verbose output for:

  • Server connectivity check
  • Provider initialization
  • Basic CRUD operations
  • Variable extraction
  • Statistics retrieval

Configuration

MLflow Server Options

Local (default):

mlflow server --host 127.0.0.1 --port 5555

Remote server:

export MLFLOW_TRACKING_URI=http://mlflow.example.com:5000
uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/

Databricks:

export MLFLOW_TRACKING_URI=databricks
export MLFLOW_REGISTRY_URI=databricks://profile
uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/
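
Whichever option you use, the tests simply read MLFLOW_TRACKING_URI and fall back to the local default; conceptually the resolution amounts to the helper below (the function name is illustrative, not part of the test code):

import os

def resolve_tracking_uri() -> str:
    # Fall back to the local test server when MLFLOW_TRACKING_URI is unset.
    return os.environ.get("MLFLOW_TRACKING_URI", "http://localhost:5555")

The resulting URI is what gets passed as mlflow_tracking_uri to MLflowPromptsConfig (see Test Timeout below).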

Test Timeout

Tests have a default timeout of 30 seconds per MLflow operation. Adjust in conftest.py:

MLflowPromptsConfig(
    mlflow_tracking_uri=mlflow_tracking_uri,
    timeout_seconds=60,  # Increase for slow connections
)

Fixtures

mlflow_adapter

Basic adapter for simple tests:

async def test_something(mlflow_adapter):
    prompt = await mlflow_adapter.create_prompt(...)
    # Test continues...

mlflow_adapter_with_cleanup

Adapter with automatic cleanup tracking:

async def test_something(mlflow_adapter_with_cleanup):
    # Created prompts are tracked, and cleanup is attempted on teardown
    prompt = await mlflow_adapter_with_cleanup.create_prompt(...)

Note: MLflow doesn't support deletion, so cleanup is best-effort.
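
For reference, the cleanup-tracking idea can be sketched as below. This is not the actual conftest.py: it assumes pytest-asyncio, an existing async mlflow_adapter fixture, and a hypothetical delete_prompt method used only to illustrate the best-effort teardown.

import pytest_asyncio

@pytest_asyncio.fixture
async def mlflow_adapter_with_cleanup(mlflow_adapter):
    created_ids: list[str] = []
    original_create = mlflow_adapter.create_prompt

    async def tracking_create(*args, **kwargs):
        # Record every prompt created through this fixture.
        prompt = await original_create(*args, **kwargs)
        created_ids.append(prompt.prompt_id)
        return prompt

    mlflow_adapter.create_prompt = tracking_create
    try:
        yield mlflow_adapter
    finally:
        mlflow_adapter.create_prompt = original_create
        for prompt_id in created_ids:
            try:
                await mlflow_adapter.delete_prompt(prompt_id)  # hypothetical method
            except Exception:
                # MLflow may not support deletion, so cleanup stays best-effort.
                pass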

Troubleshooting

Server Not Available

Symptom:

SKIPPED [1] conftest.py:35: MLflow server not available at http://localhost:5555

Solution:

# Start MLflow server
mlflow server --host 127.0.0.1 --port 5555

# Verify it's running
curl http://localhost:5555/health
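
The skip comes from a server availability probe in conftest.py. A minimal version of that check looks roughly like this (the real fixture may be structured differently):

import pytest
import requests

MLFLOW_URI = "http://localhost:5555"

def mlflow_server_available(uri: str = MLFLOW_URI, timeout: float = 2.0) -> bool:
    # Probe MLflow's /health endpoint; any connection error counts as "not available".
    try:
        return requests.get(f"{uri}/health", timeout=timeout).status_code == 200
    except requests.RequestException:
        return False

@pytest.fixture(autouse=True)
def skip_without_mlflow():
    if not mlflow_server_available():
        pytest.skip(f"MLflow server not available at {MLFLOW_URI}")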

Connection Timeout

Symptom:

requests.exceptions.Timeout: ...

Solutions:

  1. Check that the MLflow server is responsive: curl http://localhost:5555/health
  2. Increase timeout in conftest.py: timeout_seconds=60
  3. Check firewall/network settings

Import Errors

Symptom:

ModuleNotFoundError: No module named 'mlflow'

Solution:

uv pip install 'mlflow>=3.4.0'

Permission Errors

Symptom:

PermissionError: [Errno 13] Permission denied: '...'

Solution:

  • Ensure MLflow has write access to its storage directory
  • Check file permissions on mlruns/ directory

Test Isolation Issues

Issue: Tests may interfere with each other if they use the same prompt IDs.

Solution: Each test creates new prompts with unique IDs (generated by Prompt.generate_prompt_id()). If needed, use the mlflow_adapter_with_cleanup fixture.

Viewing Results

MLflow UI

  1. Start MLflow server (if not already running):

    mlflow server --host 127.0.0.1 --port 5555
    
  2. Open in browser:

    http://localhost:5555
    
  3. Navigate to experiment test-llama-stack-prompts

  4. View registered prompts and their versions

Test Output

Run with verbose output to see detailed test execution:

uv run --group test pytest -vv tests/integration/providers/remote/prompts/mlflow/

CI/CD Integration

To run tests in CI/CD pipelines:

# Example GitHub Actions workflow
- name: Start MLflow server
  run: |
    mlflow server --host 127.0.0.1 --port 5555 &
    sleep 5  # Wait for server to start

- name: Wait for MLflow
  run: |
    timeout 30 bash -c 'until curl -s http://localhost:5555/health; do sleep 1; done'

- name: Run integration tests
  env:
    MLFLOW_TRACKING_URI: http://localhost:5555
  run: |
    uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/

Performance

Expected Test Duration

  • Individual test: ~1-5 seconds
  • Full suite (17 tests): ~30-60 seconds
  • Manual script: ~10-15 seconds

Optimization Tips

  1. Use local MLflow server (faster than remote)
  2. Run tests in parallel (requires pytest-xdist; only if tests are isolation-safe):
    uv run --group test pytest -n auto tests/integration/providers/remote/prompts/mlflow/

  3. Skip integration tests during development by running only the unit tests:
    uv run --group dev pytest -sv tests/unit/
    

Coverage

Integration tests provide coverage for:

  • Real MLflow API calls
  • Network communication
  • Serialization/deserialization
  • MLflow server responses
  • Version management
  • Alias handling
  • Tag storage and retrieval

Combined with the unit tests, the suite achieves >95% code coverage.