MLflow Prompts Provider - Integration Tests

This directory contains integration tests for the MLflow Prompts Provider. These tests require a running MLflow server.

Prerequisites

  1. MLflow installed: pip install 'mlflow>=3.4.0' (or uv pip install 'mlflow>=3.4.0')
  2. MLflow server running: See setup instructions below
  3. Test dependencies: uv sync --group test

Quick Start

1. Start MLflow Server

# Start MLflow server on localhost:5555
mlflow server --host 127.0.0.1 --port 5555

# Keep this terminal open; the server runs in the foreground

2. Run Integration Tests

In a separate terminal:

# Set the MLflow tracking URI (optional; tests default to http://localhost:5555)
export MLFLOW_TRACKING_URI=http://localhost:5555

# Run all integration tests
uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/

# Run specific test
uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/test_end_to_end.py::TestMLflowPromptsEndToEnd::test_create_and_retrieve_prompt

3. Run Manual Test Script (Optional)

For quick validation without pytest:

# Run manual test script
uv run python scripts/test_mlflow_prompts_manual.py

# View output in MLflow UI
open http://localhost:5555

Test Organization

Integration Tests (test_end_to_end.py)

Comprehensive end-to-end tests covering:

  • Create and retrieve prompts
  • Update prompts (version management)
  • List prompts (default versions only)
  • List all versions of a prompt
  • Set default version
  • Variable auto-extraction
  • Variable validation
  • Error handling (not found, wrong version, etc.)
  • Complex templates with multiple variables
  • Edge cases (empty templates, no variables, etc.)

Total: 17 test scenarios
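
As a concrete reference point, the create-and-retrieve scenario boils down to roughly the sketch below. This is illustrative only: the adapter method and attribute names (create_prompt, get_prompt, prompt_id, variables) mirror the fixture snippets later in this README, but the exact signatures live in test_end_to_end.py, and pytest-asyncio (or an equivalent async test setup) is assumed.

import pytest

@pytest.mark.asyncio
async def test_create_and_retrieve_prompt(mlflow_adapter):
    # Variables such as {{ name }} should be auto-extracted from the template.
    created = await mlflow_adapter.create_prompt(
        prompt="Hello {{ name }}, welcome to {{ place }}!",
    )

    fetched = await mlflow_adapter.get_prompt(created.prompt_id)
    assert fetched.prompt_id == created.prompt_id
    assert sorted(fetched.variables) == ["name", "place"]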

Manual Test Script (scripts/test_mlflow_prompts_manual.py)

Interactive test script with verbose output for:

  • Server connectivity check
  • Provider initialization
  • Basic CRUD operations
  • Variable extraction
  • Statistics retrieval

Configuration

MLflow Server Options

Local (default):

mlflow server --host 127.0.0.1 --port 5555

Remote server:

export MLFLOW_TRACKING_URI=http://mlflow.example.com:5000
uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/

Databricks:

export MLFLOW_TRACKING_URI=databricks
export MLFLOW_REGISTRY_URI=databricks://profile
uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/
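
Whichever option you use, the tests simply read MLFLOW_TRACKING_URI and fall back to the local default; conceptually the resolution amounts to the helper below (the function name is illustrative, not part of the test code):

import os

def resolve_tracking_uri() -> str:
    # Fall back to the local test server when MLFLOW_TRACKING_URI is unset.
    return os.environ.get("MLFLOW_TRACKING_URI", "http://localhost:5555")

The resulting URI is what gets passed as mlflow_tracking_uri to MLflowPromptsConfig (see Test Timeout below).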

Test Timeout

Tests have a default timeout of 30 seconds per MLflow operation. Adjust in conftest.py:

MLflowPromptsConfig(
    mlflow_tracking_uri=mlflow_tracking_uri,
    timeout_seconds=60,  # Increase for slow connections
)

Fixtures

mlflow_adapter

Basic adapter for simple tests:

async def test_something(mlflow_adapter):
    prompt = await mlflow_adapter.create_prompt(...)
    # Test continues...

mlflow_adapter_with_cleanup

Adapter with automatic cleanup tracking:

async def test_something(mlflow_adapter_with_cleanup):
    # Created prompts are tracked, and cleanup is attempted on teardown
    prompt = await mlflow_adapter_with_cleanup.create_prompt(...)

Note: MLflow doesn't support deletion, so cleanup is best-effort.
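
For reference, the cleanup-tracking idea can be sketched as below. This is not the actual conftest.py: it assumes pytest-asyncio, an existing async mlflow_adapter fixture, and a hypothetical delete_prompt method used only to illustrate the best-effort teardown.

import pytest_asyncio

@pytest_asyncio.fixture
async def mlflow_adapter_with_cleanup(mlflow_adapter):
    created_ids: list[str] = []
    original_create = mlflow_adapter.create_prompt

    async def tracking_create(*args, **kwargs):
        # Record every prompt created through this fixture.
        prompt = await original_create(*args, **kwargs)
        created_ids.append(prompt.prompt_id)
        return prompt

    mlflow_adapter.create_prompt = tracking_create
    try:
        yield mlflow_adapter
    finally:
        mlflow_adapter.create_prompt = original_create
        for prompt_id in created_ids:
            try:
                await mlflow_adapter.delete_prompt(prompt_id)  # hypothetical method
            except Exception:
                # MLflow may not support deletion, so cleanup stays best-effort.
                pass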

Troubleshooting

Server Not Available

Symptom:

SKIPPED [1] conftest.py:35: MLflow server not available at http://localhost:5555

Solution:

# Start MLflow server
mlflow server --host 127.0.0.1 --port 5555

# Verify it's running
curl http://localhost:5555/health
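
The skip comes from a server availability probe in conftest.py. A minimal version of that check looks roughly like this (the real fixture may be structured differently):

import pytest
import requests

MLFLOW_URI = "http://localhost:5555"

def mlflow_server_available(uri: str = MLFLOW_URI, timeout: float = 2.0) -> bool:
    # Probe MLflow's /health endpoint; any connection error counts as "not available".
    try:
        return requests.get(f"{uri}/health", timeout=timeout).status_code == 200
    except requests.RequestException:
        return False

@pytest.fixture(autouse=True)
def skip_without_mlflow():
    if not mlflow_server_available():
        pytest.skip(f"MLflow server not available at {MLFLOW_URI}")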

Connection Timeout

Symptom:

requests.exceptions.Timeout: ...

Solutions:

  1. Check that the MLflow server is responsive: curl http://localhost:5555/health
  2. Increase timeout in conftest.py: timeout_seconds=60
  3. Check firewall/network settings

Import Errors

Symptom:

ModuleNotFoundError: No module named 'mlflow'

Solution:

uv pip install 'mlflow>=3.4.0'

Permission Errors

Symptom:

PermissionError: [Errno 13] Permission denied: '...'

Solution:

  • Ensure MLflow has write access to its storage directory
  • Check file permissions on mlruns/ directory

Test Isolation Issues

Issue: Tests may interfere with each other if they use the same prompt IDs.

Solution: Each test creates new prompts with unique IDs (generated by Prompt.generate_prompt_id()). If needed, use the mlflow_adapter_with_cleanup fixture.

Viewing Results

MLflow UI

  1. Start MLflow server (if not already running):

    mlflow server --host 127.0.0.1 --port 5555
    
  2. Open in browser:

    http://localhost:5555
    
  3. Navigate to experiment test-llama-stack-prompts

  4. View registered prompts and their versions

Test Output

Run with verbose output to see detailed test execution:

uv run --group test pytest -vv tests/integration/providers/remote/prompts/mlflow/

CI/CD Integration

To run tests in CI/CD pipelines:

# Example GitHub Actions workflow
- name: Start MLflow server
  run: |
    mlflow server --host 127.0.0.1 --port 5555 &
    sleep 5  # Wait for server to start

- name: Wait for MLflow
  run: |
    timeout 30 bash -c 'until curl -s http://localhost:5555/health; do sleep 1; done'

- name: Run integration tests
  env:
    MLFLOW_TRACKING_URI: http://localhost:5555
  run: |
    uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/

Performance

Expected Test Duration

  • Individual test: ~1-5 seconds
  • Full suite (17 tests): ~30-60 seconds
  • Manual script: ~10-15 seconds

Optimization Tips

  1. Use local MLflow server (faster than remote)
  2. Run tests in parallel (requires pytest-xdist; only if tests are isolation-safe):
    uv run --group test pytest -n auto tests/integration/providers/remote/prompts/mlflow/

  3. Skip integration tests during development by running only the unit tests:
    uv run --group dev pytest -sv tests/unit/
    

Coverage

Integration tests provide coverage for:

  • Real MLflow API calls
  • Network communication
  • Serialization/deserialization
  • MLflow server responses
  • Version management
  • Alias handling
  • Tag storage and retrieval

Combined with the unit tests, the suite achieves >95% code coverage.