# MLflow Prompts Provider - Integration Tests

This directory contains integration tests for the MLflow Prompts Provider. These tests require a running MLflow server.

## Prerequisites

- MLflow installed: `pip install 'mlflow>=3.4.0'` (or `uv pip install 'mlflow>=3.4.0'`)
- MLflow server running: see setup instructions below
- Test dependencies: `uv sync --group test`
## Quick Start

### 1. Start MLflow Server

```bash
# Start MLflow server on localhost:5555
mlflow server --host 127.0.0.1 --port 5555

# Keep this terminal open - the server will continue running
```

### 2. Run Integration Tests

In a separate terminal:

```bash
# Set MLflow URI (optional - defaults to localhost:5555)
export MLFLOW_TRACKING_URI=http://localhost:5555

# Run all integration tests
uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/

# Run a specific test
uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/test_end_to_end.py::TestMLflowPromptsEndToEnd::test_create_and_retrieve_prompt
```
### 3. Run Manual Test Script (Optional)

For quick validation without pytest:

```bash
# Run the manual test script
uv run python scripts/test_mlflow_prompts_manual.py

# View output in the MLflow UI
open http://localhost:5555
```
## Test Organization

### Integration Tests (`test_end_to_end.py`)

Comprehensive end-to-end tests covering:

- ✅ Create and retrieve prompts
- ✅ Update prompts (version management)
- ✅ List prompts (default versions only)
- ✅ List all versions of a prompt
- ✅ Set default version
- ✅ Variable auto-extraction (see the sketch after this list)
- ✅ Variable validation
- ✅ Error handling (not found, wrong version, etc.)
- ✅ Complex templates with multiple variables
- ✅ Edge cases (empty templates, no variables, etc.)

Total: 17 test scenarios
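
To make the variable auto-extraction scenarios concrete, here is a minimal sketch of the idea, assuming `{{ variable }}` Jinja2-style placeholders; the helper name `extract_variables` is illustrative, not the provider's actual API:

```python
# Illustrative sketch only - the provider's real extraction logic may differ.
import re

def extract_variables(template: str) -> set[str]:
    """Return the set of Jinja2-style {{ variable }} names in a template."""
    return set(re.findall(r"\{\{\s*(\w+)\s*\}\}", template))

# Two distinct variables should be discovered automatically:
assert extract_variables("Hello {{ name }}, today is {{ day }}.") == {"name", "day"}
```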
### Manual Test Script (`scripts/test_mlflow_prompts_manual.py`)

Interactive test script with verbose output covering:

- Server connectivity check
- Provider initialization
- Basic CRUD operations
- Variable extraction
- Statistics retrieval
## Configuration

### MLflow Server Options

Local (default):

```bash
mlflow server --host 127.0.0.1 --port 5555
```

Remote server:

```bash
export MLFLOW_TRACKING_URI=http://mlflow.example.com:5000
uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/
```

Databricks:

```bash
export MLFLOW_TRACKING_URI=databricks
export MLFLOW_REGISTRY_URI=databricks://profile
uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/
```
### Test Timeout

Tests have a default timeout of 30 seconds per MLflow operation. Adjust it in `conftest.py`:

```python
MLflowPromptsConfig(
    mlflow_tracking_uri=mlflow_tracking_uri,
    timeout_seconds=60,  # Increase for slow connections
)
```
## Fixtures

### `mlflow_adapter`

Basic adapter for simple tests:

```python
async def test_something(mlflow_adapter):
    prompt = await mlflow_adapter.create_prompt(...)
    # Test continues...
```
### `mlflow_adapter_with_cleanup`

Adapter with automatic cleanup tracking:

```python
async def test_something(mlflow_adapter_with_cleanup):
    # Created prompts are tracked, and cleanup is attempted on teardown
    prompt = await mlflow_adapter_with_cleanup.create_prompt(...)
```

Note: MLflow doesn't support deletion, so cleanup is best-effort.
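
For context, a hypothetical sketch of how such a fixture could be built (this is not the actual `conftest.py`; it assumes pytest-asyncio and that created prompts expose a `prompt_id` attribute):

```python
# Hypothetical sketch - not the actual conftest.py implementation.
import pytest

@pytest.fixture
async def mlflow_adapter_with_cleanup(mlflow_adapter):
    created_ids: list[str] = []
    original_create = mlflow_adapter.create_prompt

    async def tracking_create(*args, **kwargs):
        # Record every prompt created through the adapter
        prompt = await original_create(*args, **kwargs)
        created_ids.append(prompt.prompt_id)
        return prompt

    mlflow_adapter.create_prompt = tracking_create
    yield mlflow_adapter
    # MLflow's registry has no hard delete, so "cleanup" can only surface
    # what was left behind.
    for prompt_id in created_ids:
        print(f"Best-effort cleanup: prompt {prompt_id} remains in the registry")
```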
## Troubleshooting

### Server Not Available

Symptom:

```
SKIPPED [1] conftest.py:35: MLflow server not available at http://localhost:5555
```

Solution:

```bash
# Start MLflow server
mlflow server --host 127.0.0.1 --port 5555

# Verify it's running
curl http://localhost:5555/health
```
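
That skip comes from a server-availability probe in `conftest.py`; a minimal sketch of how such a check might look (the fixture name and probe details are assumptions, not the actual implementation):

```python
# Hypothetical sketch - the real conftest.py may differ.
import pytest
import requests

def mlflow_available(uri: str) -> bool:
    # Probe the same /health endpoint used above for manual verification
    try:
        return requests.get(f"{uri}/health", timeout=2).status_code == 200
    except requests.RequestException:
        return False

@pytest.fixture(scope="session")
def mlflow_tracking_uri() -> str:
    uri = "http://localhost:5555"
    if not mlflow_available(uri):
        pytest.skip(f"MLflow server not available at {uri}")
    return uri
```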
### Connection Timeout

Symptom:

```
requests.exceptions.Timeout: ...
```

Solutions:

- Check that the MLflow server is responsive: `curl http://localhost:5555/health`
- Increase the timeout in `conftest.py`: `timeout_seconds=60`
- Check firewall/network settings
### Import Errors

Symptom:

```
ModuleNotFoundError: No module named 'mlflow'
```

Solution:

```bash
uv pip install 'mlflow>=3.4.0'
```
### Permission Errors

Symptom:

```
PermissionError: [Errno 13] Permission denied: '...'
```

Solutions:

- Ensure MLflow has write access to its storage directory
- Check file permissions on the `mlruns/` directory
### Test Isolation Issues

Issue: Tests may interfere with each other if they reuse the same prompt IDs.

Solution: Each test creates new prompts with unique IDs (generated by `Prompt.generate_prompt_id()`). If needed, use the `mlflow_adapter_with_cleanup` fixture.
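
For illustration, unique IDs of this shape are cheap to generate; a hypothetical sketch (the real `Prompt.generate_prompt_id()` may differ):

```python
# Hypothetical sketch - the real Prompt.generate_prompt_id() may differ.
import secrets

def generate_prompt_id() -> str:
    # A random 32-character hex suffix makes collisions between
    # concurrently running tests negligible
    return f"pmpt_{secrets.token_hex(16)}"

print(generate_prompt_id())  # e.g. pmpt_3f9a...
```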
## Viewing Results

### MLflow UI

1. Start the MLflow server (if not already running):

   ```bash
   mlflow server --host 127.0.0.1 --port 5555
   ```

2. Open in a browser: http://localhost:5555
3. Navigate to the experiment `test-llama-stack-prompts`
4. View registered prompts and their versions
### Test Output

Run with verbose output to see detailed test execution:

```bash
uv run --group test pytest -vv tests/integration/providers/remote/prompts/mlflow/
```
## CI/CD Integration

To run the tests in CI/CD pipelines:

```yaml
# Example GitHub Actions workflow steps
- name: Start MLflow server
  run: |
    mlflow server --host 127.0.0.1 --port 5555 &
    sleep 5  # Wait for server to start

- name: Wait for MLflow
  run: |
    timeout 30 bash -c 'until curl -s http://localhost:5555/health; do sleep 1; done'

- name: Run integration tests
  env:
    MLFLOW_TRACKING_URI: http://localhost:5555
  run: |
    uv run --group test pytest -sv tests/integration/providers/remote/prompts/mlflow/
```
## Performance

### Expected Test Duration

- Individual test: ~1-5 seconds
- Full suite (17 tests): ~30-60 seconds
- Manual script: ~10-15 seconds
### Optimization Tips

- Use a local MLflow server (faster than remote)
- Run tests in parallel (if safe): `uv run --group test pytest -n auto tests/integration/providers/remote/prompts/mlflow/`
- Skip integration tests during development: `uv run --group dev pytest -sv tests/unit/`
## Coverage

Integration tests provide coverage for:

- ✅ Real MLflow API calls
- ✅ Network communication
- ✅ Serialization/deserialization
- ✅ MLflow server responses
- ✅ Version management
- ✅ Alias handling
- ✅ Tag storage and retrieval

Combined with the unit tests, this achieves >95% code coverage.