Llama Stack Unit Tests
Unit Tests
Unit tests verify individual components and functions in isolation. They are fast, reliable, and don't require external services.
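To make that concrete, here is a minimal sketch of what a test in this style looks like. The helper being tested (normalize_model_id) is a made-up example, not a real Llama Stack API; only the pytest conventions and the isolated, no-external-services style are the point.

```python
# Hypothetical example: `normalize_model_id` is an illustrative helper,
# not part of the Llama Stack codebase.
import pytest


def normalize_model_id(model_id: str) -> str:
    """Toy component under test: trims and lowercases a model identifier."""
    if not model_id or not model_id.strip():
        raise ValueError("model_id must be non-empty")
    return model_id.strip().lower()


def test_normalize_model_id_strips_and_lowercases():
    # Pure in-process check: no network, no external services.
    assert normalize_model_id("  Llama-3.1-8B ") == "llama-3.1-8b"


def test_normalize_model_id_rejects_empty_input():
    with pytest.raises(ValueError):
        normalize_model_id("   ")
```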
Prerequisites
- Python Environment: Ensure you have Python 3.12+ installed
- uv Package Manager: Install uv if not already installed (one way to do so is sketched below)
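If uv is not yet on your machine, these are the commonly documented install paths; check the uv documentation for the current recommendation.

```bash
# Option 1: install uv into your current Python environment
pip install uv

# Option 2: use the standalone installer from Astral
curl -LsSf https://astral.sh/uv/install.sh | sh
```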
You can run the unit tests with:
./scripts/unit-tests.sh [PYTEST_ARGS]
Any additional arguments are passed to pytest. For example, you can specify a test directory, a specific test file, or any pytest flags (e.g., -vvv for verbosity). If no test directory is specified, it defaults to "tests/unit", e.g.:
./scripts/unit-tests.sh tests/unit/registry/test_registry.py -vvv
If you'd like to run against a non-default version of Python (the default is currently 3.12), pass the PYTHON_VERSION environment variable as follows:
source .venv/bin/activate
PYTHON_VERSION=3.13 ./scripts/unit-tests.sh
Test Configuration
- Test Discovery: Tests are automatically discovered in the tests/unit/ directory
- Async Support: Tests use --asyncio-mode=auto for automatic async test handling (see the sketch after this list)
- Coverage: Tests generate coverage reports in the htmlcov/ directory
- Python Version: Defaults to Python 3.12, but can be overridden with the PYTHON_VERSION environment variable
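Because the suite runs with --asyncio-mode=auto, async test functions are collected and awaited without an explicit @pytest.mark.asyncio decorator. The sketch below is a hypothetical illustration of that; fetch_status is a made-up coroutine, not a real project API.

```python
# Hypothetical async test: with --asyncio-mode=auto, pytest-asyncio runs
# `async def` tests directly, no @pytest.mark.asyncio marker required.
import asyncio


async def fetch_status() -> str:
    """Stand-in coroutine for an async component under test."""
    await asyncio.sleep(0)  # simulate an awaited operation
    return "ok"


async def test_fetch_status_returns_ok():
    assert await fetch_status() == "ok"
```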
Coverage Reports
After running tests, you can view coverage reports:
# Open HTML coverage report in browser
open htmlcov/index.html # macOS
xdg-open htmlcov/index.html # Linux
start htmlcov/index.html # Windows
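If you just want a quick terminal summary instead of the HTML report, the coverage CLI (installed alongside pytest-cov) can print one, assuming the .coverage data file was written to the repository root by the test run:

```bash
uv run coverage report -m
```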