Extract API definitions, models, and provider specifications into a
standalone llama-stack-api package that can be published to PyPI
independently of the main llama-stack server.
Motivation
External providers currently import from llama-stack, which overrides
the installed version and causes dependency conflicts. This separation
allows external providers to:
- Install only the type definitions they need without server dependencies
- Avoid version conflicts with the installed llama-stack package
- Be versioned and released independently
This enables us to re-enable external provider module tests that were
previously blocked by these import conflicts.
Changes
- Created llama-stack-api package with minimal dependencies (pydantic, jsonschema)
- Moved APIs, providers datatypes, strong_typing, and schema_utils
- Updated all imports from llama_stack.* to llama_stack_api.*
- Preserved git history using git mv for moved files
- Configured local editable install for development workflow
- Updated linting and type-checking configuration for both packages
- Rebased on top of upstream src/ layout changes
Testing
Package builds successfully and can be imported independently.
All pre-commit hooks pass with expected exclusions maintained.
Next Steps
- Publish llama-stack-api to PyPI
- Update external provider dependencies
- Re-enable external provider module tests
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Add CodeScanner implementations
## Test Plan
`SAFETY_MODEL=CodeScanner LLAMA_STACK_CONFIG=starter uv run pytest -v
tests/integration/safety/test_safety.py
--text-model=llama3.2:3b-instruct-fp16
--embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama`
This PR need to land after this
https://github.com/meta-llama/llama-stack/pull/3098
# What does this PR do?
This PR adds Open AI Compatible moderations api. Currently only
implementing for llama guard safety provider
Image support, expand to other safety providers and Deprecation of
run_shield will be next steps.
## Test Plan
Added 2 new tests for safe/ unsafe text prompt examples for the new open
ai compatible moderations api usage
`SAFETY_MODEL=llama-guard3:8b LLAMA_STACK_CONFIG=starter uv run pytest
-v tests/integration/safety/test_safety.py
--text-model=llama3.2:3b-instruct-fp16
--embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama`
(Had some issue with previous PR
https://github.com/meta-llama/llama-stack/pull/2994 while updating and
accidentally close it , reopened new one )
You now run the integration tests with these options:
```bash
Custom options:
--stack-config=STACK_CONFIG
a 'pointer' to the stack. this can be either be:
(a) a template name like `fireworks`, or
(b) a path to a run.yaml file, or
(c) an adhoc config spec, e.g.
`inference=fireworks,safety=llama-guard,agents=meta-
reference`
--env=ENV Set environment variables, e.g. --env KEY=value
--text-model=TEXT_MODEL
comma-separated list of text models. Fixture name:
text_model_id
--vision-model=VISION_MODEL
comma-separated list of vision models. Fixture name:
vision_model_id
--embedding-model=EMBEDDING_MODEL
comma-separated list of embedding models. Fixture name:
embedding_model_id
--safety-shield=SAFETY_SHIELD
comma-separated list of safety shields. Fixture name:
shield_id
--judge-model=JUDGE_MODEL
comma-separated list of judge models. Fixture name:
judge_model_id
--embedding-dimension=EMBEDDING_DIMENSION
Output dimensionality of the embedding model to use for
testing. Default: 384
--record-responses Record new API responses instead of using cached ones.
--report=REPORT Path where the test report should be written, e.g.
--report=/path/to/report.md
```
Importantly, if you don't specify any of the models (text-model,
vision-model, etc.) the relevant tests will get **skipped!**
This will make running tests somewhat more annoying since all options
will need to be specified. We will make this easier by adding some easy
wrapper yaml configs.
## Test Plan
Example:
```bash
ashwin@ashwin-mbp ~/local/llama-stack/tests/integration (unify_tests) $
LLAMA_STACK_CONFIG=fireworks pytest -s -v inference/test_text_inference.py \
--text-model meta-llama/Llama-3.2-3B-Instruct
```