# What does this PR do?
the build.yaml is only used in the following ways:
1. list-deps
2. distribution code-gen
since `llama stack build` no longer exists, I found myself asking "why
do we need two different files for list-deps and run"?
Removing the BuildConfig and altering the usage of the
DistributionTemplate in llama stack list-deps is the first step in
removing the build yaml entirely.
Removing the BuildConfig and build.yaml cuts the files users need to
maintain in half, and allows us to focus on the stability of _just_ the
run.yaml
This PR removes the build.yaml, BuildConfig datatype, and its usage
throughout the codebase. Users are now expected to point to run.yaml
files when running list-deps, and our codebase automatically uses these
types now for things like `get_provider_registry`.
**Additionally, two renames: `StackRunConfig` -> `StackConfig` and
`run.yaml` -> `config.yaml`.**
The build.yaml made sense for when we were managing the build process
for the user and actually _producing_ a run.yaml _from_ the build.yaml,
but now that we are simply just getting the provider registry and
listing the deps, switching to config.yaml simplifies the scope here
greatly.
## Test Plan
existing list-deps usage should work in the tests.
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
These primitives (used both by the Stack as well as provider
implementations) can be thought of fruitfully as internal-only APIs
which can themselves have multiple implementations. We use the new
`llama_stack_api.internal` namespace for this.
In addition: the change moves kv/sql store impls, configs, and
dependency helpers under `core/storage`
## Testing
`pytest tests/unit/utils/test_authorized_sqlstore.py`, other existing CI
# What does this PR do?
the directory structure was src/llama-stack-api/llama_stack_api
instead it should just be src/llama_stack_api to match the other
packages.
update the structure and pyproject/linting config
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Extract API definitions and provider specifications into a standalone
llama-stack-api package that can be published to PyPI independently of
the main llama-stack server.
see: https://github.com/llamastack/llama-stack/pull/2978 and
https://github.com/llamastack/llama-stack/pull/2978#issuecomment-3145115942
Motivation
External providers currently import from llama-stack, which overrides
the installed version and causes dependency conflicts. This separation
allows external providers to:
- Install only the type definitions they need without server
dependencies
- Avoid version conflicts with the installed llama-stack package
- Be versioned and released independently
This enables us to re-enable external provider module tests that were
previously blocked by these import conflicts.
Changes
- Created llama-stack-api package with minimal dependencies (pydantic,
jsonschema)
- Moved APIs, providers datatypes, strong_typing, and schema_utils
- Updated all imports from llama_stack.* to llama_stack_api.*
- Configured local editable install for development workflow
- Updated linting and type-checking configuration for both packages
Next Steps
- Publish llama-stack-api to PyPI
- Update external provider dependencies
- Re-enable external provider module tests
Pre-cursor PRs to this one:
- #4093
- #3954
- #4064
These PRs moved key pieces _out_ of the Api pkg, limiting the scope of
change here.
relates to #3237
## Test Plan
Package builds successfully and can be imported independently. All
pre-commit hooks pass with expected exclusions maintained.
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
**This PR changes configurations in a backward incompatible way.**
Run configs today repeat full SQLite/Postgres snippets everywhere a
store is needed, which means duplicated credentials, extra connection
pools, and lots of drift between files. This PR introduces named storage
backends so the stack and providers can share a single catalog and
reference those backends by name.
## Key Changes
- Add `storage.backends` to `StackRunConfig`, register each KV/SQL
backend once at startup, and validate that references point to the right
family.
- Move server stores under `storage.stores` with lightweight references
(backend + namespace/table) instead of full configs.
- Update every provider/config/doc to use the new reference style;
docs/codegen now surface the simplified YAML.
## Migration
Before:
```yaml
metadata_store:
type: sqlite
db_path: ~/.llama/distributions/foo/registry.db
inference_store:
type: postgres
host: ${env.POSTGRES_HOST}
port: ${env.POSTGRES_PORT}
db: ${env.POSTGRES_DB}
user: ${env.POSTGRES_USER}
password: ${env.POSTGRES_PASSWORD}
conversations_store:
type: postgres
host: ${env.POSTGRES_HOST}
port: ${env.POSTGRES_PORT}
db: ${env.POSTGRES_DB}
user: ${env.POSTGRES_USER}
password: ${env.POSTGRES_PASSWORD}
```
After:
```yaml
storage:
backends:
kv_default:
type: kv_sqlite
db_path: ~/.llama/distributions/foo/kvstore.db
sql_default:
type: sql_postgres
host: ${env.POSTGRES_HOST}
port: ${env.POSTGRES_PORT}
db: ${env.POSTGRES_DB}
user: ${env.POSTGRES_USER}
password: ${env.POSTGRES_PASSWORD}
stores:
metadata:
backend: kv_default
namespace: registry
inference:
backend: sql_default
table_name: inference_store
max_write_queue_size: 10000
num_writers: 4
conversations:
backend: sql_default
table_name: openai_conversations
```
Provider configs follow the same pattern—for example, a Chroma vector
adapter switches from:
```yaml
providers:
vector_io:
- provider_id: chromadb
provider_type: remote::chromadb
config:
url: ${env.CHROMADB_URL}
kvstore:
type: sqlite
db_path: ~/.llama/distributions/foo/chroma.db
```
to:
```yaml
providers:
vector_io:
- provider_id: chromadb
provider_type: remote::chromadb
config:
url: ${env.CHROMADB_URL}
persistence:
backend: kv_default
namespace: vector_io::chroma_remote
```
Once the backends are declared, everything else just points at them, so
rotating credentials or swapping to Postgres happens in one place and
the stack reuses a single connection pool.
# What does this PR do?
previously, developers who ran `./scripts/unit-tests.sh` would get
`asyncio-mode=auto`, which meant `@pytest.mark.asyncio` and
`@pytest_asyncio.fixture` were redundent. developers who ran `pytest`
directly would get pytest's default (strict mode), would run into errors
leading them to add `@pytest.mark.asyncio` / `@pytest_asyncio.fixture`
to their code.
with this change -
- `asyncio_mode=auto` is included in `pyproject.toml` making behavior
consistent for all invocations of pytest
- removes all redundant `@pytest_asyncio.fixture` and
`@pytest.mark.asyncio`
- for good measure, requires `pytest>=8.4` and `pytest-asyncio>=1.0`
## Test Plan
- `./scripts/unit-tests.sh`
- `uv run pytest tests/unit`
This allows a set of rules to be defined for determining access to
resources. The rules are (loosely) based on the cedar policy format.
A rule defines a list of action either to permit or to forbid. It may
specify a principal or a resource that must match for the rule to take
effect. It may also specify a condition, either a 'when' or an 'unless',
with additional constraints as to where the rule applies.
A list of rules is held for each type to be protected and tried in order
to find a match. If a match is found, the request is permitted or
forbidden depening on the type of rule. If no match is found, the
request is denied. If no rules are specified for a given type, a rule
that allows any action as long as the resource attributes match the user
attributes is added (i.e. the previous behaviour is the default.
Some examples in yaml:
```
model:
- permit:
principal: user-1
actions: [create, read, delete]
comment: user-1 has full access to all models
- permit:
principal: user-2
actions: [read]
resource: model-1
comment: user-2 has read access to model-1 only
- permit:
actions: [read]
when:
user_in: resource.namespaces
comment: any user has read access to models with matching attributes
vector_db:
- forbid:
actions: [create, read, delete]
unless:
user_in: role::admin
comment: only user with admin role can use vector_db resources
```
---------
Signed-off-by: Gordon Sim <gsim@redhat.com>
# What does this PR do?
The goal of this PR is code base modernization.
Schema reflection code needed a minor adjustment to handle UnionTypes
and collections.abc.AsyncIterator. (Both are preferred for latest Python
releases.)
Note to reviewers: almost all changes here are automatically generated
by pyupgrade. Some additional unused imports were cleaned up. The only
change worth of note can be found under `docs/openapi_generator` and
`llama_stack/strong_typing/schema.py` where reflection code was updated
to deal with "newer" types.
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>