Commit graph

4 commits

Author SHA1 Message Date
Charlie Doern
85d407c2a0 feat: split API and provider specs into separate llama-stack-api pkg
Extract API definitions, models, and provider specifications into a
standalone llama-stack-api package that can be published to PyPI
independently of the main llama-stack server.

Motivation

External providers currently import from llama-stack, which overrides
the installed version and causes dependency conflicts. This separation
allows external providers to:

- Install only the type definitions they need without server dependencies
- Avoid version conflicts with the installed llama-stack package
- Be versioned and released independently

This enables us to re-enable external provider module tests that were
previously blocked by these import conflicts.

Changes

- Created llama-stack-api package with minimal dependencies (pydantic, jsonschema)
- Moved APIs, providers datatypes, strong_typing, and schema_utils
- Updated all imports from llama_stack.* to llama_stack_api.*
- Preserved git history using git mv for moved files
- Configured local editable install for development workflow
- Updated linting and type-checking configuration for both packages
- Rebased on top of upstream src/ layout changes

Testing

Package builds successfully and can be imported independently.
All pre-commit hooks pass with expected exclusions maintained.

Next Steps

- Publish llama-stack-api to PyPI
- Update external provider dependencies
- Re-enable external provider module tests

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-11-12 09:19:40 -05:00
Ashwin Bharambe
f49cb0b717
chore: Stack server no longer depends on llama-stack-client (#4094)
This dependency has been bothering folks for a long time (cc @leseb). We
really needed it due to "library client" which is primarily used for our
tests and is not a part of the Stack server. Anyone who needs to use the
library client can certainly install `llama-stack-client` in their
environment to make that work.

Updated the notebook references to install `llama-stack-client`
additionally when setting things up.
2025-11-07 09:54:09 -08:00
Ashwin Bharambe
cb40da210f
fix: update tests for OpenAI-style models endpoint (#4053)
The llama-stack-client now uses /`v1/openai/v1/models` which returns
OpenAI-compatible model objects with 'id' and 'custom_metadata' fields
instead of the Resource-style 'identifier' field. Updated api_recorder
to handle the new endpoint and modified tests to access model metadata
appropriately. Deleted stale model recordings for re-recording.

**NOTE: CI will be red on this one since it is dependent on
https://github.com/llamastack/llama-stack-client-python/pull/291/files
landing. I verified locally that it is green.**
2025-11-03 17:30:08 -08:00
Ashwin Bharambe
f88416ef87
fix(inference): enable routing of models with provider_data alone (#3928)
This PR enables routing of fully qualified model IDs of the form
`provider_id/model_id` even when the models are not registered with the
Stack.

Here's the situation: assume a remote inference provider which works
only when users provide their own API keys via
`X-LlamaStack-Provider-Data` header. By definition, we cannot list
models and hence update our routing registry. But because we _require_ a
provider ID in the models now, we can identify which provider to route
to and let that provider decide.

Note that we still try to look up our registry since it may have a
pre-registered alias. Just that we don't outright fail when we are not
able to look it up.

Also, updated inference router so that the responses have the _exact_
model that the request had.

## Test Plan

Added an integration test

Closes #3929

---------

Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
2025-10-28 11:16:37 -07:00