# What does this PR do?
On the path to maintainable implementations of inference providers:
make all configs instances of `RemoteInferenceProviderConfig`.
## Test Plan
ci
# What does this PR do?
Initial implementation for `Conversations` and `ConversationItems` using
`AuthorizedSqlStore` with endpoints to:
- CREATE
- UPDATE
- GET/RETRIEVE/LIST
- DELETE
Set `level=LLAMA_STACK_API_V1`.
NOTE: This does not currently incorporate changes for Responses, that'll
be done in a subsequent PR.
Closes https://github.com/llamastack/llama-stack/issues/3235
## Test Plan
- Unit tests
- Integration tests
Also comparison of [OpenAPI spec for OpenAI
API](https://github.com/openai/openai-openapi/tree/manual_spec)
```bash
oasdiff breaking --fail-on ERR docs/static/llama-stack-spec.yaml https://raw.githubusercontent.com/openai/openai-openapi/refs/heads/manual_spec/openapi.yaml --strip-prefix-base "/v1/openai/v1" \
--match-path '(^/v1/openai/v1/conversations.*|^/conversations.*)'
```
Note: I still have some uncertainty about this. I borrowed this command
from @cdoern on https://github.com/llamastack/llama-stack/pull/3514 but
need to spend more time confirming that it works; at the moment it
suggests it does.
UPDATE on `oasdiff`: I investigated the OpenAI spec further, and it looks
like the spec does not currently list Conversations, so that analysis is
useless. Noting for future reference.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
now that we consolidated the providerspec types and got rid of
`AdapterSpec`, adjust external.md
BREAKING CHANGE: external providers must update their
`get_provider_spec` function to use `RemoteProviderSpec` properly
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
remove unused chat_completion implementations
vllm features ported -
- requires max_tokens be set, use config value
- set tool_choice to none if no tools provided
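A minimal sketch of the two ported behaviors, assuming request parameters are held in a plain dict. The function name `normalize_vllm_params` and the `config_max_tokens` argument are hypothetical and are not taken from the adapter code.

```python
from typing import Any


def normalize_vllm_params(params: dict[str, Any], config_max_tokens: int) -> dict[str, Any]:
    """Return a copy of `params` with the two vLLM adjustments applied (illustrative only)."""
    out = dict(params)
    # vLLM requires max_tokens; fall back to the configured value when it is unset.
    if out.get("max_tokens") is None:
        out["max_tokens"] = config_max_tokens
    # With no tools in the request, tool_choice is set to "none".
    if not out.get("tools"):
        out["tool_choice"] = "none"
    return out
```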
## Test Plan
ci
# What does this PR do?
- This PR implements keyword and hybrid search for Weaviate DB based on
its inbuilt functions.
- Added fixtures to conftest.py for Weaviate.
- Enabled integration tests for remote Weaviate on all 3 search modes.
Closes #3010
## Test Plan
Unit tests and integration tests should pass on this PR.
# What does this PR do?
Addresses Issue #3271 - "Starting LLS server locally on a terminal with
120 chars width results in an output with empty lines".
This removes the specific 150-character width limit specified for the
Console, and will now auto-detect the terminal width instead. Now the
formatting of Console output is consistent across different sizes of
terminal windows.
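For illustration, width auto-detection can be sketched with the standard library alone. This is a stand-in and not the server's Console code; the helper name `console_width` and the fallback of 80 columns are assumptions.

```python
import shutil


def console_width(fallback: int = 80) -> int:
    """Detect the current terminal width instead of hard-coding one (illustrative only)."""
    # get_terminal_size honors the COLUMNS environment variable, then queries the
    # terminal, and uses the fallback when neither source is available.
    return shutil.get_terminal_size(fallback=(fallback, 24)).columns
```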
Closes #3271
## Test Plan
Launching the server with several different sizes of terminal windows
results in Console output without unexpected spacing. e.g. `python -m
llama_stack.core.server.server /tmp/run.yaml --port 8321`
---------
Signed-off-by: Doug Edgar <dedgar@redhat.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
# What does this PR do?
add ModelsProtocolPrivate methods to OpenAIMixin
this will allow providers using OpenAIMixin to use a common interface
## Test Plan
ci w/ new tests
# What does this PR do?
Closes #3268
Closes #3498
When resuming from a previous response ID, we currently attempt to convert
the stored responses input to chat completion messages, which is
not always possible, e.g. for tool calls where some data is lost once
converted from chat completion message to responses input format.
This PR stores the chat completion messages that correspond to the
_last_ call to chat completion, which is sufficient to be resumed from
in the next responses API call, where we load these saved messages and
skip conversion entirely.
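The idea can be sketched with an in-memory store. The names `InMemoryMessageStore` and `build_messages` are hypothetical; the real change persists messages through the responses store rather than a dict.

```python
from typing import Any, Optional

Message = dict[str, Any]


class InMemoryMessageStore:
    """Keeps the chat completion messages of the last call, keyed by response ID."""

    def __init__(self) -> None:
        self._messages: dict[str, list[Message]] = {}

    def save(self, response_id: str, messages: list[Message]) -> None:
        self._messages[response_id] = list(messages)

    def load(self, response_id: str) -> list[Message]:
        return list(self._messages[response_id])


def build_messages(
    store: InMemoryMessageStore,
    new_input: list[Message],
    previous_response_id: Optional[str] = None,
) -> list[Message]:
    # Resume from the saved messages verbatim, so nothing (e.g. tool call data)
    # is lost to a round trip through the responses input format.
    history = store.load(previous_response_id) if previous_response_id else []
    return history + list(new_input)
```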
Separate issue to optimize storage:
https://github.com/llamastack/llama-stack/issues/3646
## Test Plan
existing CI tests
This is a sweeping change to clean up some gunk around our "Tool"
definitions.
First, we had two types, `Tool` and `ToolDef`. The first was a
"Resource" type for the registry, but we had stopped registering tools
inside the Registry long ago (and only registered ToolGroups). The
latter was for specifying tools for the Agents API. This PR removes the
former and adds an optional `toolgroup_id` field to the latter.
Secondly, as pointed out by @bbrowning in
https://github.com/llamastack/llama-stack/pull/3003#issuecomment-3245270132,
we were doing a lossy conversion from a full JSON schema from the MCP
tool specification into our ToolDefinition to send it to the model.
There is no necessity to do this -- we ourselves aren't doing any
execution at all but merely passing it to the chat completions API which
supports this. By doing this (and by doing it poorly), we encountered
limitations like not supporting array items, or not resolving $refs,
etc.
To fix this, we replaced the `parameters` field with `{ input_schema,
output_schema }`, which can be full-blown JSON schemas.
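A rough sketch of the new shape, assuming a simplified `ToolDef` dataclass. Only `input_schema`, `output_schema`, and `toolgroup_id` come from the description above; the remaining fields and the helper `to_chat_completion_tool` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToolDef:
    name: str
    description: Optional[str] = None
    input_schema: Optional[dict[str, Any]] = None   # full JSON schema, passed through as-is
    output_schema: Optional[dict[str, Any]] = None
    toolgroup_id: Optional[str] = None


def to_chat_completion_tool(tool: ToolDef) -> dict[str, Any]:
    """Build a chat completions tool entry without rewriting the schema."""
    return {
        "type": "function",
        "function": {
            "name": tool.name,
            "description": tool.description or "",
            # No lossy conversion: array items, $refs, etc. reach the model untouched.
            "parameters": tool.input_schema or {"type": "object", "properties": {}},
        },
    }
```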
Finally, there were some types in our llama-related chat format
conversion which needed some cleanup. We are taking this opportunity to
clean those up.
This PR is a substantial breaking change to the API. However, given our
window for introducing breaking changes, this suits us just fine. I will
be landing a concurrent `llama-stack-client` change as well since API
shapes are changing.
# What does this PR do?
* Adds stainless-llama-stack-spec.yaml for Stainless client generation,
which comprises stable + experimental APIs
## Test Plan
* Manual generation
## Description
Currently, the docs page has the home book opened by default. This PR
updates the .ts so that the sidebar books are collapsed when you first
open the webpage
# What does this PR do?
this was broken by #3631, re-enable this ability by only using oasdiff
when .skip != 'true'
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
* Updates code snippets for Dell distribution, fixing specific user home
directory in code (replacing with $HOME) and updates docker instructions
to use `docker` instead of `podman`.
## Test Plan
N.A.
Co-authored-by: Connor Hack <connorhack@fb.com>
# What does this PR do?
Spammy
## Test Plan
n/a
# What does this PR do?
the LiteLLMOpenAIMixin provides support for reading key from provider
data (headers users send).
this adds the same functionality to the OpenAIMixin.
this is infrastructure for migrating providers.
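As a rough illustration of reading a key from provider data, the sketch below parses a JSON header value. The header name, the field name passed by the caller, and the precedence of a configured key over the header are all assumptions; `resolve_api_key` is a hypothetical helper and not the mixin's API.

```python
import json
from typing import Mapping, Optional

# Assumed header name carrying JSON provider data.
PROVIDER_DATA_HEADER = "x-llamastack-provider-data"


def resolve_api_key(
    headers: Mapping[str, str],
    field: str,
    config_api_key: Optional[str] = None,
) -> Optional[str]:
    """Return the configured key, else the key found in provider data (illustrative only)."""
    if config_api_key:
        return config_api_key  # assumed precedence: config wins over headers
    # Header names are case-insensitive, so compare in lowercase.
    raw = next((v for k, v in headers.items() if k.lower() == PROVIDER_DATA_HEADER), None)
    if raw is None:
        return None
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    value = data.get(field)
    return value if isinstance(value, str) and value else None
```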
## Test Plan
ci w/ new tests
# What does this PR do?
Adds supplementary static content to root API spec pages. This is useful for giving context behind a specific API group, adding information on supported features or work in progress, etc.
This PR introduces supplementary information for Agents (experimental, deprecated) and Responses (stable) APIs.
## Test Plan
Documentation server renders rich static content for the Agents API group.

# What does this PR do?
First step towards cleaning up the API reference section of the docs.
- Separates API reference into 3 sections: stable (`v1`), experimental (`v1alpha` and `v1beta`), and deprecated (`deprecated=True`)
- Each section is accessible via the dropdown menu and `docs/api-overview`
<img width="1237" height="321" alt="Screenshot 2025-09-30 at 5 47 30 PM" src="https://github.com/user-attachments/assets/fe0e498c-b066-46ed-a48e-4739d3b6724c" />
<img width="860" height="510" alt="Screenshot 2025-09-30 at 5 47 49 PM" src="https://github.com/user-attachments/assets/a92a8d8c-94bf-42d5-9f5b-b47bb2b14f9c" />
- Deprecated APIs: Added styling to the sidebar, and a notice on the endpoint pages
<img width="867" height="428" alt="Screenshot 2025-09-30 at 5 47 43 PM" src="https://github.com/user-attachments/assets/9e6e050d-c782-461b-8084-5ff6496d7bd9" />
Closes #3628
TODO in follow-up PRs:
- Add the ability to annotate API groups with supplementary content (so we can have longer descriptions of complex APIs like Responses)
- Clean up docstrings to show API endpoints (or short semantic titles) in the sidebar
## Test Plan
- Local testing
- Made sure API conformance test still passes
# What does this PR do?
Given the rapidly changing nature of Llama Stack's APIs and the need to have clean, user-friendly API documentation, we want to split the API reference into 3 main buckets: stable, experimental, and deprecated. The most straightforward way to do this is to have several automatically generated doctrees, which introduces some complexity in testing APIs for backwards compatibility.
This PR updates the API conformance test to handle cases where the API schema is split into several files; it does not change the testing criteria.
## Test Plan
No developer-facing changes (all existing tests should pass)
# What does this PR do?
- categories like "core::server" are not recognized, so their level is not
set by 'all=debug'
- removed spammy telemetry debug logging
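A sketch of the intended lookup, assuming the spec is a `;`-separated list of `category=level` pairs. That format and the helper name `level_for` are assumptions; the point is only that `all` should also cover nested names such as `core::server`.

```python
def level_for(category: str, spec: str, default: str = "info") -> str:
    """Resolve the log level for a category from a spec string (illustrative only)."""
    explicit: dict[str, str] = {}
    fallback = default
    for part in spec.split(";"):
        if not part.strip():
            continue
        name, _, level = part.partition("=")
        name, level = name.strip().lower(), level.strip().lower()
        if name == "all":
            fallback = level          # applies to every category, nested or not
        else:
            explicit[name] = level    # any name is accepted, including "a::b"
    return explicit.get(category.lower(), fallback)
```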
## Test Plan
test server launched with LLAMA_STACK_LOGGING='all=debug'
# What does this PR do?
if the PR title has `!` or the footer of the commit has `BREAKING
CHANGE:`, skip conformance. This is documented in the API leveling
proposal
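The skip rule can be sketched as a small predicate. This assumes the `!` sits in the conventional-commit position (`type!:` or `type(scope)!:`), which is narrower than "has `!`"; `is_breaking` is a hypothetical name, and the actual check may live in the CI workflow.

```python
import re

# Matches titles such as "feat!: ..." or "feat(api)!: ..." (assumed convention).
_BREAKING_TITLE = re.compile(r"^\w+(\([^)]*\))?!:")


def is_breaking(pr_title: str, commit_message: str) -> bool:
    """True when conformance should be skipped because the change is declared breaking."""
    if _BREAKING_TITLE.match(pr_title):
        return True
    # A footer line that starts with the literal marker also counts.
    return any(line.startswith("BREAKING CHANGE:") for line in commit_message.splitlines())
```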
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
level the following APIs, keeping their old routes around as well until
0.4.0
1. datasetio to v1beta: used primarily by eval and training. Given that
training is v1alpha, and eval is v1alpha, datasetio is likely to change
in structure as real usages of the API spin up. Register, unregister, and
iter dataset are sparsely implemented, meaning the shape of those routes is
likely to change.
2. telemetry to v1alpha: telemetry has been going through many changes.
for example query_metrics was not even implemented until recently and
had to change its shape to work. putting this in v1beta will allow us to
fix functionality like OTEL, sqlite, etc. The routes themselves are set,
but the structure might change a bit
Signed-off-by: Charlie Doern <cdoern@redhat.com>