Commit graph

27 commits

Author SHA1 Message Date
Francisco Arceo
a20e8eac8c
feat: Add OpenAI Conversations API (#3429)
# What does this PR do?

Initial implementation for `Conversations` and `ConversationItems` using
`AuthorizedSqlStore` with endpoints to:
- CREATE
- UPDATE
- GET/RETRIEVE/LIST
- DELETE

Set `level=LLAMA_STACK_API_V1`.

NOTE: This does not currently incorporate changes for Responses, that'll
be done in a subsequent PR.

Closes https://github.com/llamastack/llama-stack/issues/3235

## Test Plan
- Unit tests
- Integration tests

Also comparison of [OpenAPI spec for OpenAI
API](https://github.com/openai/openai-openapi/tree/manual_spec)
```bash
oasdiff breaking --fail-on ERR docs/static/llama-stack-spec.yaml https://raw.githubusercontent.com/openai/openai-openapi/refs/heads/manual_spec/openapi.yaml --strip-prefix-base "/v1/openai/v1" \
--match-path '(^/v1/openai/v1/conversations.*|^/conversations.*)'
```

Note I still have some uncertainty about this, I borrowed this info from
@cdoern on https://github.com/llamastack/llama-stack/pull/3514 but need
to spend more time to confirm it's working, at the moment it suggests it
does.

UPDATE on `oasdiff`, I investigated the OpenAI spec further and it looks
like currently the spec does not list Conversations, so that analysis is
useless. Noting for future reference.

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-10-03 08:47:18 -07:00
Ashwin Bharambe
ef0736527d
feat(tools)!: substantial clean up of "Tool" related datatypes (#3627)
This is a sweeping change to clean up some gunk around our "Tool"
definitions.

First, we had two types `Tool` and `ToolDef`. The first of these was a
"Resource" type for the registry but we had stopped registering tools
inside the Registry long back (and only registered ToolGroups.) The
latter was for specifying tools for the Agents API. This PR removes the
former and adds an optional `toolgroup_id` field to the latter.

Secondly, as pointed out by @bbrowning in
https://github.com/llamastack/llama-stack/pull/3003#issuecomment-3245270132,
we were doing a lossy conversion from a full JSON schema from the MCP
tool specification into our ToolDefinition to send it to the model.
There is no necessity to do this -- we ourselves aren't doing any
execution at all but merely passing it to the chat completions API which
supports this. By doing this (and by doing it poorly), we encountered
limitations like not supporting array items, or not resolving $refs,
etc.

To fix this, we replaced the `parameters` field by `{ input_schema,
output_schema }` which can be full blown JSON schemas.

Finally, there were some types in our llama-related chat format
conversion which needed some cleanup. We are taking this opportunity to
clean those up.

This PR is a substantial breaking change to the API. However, given our
window for introducing breaking changes, this suits us just fine. I will
be landing a concurrent `llama-stack-client` change as well since API
shapes are changing.
2025-10-02 15:12:03 -07:00
Ashwin Bharambe
6afa96b0b9 fix(api): fix a mistake from #3636 which overwrote POST /responses 2025-10-02 13:03:17 -07:00
Alexey Rybak
24ee577cb0
docs: API spec generation for Stainless (#3655)
# What does this PR do?
* Adds stainless-llama-stack-spec.yaml for Stainless client generation,
which comprises stable + experimental APIs

## Test Plan
* Manual generation
2025-10-02 09:25:09 -07:00
Sébastien Han
4161102100
chore!: add double routes for v1/openai/v1 (#3636)
So that users get a warning in 0.3.0 and we remove them in 0.4.0.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-10-02 16:11:05 +02:00
Alexey Rybak
cb36b3bab1
docs: add favicon and mobile styling (#3650)
# What does this PR do?
* Adds favicon
* Replaces old llama-stack theme image 
* Adds some mobile styling

## Test Plan
* Manual testing
2025-10-02 10:42:54 +02:00
Alexey Rybak
28bbbcf2c1
docs: adding supplementary markdown content to API specs (#3632)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 2s
Python Package Build Test / build (3.13) (push) Failing after 2s
Python Package Build Test / build (3.12) (push) Failing after 3s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (push) Failing after 4s
API Conformance Tests / check-schema-compatibility (push) Successful in 8s
Test External API and Providers / test-external (venv) (push) Failing after 5s
Unit Tests / unit-tests (3.12) (push) Failing after 5s
Unit Tests / unit-tests (3.13) (push) Failing after 4s
UI Tests / ui-tests (22) (push) Successful in 45s
Pre-commit / pre-commit (push) Successful in 1m27s
# What does this PR do?

Adds supplementary static content to root API spec pages. This is useful for giving context behind a specific API group, adding information on supported features or work in progress, etc.

This PR introduces supplementary information for Agents (experimental, deprecated) and Responses (stable) APIs.

<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->

<!-- If resolving an issue, uncomment and update the line below -->

<!-- Closes #[issue-number] -->

## Test Plan

Documentation server renders rich static content for the Agents API group:

![image.png](https://app.graphite.dev/user-attachments/assets/fc521619-0320-4a22-9409-8ee3fb57ed0e.png)

<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
2025-10-01 10:15:30 -07:00
Alexey Rybak
b6a5bccadf
docs: api separation (#3630)
# What does this PR do?

First step towards cleaning up the API reference section of the docs.

- Separates API reference into 3 sections: stable (`v1`), experimental (`v1alpha` and `v1beta`), and deprecated (`deprecated=True`)
- Each section is accessible via the dropdown menu and `docs/api-overview`

<img width="1237" height="321" alt="Screenshot 2025-09-30 at 5 47 30 PM" src="https://github.com/user-attachments/assets/fe0e498c-b066-46ed-a48e-4739d3b6724c" />

<img width="860" height="510" alt="Screenshot 2025-09-30 at 5 47 49 PM" src="https://github.com/user-attachments/assets/a92a8d8c-94bf-42d5-9f5b-b47bb2b14f9c" />

- Deprecated APIs: Added styling to the sidebar, and a notice on the endpoint pages

<img width="867" height="428" alt="Screenshot 2025-09-30 at 5 47 43 PM" src="https://github.com/user-attachments/assets/9e6e050d-c782-461b-8084-5ff6496d7bd9" />

Closes #3628

TODO in follow-up PRs:

- Add the ability to annotate API groups with supplementary content  (so we can have longer descriptions of complex APIs like Responses)
- Clean up docstrings to show API endpoints (or short semantic titles) in the sidebar

## Test Plan

- Local testing
- Made sure API conformance test still passes
2025-10-01 10:13:31 -07:00
Charlie Doern
d167101e70
feat(api): implement v1beta leveling, and additional alpha (#3594)
# What does this PR do?

level the following APIs, keeping their old routes around as well until
0.4.0

1. datasetio to v1beta: used primarily by eval and training. Given that
training is v1alpha, and eval is v1alpha, datasetio is likely to change
in structure as real usages of the API spin up. Register,unregister, and
iter dataset is sparsely implemented meaning the shape of that route is
likely to change.

2. telemetry to v1alpha: telemetry has been going through many changes.
for example query_metrics was not even implemented until recently and
had to change its shape to work. putting this in v1beta will allow us to
fix functionality like OTEL, sqlite, etc. The routes themselves are set,
but the structure might change a bit

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-10-01 09:18:11 -07:00
grs
d350e3662b
feat: add support for require_approval argument when creating response (#3608)
# What does this PR do?
This PR adds support for the require_approval on an mcp tool definition
passed to create response in the Responses API. This allows the caller
to indicate whether they want to approve calls to that server, or let
them be called without approval.

Closes #3443

## Test Plan
Tested both approval and denial.
Added automated integration test for both cases.

---------

Signed-off-by: Gordon Sim <gsim@redhat.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
2025-09-30 14:18:34 -07:00
slekkala1
cc64093ae4
feat(api): Add Vector Store File batches api stub (#3615)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Python Package Build Test / build (3.12) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 7s
Python Package Build Test / build (3.13) (push) Failing after 2s
Vector IO Integration Tests / test-matrix (push) Failing after 4s
Test External API and Providers / test-external (venv) (push) Failing after 4s
Unit Tests / unit-tests (3.12) (push) Failing after 4s
Unit Tests / unit-tests (3.13) (push) Failing after 4s
UI Tests / ui-tests (22) (push) Successful in 34s
Pre-commit / pre-commit (push) Successful in 1m14s
# What does this PR do?
Adding api stubs for vector store file batches apis
https://github.com/llamastack/llama-stack/issues/3533
API Ref:
https://platform.openai.com/docs/api-reference/vector-stores-file-batches

## Test Plan
CI
2025-09-30 12:07:33 -07:00
Charlie Doern
1e25a72ece
feat(api): level /agents as v1alpha (#3610)
# What does this PR do?

agents is likely to be deprecated in favor of responses. Lets level it
as alpha to indicate the lack of longterm support

keep v1 route for backwards compat.

Closes #3611

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-09-30 11:15:04 -07:00
Matthew Farrellee
cb33f45c11
chore: unpublish /inference/chat-completion (#3609)
# What does this PR do?

BREAKING CHANGE: removes /inference/chat-completion route and updates
relevant documentation

## Test Plan

🤷
2025-09-30 11:00:42 -07:00
Ashwin Bharambe
56b625d18a
feat(openai_movement)!: Change URL structures to kill /openai/v1 (part 2) (#3605) 2025-09-29 22:57:37 -07:00
Ashwin Bharambe
3a09f00cdb
feat(files): fix expires_after API shape (#3604)
This was just quite incorrect. See source here:
https://platform.openai.com/docs/api-reference/files/create
2025-09-29 21:29:15 -07:00
Ashwin Bharambe
5e7fed8bbb
feat(openai_movement): Change URL structures to kill /openai/v1 (part 1) (#3587)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Python Package Build Test / build (3.12) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3s
Python Package Build Test / build (3.13) (push) Failing after 2s
API Conformance Tests / check-schema-compatibility (push) Successful in 6s
Vector IO Integration Tests / test-matrix (push) Failing after 4s
Pre-commit / pre-commit (push) Successful in 1m19s
Test External API and Providers / test-external (venv) (push) Failing after 3s
Unit Tests / unit-tests (3.12) (push) Failing after 3s
Unit Tests / unit-tests (3.13) (push) Failing after 4s
UI Tests / ui-tests (22) (push) Successful in 38s
The `/v1/openai/v1` prefix is annoying and now unnecessary given our
clearer focus on how to think about the API surface.

Let's kill it for the 0.3.0 update.

To make client-side changes feasible, we will do this in two parts. This
part adds a new route (sans `/openai/v1`) so the existing client
continues to work since the server supports both.

The next PR will be client-side (Stainless) changes which I will be
making shortly.

The final PR will remove the `/openai/v1` routes. 

Note that all these changes will happen rapidly within this release
cycle. The entire set _will be backwards incompatible_.
2025-09-29 16:14:35 -07:00
slekkala1
455579a88e
fix: Remove deprecated user param in OpenAIResponseObject (#3596)
# What does this PR do?
Just removing the deprecated User param in `OpenAIResponseObject`

Closing https://github.com/llamastack/llama-stack/issues/3482

## Test Plan
CI
2025-09-29 13:55:59 -07:00
Alexey Rybak
498be131a1
docs: update image paths (#3599)
# What does this PR do?
* Updates image paths for images in docs/resources/ to proper static
image locations

## Test Plan
* `npm run build` builds documentation properly
2025-09-29 13:14:05 -07:00
Charlie Doern
aac42ddcc2
feat(api): level inference/rerank and remove experimental (#3565)
# What does this PR do?

inference/rerank is the one route in the API intended to not be
deprecated. Level it as v1alpha.

Additionally, remove `experimental` and opt to instead use `v1alpha`
which itself implies an experimental state based on the original
proposal

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-09-29 12:42:09 -07:00
Matthew Farrellee
975ead1d6a
chore(api): remove deprecated embeddings impls (#3301)
Some checks failed
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s
Python Package Build Test / build (3.12) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3s
Vector IO Integration Tests / test-matrix (push) Failing after 4s
API Conformance Tests / check-schema-compatibility (push) Successful in 7s
Unit Tests / unit-tests (3.13) (push) Failing after 4s
Test External API and Providers / test-external (venv) (push) Failing after 4s
Python Package Build Test / build (3.13) (push) Failing after 9s
Unit Tests / unit-tests (3.12) (push) Failing after 10s
UI Tests / ui-tests (22) (push) Successful in 39s
Pre-commit / pre-commit (push) Successful in 1m25s
# What does this PR do?

remove deprecated embeddings implementations
2025-09-29 14:45:09 -04:00
Tami Takamiya
65f7b81e98
feat: Add items and title to ToolParameter/ToolParamDefinition (#3003)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 17s
Python Package Build Test / build (3.12) (push) Failing after 17s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 19s
Unit Tests / unit-tests (3.13) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (push) Failing after 20s
Test External API and Providers / test-external (venv) (push) Failing after 3s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 19s
Python Package Build Test / build (3.13) (push) Failing after 16s
Unit Tests / unit-tests (3.12) (push) Failing after 16s
API Conformance Tests / check-schema-compatibility (push) Successful in 25s
UI Tests / ui-tests (22) (push) Successful in 50s
Pre-commit / pre-commit (push) Successful in 1m16s
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Add items and title to ToolParameter/ToolParamDefinition. Adding items
will resolve the issue that occurs with Gemini LLM when an MCP tool has
array-type properties.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Unite test cases will be added.

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Kai Wu <kaiwu@meta.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-09-27 11:35:29 -07:00
Matthew Farrellee
53b15725b6
chore(apis): unpublish deprecated /v1/inference apis (#3297)
# What does this PR do?

unpublish (make unavailable to users) the following apis -
 - `/v1/inference/completion`, replaced by `/v1/openai/v1/completions`
- `/v1/inference/chat-completion`, replaced by
`/v1/openai/v1/chat/completions`
 - `/v1/inference/embeddings`, replaced by `/v1/openai/v1/embeddings`
 - `/v1/inference/batch-completion`, replaced by `/v1/openai/v1/batches`
- `/v1/inference/batch-chat-completion`, replaced by
`/v1/openai/v1/batches`

note: the implementations are still available for internal use, e.g.
agents uses chat-completion.
2025-09-27 11:20:06 -07:00
Matthew Farrellee
60484c5c4e
chore(api): remove batch inference (#3261)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
Vector IO Integration Tests / test-matrix (push) Failing after 4s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 4s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 3s
Unit Tests / unit-tests (3.12) (push) Failing after 3s
Unit Tests / unit-tests (3.13) (push) Failing after 3s
Test Llama Stack Build / build (push) Failing after 3s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 1s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3s
Test Llama Stack Build / generate-matrix (push) Successful in 3s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Test Llama Stack Build / build-single-provider (push) Failing after 4s
Python Package Build Test / build (3.12) (push) Failing after 1s
API Conformance Tests / check-schema-compatibility (push) Successful in 7s
Python Package Build Test / build (3.13) (push) Failing after 1s
Test External API and Providers / test-external (venv) (push) Failing after 4s
UI Tests / ui-tests (22) (push) Successful in 39s
Pre-commit / pre-commit (push) Successful in 1m18s
# What does this PR do?

APIs removed:
 - POST /v1/batch-inference/completion
 - POST /v1/batch-inference/chat-completion
 - POST /v1/inference/batch-completion
 - POST /v1/inference/batch-chat-completion

note -
- batch-completion & batch-chat-completion were only implemented for
inference=inline::meta-reference
 - batch-inference were not implemented
2025-09-26 14:35:34 -07:00
Charlie Doern
c88c4ff2c6
feat: introduce API leveling, post_training, eval to v1alpha (#3449)
# What does this PR do?

Rather than have a single `LLAMA_STACK_VERSION`, we need to have a
`_V1`, `_V1ALPHA`, and `_V1BETA` constant.

This also necessitated addition of `level` to the `WebMethod` so that
routing can be handeled properly.


For backwards compat, the `v1` routes are being kept around and marked
as `deprecated`. When used, the server will log a deprecation warning.

Deprecation log:

<img width="1224" height="134" alt="Screenshot 2025-09-25 at 2 43 36 PM"
src="https://github.com/user-attachments/assets/0cc7c245-dafc-48f0-be99-269fb9a686f9"
/>

move:
1. post_training to `v1alpha` as it is under heavy development and not
near its final state
2. eval: job scheduling is not implemented. Relies heavily on the
datasetio API which is under development missing implementations of
specific routes indicating the structure of those routes might change.
Additionally eval depends on the `inference` API which is going to be
deprecated, eval will likely need a major API surface change to conform
to using completions properly

implements leveling in #3317 

note: integration tests will fail until the SDK is regenerated with
v1alpha/inference as opposed to v1/inference

## Test Plan

existing tests should pass with newly generated schema. Conformance will
also pass as these routes are not the ones we currently test for
stability

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-09-26 16:18:07 +02:00
Alexey Rybak
6101c8e015
docs: fix broken links (#3540)
# What does this PR do?

<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->

<!-- If resolving an issue, uncomment and update the line below -->

<!-- Closes #[issue-number] -->

- Fixes broken links and Docusaurus search

Closes #3518

## Test Plan

The following should produce a clean build with no warnings and search enabled:

```
npm install
npm run gen-api-docs all
npm run build
npm run serve
```

<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
2025-09-24 14:16:31 -07:00
Alexey Rybak
610526d6d7
docs: static content migration (#3535)
# What does this PR do?

- Migrates static content from Sphinx to Docusaurus

<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->

<!-- If resolving an issue, uncomment and update the line below -->

<!-- Closes #[issue-number] -->

## Test Plan



<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
2025-09-24 14:08:50 -07:00
Alexey Rybak
0a7d1adfee
fix: update OpenAPI generator (#3527)
# What does this PR do?

<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->

<!-- If resolving an issue, uncomment and update the line below -->

<!-- Closes #[issue-number] -->

Updates OpenAPI generator to use summaries and changed the file generation path. 

## Test Plan

- docs/openapi_generator/run_openapi_generator.sh

<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
2025-09-24 13:57:27 -07:00