Commit graph

7 commits

Author SHA1 Message Date
Matthew Farrellee
71d67a983e chore: make all remote inference provider configs RemoteInferenceProviderConfigs 2025-10-03 07:50:52 -04:00
Matthew Farrellee
d07ebce4d9
feat: (re-)enable Databricks inference adapter (#3500)
# What does this PR do?

add/enable the Databricks inference adapter

Databricks inference adapter was broken, closes #3486 

- remove deprecated completion / chat_completion endpoints
- enable dynamic model listing w/o refresh, listing is not async
- use SecretStr instead of str for token
- backward incompatible change: for consistency with databricks docs,
env DATABRICKS_URL -> DATABRICKS_HOST and DATABRICKS_API_TOKEN ->
DATABRICKS_TOKEN
- databricks urls are custom per user/org, add special recorder handling
for databricks urls
- add integration test --setup databricks
- enable chat completions tests
- enable embeddings tests
- disable n > 1 tests
- disable embeddings base64 tests
- disable embeddings dimensions tests

note: reasoning models, e.g. gpt oss, fail because databricks has a
custom, incompatible response format

## Test Plan

ci and 

```
./scripts/integration-tests.sh --stack-config server:ci-tests --setup databricks --subdirs inference --pattern openai
```

note: databricks needs to be manually added to the ci-tests distro for
replay testing
2025-09-23 15:37:23 -04:00
Ashwin Bharambe
9583f468f8
feat(starter)!: simplify starter distro; litellm model registry changes (#2916) 2025-07-25 15:02:04 -07:00
Ihar Hrachyshka
9e6561a1ec
chore: enable pyupgrade fixes (#1806)
# What does this PR do?

The goal of this PR is code base modernization.

Schema reflection code needed a minor adjustment to handle UnionTypes
and collections.abc.AsyncIterator. (Both are preferred for latest Python
releases.)

Note to reviewers: almost all changes here are automatically generated
by pyupgrade. Some additional unused imports were cleaned up. The only
change worth of note can be found under `docs/openapi_generator` and
`llama_stack/strong_typing/schema.py` where reflection code was updated
to deal with "newer" types.

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-05-01 14:23:50 -07:00
Ashwin Bharambe
d072b5fa0c
test: add unit test to ensure all config types are instantiable (#1601) 2025-03-12 22:29:58 -07:00
Ashwin Bharambe
314ee09ae3
chore: move all Llama Stack types from llama-models to llama-stack (#1098)
llama-models should have extremely minimal cruft. Its sole purpose
should be didactic -- show the simplest implementation of the llama
models and document the prompt formats, etc.

This PR is the complement to
https://github.com/meta-llama/llama-models/pull/279

## Test Plan

Ensure all `llama` CLI `model` sub-commands work:

```bash
llama model list
llama model download --model-id ...
llama model prompt-format -m ...
```

Ran tests:
```bash
cd tests/client-sdk
LLAMA_STACK_CONFIG=fireworks pytest -s -v inference/
LLAMA_STACK_CONFIG=fireworks pytest -s -v vector_io/
LLAMA_STACK_CONFIG=fireworks pytest -s -v agents/
```

Create a fresh venv `uv venv && source .venv/bin/activate` and run
`llama stack build --template fireworks --image-type venv` followed by
`llama stack run together --image-type venv` <-- the server runs

Also checked that the OpenAPI generator can run and there is no change
in the generated files as a result.

```bash
cd docs/openapi_generator
sh run_openapi_generator.sh
```
2025-02-14 09:10:59 -08:00
Ashwin Bharambe
994732e2e0
impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00
Renamed from llama_stack/providers/adapters/inference/databricks/config.py (Browse further)