feat: (re-)enable Databricks inference adapter

The Databricks inference adapter was broken and would not start; see #3486.

- remove deprecated completion / chat_completion endpoints
- enable dynamic model listing w/o refresh, listing is not async
- use SecretStr instead of str for token
- backward-incompatible change: for consistency with the Databricks docs, rename env vars DATABRICKS_URL -> DATABRICKS_HOST and DATABRICKS_API_TOKEN -> DATABRICKS_TOKEN
- databricks urls are custom per user/org, add special recorder handling for databricks urls
- add integration test --setup databricks
- enable chat completions tests
- enable embeddings tests
- disable n > 1 tests
- disable embeddings base64 tests
- disable embeddings dimensions tests
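
Because the env var rename above is backward incompatible, deployments that still export the old names will silently get empty values. One way to bridge the change locally is sketched below. `resolve_databricks_env` and `RENAMED_ENV_VARS` are hypothetical names for illustration, not part of this commit or of llama-stack; only the four variable names come from the commit message.

```python
import os
import warnings
from collections.abc import Mapping

# Old name -> new name, as listed in the commit message.
RENAMED_ENV_VARS = {
    "DATABRICKS_URL": "DATABRICKS_HOST",
    "DATABRICKS_API_TOKEN": "DATABRICKS_TOKEN",
}


def resolve_databricks_env(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return values keyed by the new names, falling back to the old names.

    Hypothetical helper: the adapter itself only reads the new names.
    """
    resolved: dict[str, str] = {}
    for old_name, new_name in RENAMED_ENV_VARS.items():
        if env.get(new_name):
            # The new name wins whenever it is set and non-empty.
            resolved[new_name] = env[new_name]
        elif env.get(old_name):
            warnings.warn(
                f"{old_name} is no longer read; set {new_name} instead",
                stacklevel=2,
            )
            resolved[new_name] = env[old_name]
    return resolved
```

Calling `resolve_databricks_env()` before starting the stack and exporting the returned values would keep an old environment working while the warning prompts the rename.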

note: reasoning models, e.g. gpt-oss, fail because Databricks uses a custom, incompatible response format

test with: ./scripts/integration-tests.sh --stack-config server:ci-tests --setup databricks --subdirs inference --pattern openai

note: databricks needs to be manually added to the ci-tests distro for replay testing
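
Replay testing depends on recorded requests matching regardless of which workspace produced them, which is why the per-user/org Databricks URLs need special recorder handling. The sketch below shows one way such normalization could work. It is not the recorder code from this commit: `normalize_recording_url`, the placeholder host, and the list of workspace domain suffixes are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical placeholder; the real recorder may use a different scheme.
PLACEHOLDER_HOST = "databricks.invalid"

# Assumed workspace domain suffixes; adjust for your deployment.
DATABRICKS_SUFFIXES = (
    ".cloud.databricks.com",
    ".azuredatabricks.net",
    ".gcp.databricks.com",
)


def normalize_recording_url(url: str) -> str:
    """Replace a workspace-specific Databricks host with a fixed placeholder.

    URLs for other providers are returned unchanged. Any port and fragment
    on a Databricks URL are dropped in this sketch.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.endswith(DATABRICKS_SUFFIXES):
        return urlunsplit((parts.scheme, PLACEHOLDER_HOST, parts.path, parts.query, ""))
    return url
```

With this kind of mapping, two workspaces that call the same endpoint path produce the same recording key, so a recording made by one user can be replayed by another.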
Matthew Farrellee 2025-09-20 05:05:05 -04:00
parent e66103c09d
commit ae804ed5a8
25 changed files with 11650 additions and 102 deletions

@@ -6,7 +6,7 @@
 from typing import Any
-from pydantic import BaseModel, Field
+from pydantic import BaseModel, Field, SecretStr
 from llama_stack.schema_utils import json_schema_type
@@ -17,16 +17,16 @@ class DatabricksImplConfig(BaseModel):
         default=None,
         description="The URL for the Databricks model serving endpoint",
     )
-    api_token: str = Field(
-        default=None,
+    api_token: SecretStr = Field(
+        default=SecretStr(None),
         description="The Databricks API token",
     )
     @classmethod
     def sample_run_config(
         cls,
-        url: str = "${env.DATABRICKS_URL:=}",
-        api_token: str = "${env.DATABRICKS_API_TOKEN:=}",
+        url: str = "${env.DATABRICKS_HOST:=}",
+        api_token: str = "${env.DATABRICKS_TOKEN:=}",
         **kwargs: Any,
     ) -> dict[str, Any]:
         return {