Merge branch 'main' into chroma

This commit is contained in:
Bwook (Byoungwook) Kim 2025-08-12 12:59:01 +09:00 committed by GitHub
commit 6b5ff1c861
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
25 changed files with 447 additions and 90 deletions

14
docs/_static/js/keyboard_shortcuts.js vendored Normal file
View file

@ -0,0 +1,14 @@
document.addEventListener('keydown', function(event) {
// command+K or ctrl+K
if ((event.metaKey || event.ctrlKey) && event.key === 'k') {
event.preventDefault();
document.querySelector('.search-input, .search-field, input[name="q"]').focus();
}
// forward slash
if (event.key === '/' &&
!event.target.matches('input, textarea, select')) {
event.preventDefault();
document.querySelector('.search-input, .search-field, input[name="q"]').focus();
}
});

View file

@ -131,6 +131,7 @@ html_static_path = ["../_static"]
def setup(app):
app.add_css_file("css/my_theme.css")
app.add_js_file("js/detect_theme.js")
app.add_js_file("js/keyboard_shortcuts.js")
def dockerhub_role(name, rawtext, text, lineno, inliner, options={}, content=[]):
url = f"https://hub.docker.com/r/llamastack/{text}"

View file

@ -2,14 +2,28 @@
```{include} ../../../CONTRIBUTING.md
```
See the [Adding a New API Provider](new_api_provider.md) which describes how to add new API providers to the Stack.
## Testing
See the [Test Page](testing.md) which describes how to test your changes.
```{toctree}
:maxdepth: 1
:hidden:
:caption: Testing
testing
```
## Adding a New Provider
See the [Adding a New API Provider Page](new_api_provider.md) which describes how to add new API providers to the Stack.
See the [Vector Database Page](new_vector_database.md) which describes how to add a new vector databases with Llama Stack.
See the [External Provider Page](../providers/external/index.md) which describes how to add external providers to the Stack.
```{toctree}
:maxdepth: 1
:hidden:
new_api_provider
testing
new_vector_database
```

View file

@ -0,0 +1,75 @@
# Adding a New Vector Database
This guide will walk you through the process of adding a new vector database to Llama Stack.
> **_NOTE:_** Here's an example Pull Request of the [Milvus Vector Database Provider](https://github.com/meta-llama/llama-stack/pull/1467).
Vector Database providers are used to store and retrieve vector embeddings. Vector databases are not limited to vector
search but can support keyword and hybrid search. Additionally, vector database can also support operations like
filtering, sorting, and aggregating vectors.
## Steps to Add a New Vector Database Provider
1. **Choose the Database Type**: Determine if your vector database is a remote service, inline, or both.
- Remote databases make requests to external services, while inline databases execute locally. Some providers support both.
2. **Implement the Provider**: Create a new provider class that inherits from `VectorDatabaseProvider` and implements the required methods.
- Implement methods for vector storage, retrieval, search, and any additional features your database supports.
- You will need to implement the following methods for `YourVectorIndex`:
- `YourVectorIndex.create()`
- `YourVectorIndex.initialize()`
- `YourVectorIndex.add_chunks()`
- `YourVectorIndex.delete_chunk()`
- `YourVectorIndex.query_vector()`
- `YourVectorIndex.query_keyword()`
- `YourVectorIndex.query_hybrid()`
- You will need to implement the following methods for `YourVectorIOAdapter`:
- `YourVectorIOAdapter.initialize()`
- `YourVectorIOAdapter.shutdown()`
- `YourVectorIOAdapter.list_vector_dbs()`
- `YourVectorIOAdapter.register_vector_db()`
- `YourVectorIOAdapter.unregister_vector_db()`
- `YourVectorIOAdapter.insert_chunks()`
- `YourVectorIOAdapter.query_chunks()`
- `YourVectorIOAdapter.delete_chunks()`
3. **Add to Registry**: Register your provider in the appropriate registry file.
- Update {repopath}`llama_stack/providers/registry/vector_io.py` to include your new provider.
```python
from llama_stack.providers.registry.specs import InlineProviderSpec
from llama_stack.providers.registry.api import Api
InlineProviderSpec(
api=Api.vector_io,
provider_type="inline::milvus",
pip_packages=["pymilvus>=2.4.10"],
module="llama_stack.providers.inline.vector_io.milvus",
config_class="llama_stack.providers.inline.vector_io.milvus.MilvusVectorIOConfig",
api_dependencies=[Api.inference],
optional_api_dependencies=[Api.files],
description="",
),
```
4. **Add Tests**: Create unit tests and integration tests for your provider in the `tests/` directory.
- Unit Tests
- By following the structure of the class methods, you will be able to easily run unit and integration tests for your database.
1. You have to configure the tests for your provide in `/tests/unit/providers/vector_io/conftest.py`.
2. Update the `vector_provider` fixture to include your provider if they are an inline provider.
3. Create a `your_vectorprovider_index` fixture that initializes your vector index.
4. Create a `your_vectorprovider_adapter` fixture that initializes your vector adapter.
5. Add your provider to the `vector_io_providers` fixture dictionary.
- Please follow the naming convention of `your_vectorprovider_index` and `your_vectorprovider_adapter` as the tests require this to execute properly.
- Integration Tests
- Integration tests are located in {repopath}`tests/integration`. These tests use the python client-SDK APIs (from the `llama_stack_client` package) to test functionality.
- The two set of integration tests are:
- `tests/integration/vector_io/test_vector_io.py`: This file tests registration, insertion, and retrieval.
- `tests/integration/vector_io/test_openai_vector_stores.py`: These tests are for OpenAI-compatible vector stores and test the OpenAI API compatibility.
- You will need to update `skip_if_provider_doesnt_support_openai_vector_stores` to include your provider as well as `skip_if_provider_doesnt_support_openai_vector_stores_search` to test the appropriate search functionality.
- Running the tests in the GitHub CI
- You will need to update the `.github/workflows/integration-vector-io-tests.yml` file to include your provider.
- If your provider is a remote provider, you will also have to add a container to spin up and run it in the action.
- Updating the pyproject.yml
- If you are adding tests for the `inline` provider you will have to update the `unit` group.
- `uv add new_pip_package --group unit`
- If you are adding tests for the `remote` provider you will have to update the `test` group, which is used in the GitHub CI for integration tests.
- `uv add new_pip_package --group test`
5. **Update Documentation**: Please update the documentation for end users
- Generate the provider documentation by running {repopath}`./scripts/provider_codegen.py`.
- Update the autogenerated content in the registry/vector_io.py file with information about your provider. Please see other providers for examples.

View file

@ -1,6 +1,8 @@
# Testing Llama Stack
```{include} ../../../tests/README.md
```
Tests are of three different kinds:
- Unit tests
- Provider focused integration tests
- Client SDK tests
```{include} ../../../tests/unit/README.md
```
```{include} ../../../tests/integration/README.md
```

View file

@ -29,6 +29,7 @@ remote_runpod
remote_sambanova
remote_tgi
remote_together
remote_vertexai
remote_vllm
remote_watsonx
```

View file

@ -0,0 +1,40 @@
# remote::vertexai
## Description
Google Vertex AI inference provider enables you to use Google's Gemini models through Google Cloud's Vertex AI platform, providing several advantages:
• Enterprise-grade security: Uses Google Cloud's security controls and IAM
• Better integration: Seamless integration with other Google Cloud services
• Advanced features: Access to additional Vertex AI features like model tuning and monitoring
• Authentication: Uses Google Cloud Application Default Credentials (ADC) instead of API keys
Configuration:
- Set VERTEX_AI_PROJECT environment variable (required)
- Set VERTEX_AI_LOCATION environment variable (optional, defaults to us-central1)
- Use Google Cloud Application Default Credentials or service account key
Authentication Setup:
Option 1 (Recommended): gcloud auth application-default login
Option 2: Set GOOGLE_APPLICATION_CREDENTIALS to service account key path
Available Models:
- vertex_ai/gemini-2.0-flash
- vertex_ai/gemini-2.5-flash
- vertex_ai/gemini-2.5-pro
## Configuration
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `project` | `<class 'str'>` | No | | Google Cloud project ID for Vertex AI |
| `location` | `<class 'str'>` | No | us-central1 | Google Cloud location for Vertex AI |
## Sample Configuration
```yaml
project: ${env.VERTEX_AI_PROJECT:=}
location: ${env.VERTEX_AI_LOCATION:=us-central1}
```

View file

@ -124,10 +124,7 @@ class ToolGroupsRoutingTable(CommonRoutingTableImpl, ToolGroups):
return toolgroup
async def unregister_toolgroup(self, toolgroup_id: str) -> None:
tool_group = await self.get_tool_group(toolgroup_id)
if tool_group is None:
raise ToolGroupNotFoundError(toolgroup_id)
await self.unregister_object(tool_group)
await self.unregister_object(await self.get_tool_group(toolgroup_id))
async def shutdown(self) -> None:
pass

View file

@ -14,6 +14,7 @@ distribution_spec:
- provider_type: remote::openai
- provider_type: remote::anthropic
- provider_type: remote::gemini
- provider_type: remote::vertexai
- provider_type: remote::groq
- provider_type: remote::sambanova
- provider_type: inline::sentence-transformers

View file

@ -65,6 +65,11 @@ providers:
provider_type: remote::gemini
config:
api_key: ${env.GEMINI_API_KEY:=}
- provider_id: ${env.VERTEX_AI_PROJECT:+vertexai}
provider_type: remote::vertexai
config:
project: ${env.VERTEX_AI_PROJECT:=}
location: ${env.VERTEX_AI_LOCATION:=us-central1}
- provider_id: groq
provider_type: remote::groq
config:

View file

@ -14,6 +14,7 @@ distribution_spec:
- provider_type: remote::openai
- provider_type: remote::anthropic
- provider_type: remote::gemini
- provider_type: remote::vertexai
- provider_type: remote::groq
- provider_type: remote::sambanova
- provider_type: inline::sentence-transformers

View file

@ -65,6 +65,11 @@ providers:
provider_type: remote::gemini
config:
api_key: ${env.GEMINI_API_KEY:=}
- provider_id: ${env.VERTEX_AI_PROJECT:+vertexai}
provider_type: remote::vertexai
config:
project: ${env.VERTEX_AI_PROJECT:=}
location: ${env.VERTEX_AI_LOCATION:=us-central1}
- provider_id: groq
provider_type: remote::groq
config:

View file

@ -56,6 +56,7 @@ ENABLED_INFERENCE_PROVIDERS = [
"fireworks",
"together",
"gemini",
"vertexai",
"groq",
"sambanova",
"anthropic",
@ -71,6 +72,7 @@ INFERENCE_PROVIDER_IDS = {
"tgi": "${env.TGI_URL:+tgi}",
"cerebras": "${env.CEREBRAS_API_KEY:+cerebras}",
"nvidia": "${env.NVIDIA_API_KEY:+nvidia}",
"vertexai": "${env.VERTEX_AI_PROJECT:+vertexai}",
}
@ -246,6 +248,14 @@ def get_distribution_template() -> DistributionTemplate:
"",
"Gemini API Key",
),
"VERTEX_AI_PROJECT": (
"",
"Google Cloud Project ID for Vertex AI",
),
"VERTEX_AI_LOCATION": (
"us-central1",
"Google Cloud Location for Vertex AI",
),
"SAMBANOVA_API_KEY": (
"",
"SambaNova API Key",

View file

@ -99,7 +99,8 @@ def parse_environment_config(env_config: str) -> dict[str, int]:
Dict[str, int]: A dictionary mapping categories to their log levels.
"""
category_levels = {}
for pair in env_config.split(";"):
delimiter = ","
for pair in env_config.split(delimiter):
if not pair.strip():
continue

View file

@ -15,6 +15,7 @@ from llama_stack.apis.safety import (
RunShieldResponse,
Safety,
SafetyViolation,
ShieldStore,
ViolationLevel,
)
from llama_stack.apis.shields import Shield
@ -32,6 +33,8 @@ PROMPT_GUARD_MODEL = "Prompt-Guard-86M"
class PromptGuardSafetyImpl(Safety, ShieldsProtocolPrivate):
shield_store: ShieldStore
def __init__(self, config: PromptGuardConfig, _deps) -> None:
self.config = config
@ -53,7 +56,7 @@ class PromptGuardSafetyImpl(Safety, ShieldsProtocolPrivate):
self,
shield_id: str,
messages: list[Message],
params: dict[str, Any] = None,
params: dict[str, Any],
) -> RunShieldResponse:
shield = await self.shield_store.get_shield(shield_id)
if not shield:
@ -117,8 +120,10 @@ class PromptGuardShield:
elif self.config.guard_type == PromptGuardType.jailbreak.value and score_malicious > self.threshold:
violation = SafetyViolation(
violation_level=ViolationLevel.ERROR,
violation_type=f"prompt_injection:malicious={score_malicious}",
violation_return_message="Sorry, I cannot do this.",
user_message="Sorry, I cannot do this.",
metadata={
"violation_type": f"prompt_injection:malicious={score_malicious}",
},
)
return RunShieldResponse(violation=violation)

View file

@ -213,6 +213,36 @@ def available_providers() -> list[ProviderSpec]:
description="Google Gemini inference provider for accessing Gemini models and Google's AI services.",
),
),
remote_provider_spec(
api=Api.inference,
adapter=AdapterSpec(
adapter_type="vertexai",
pip_packages=["litellm", "google-cloud-aiplatform"],
module="llama_stack.providers.remote.inference.vertexai",
config_class="llama_stack.providers.remote.inference.vertexai.VertexAIConfig",
provider_data_validator="llama_stack.providers.remote.inference.vertexai.config.VertexAIProviderDataValidator",
description="""Google Vertex AI inference provider enables you to use Google's Gemini models through Google Cloud's Vertex AI platform, providing several advantages:
Enterprise-grade security: Uses Google Cloud's security controls and IAM
Better integration: Seamless integration with other Google Cloud services
Advanced features: Access to additional Vertex AI features like model tuning and monitoring
Authentication: Uses Google Cloud Application Default Credentials (ADC) instead of API keys
Configuration:
- Set VERTEX_AI_PROJECT environment variable (required)
- Set VERTEX_AI_LOCATION environment variable (optional, defaults to us-central1)
- Use Google Cloud Application Default Credentials or service account key
Authentication Setup:
Option 1 (Recommended): gcloud auth application-default login
Option 2: Set GOOGLE_APPLICATION_CREDENTIALS to service account key path
Available Models:
- vertex_ai/gemini-2.0-flash
- vertex_ai/gemini-2.5-flash
- vertex_ai/gemini-2.5-pro""",
),
),
remote_provider_spec(
api=Api.inference,
adapter=AdapterSpec(

View file

@ -0,0 +1,15 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.
from .config import VertexAIConfig
async def get_adapter_impl(config: VertexAIConfig, _deps):
from .vertexai import VertexAIInferenceAdapter
impl = VertexAIInferenceAdapter(config)
await impl.initialize()
return impl

View file

@ -0,0 +1,45 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.
from typing import Any
from pydantic import BaseModel, Field
from llama_stack.schema_utils import json_schema_type
class VertexAIProviderDataValidator(BaseModel):
vertex_project: str | None = Field(
default=None,
description="Google Cloud project ID for Vertex AI",
)
vertex_location: str | None = Field(
default=None,
description="Google Cloud location for Vertex AI (e.g., us-central1)",
)
@json_schema_type
class VertexAIConfig(BaseModel):
project: str = Field(
description="Google Cloud project ID for Vertex AI",
)
location: str = Field(
default="us-central1",
description="Google Cloud location for Vertex AI",
)
@classmethod
def sample_run_config(
cls,
project: str = "${env.VERTEX_AI_PROJECT:=}",
location: str = "${env.VERTEX_AI_LOCATION:=us-central1}",
**kwargs,
) -> dict[str, Any]:
return {
"project": project,
"location": location,
}

View file

@ -0,0 +1,20 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.
from llama_stack.providers.utils.inference.model_registry import (
ProviderModelEntry,
)
# Vertex AI model IDs with vertex_ai/ prefix as required by litellm
LLM_MODEL_IDS = [
"vertex_ai/gemini-2.0-flash",
"vertex_ai/gemini-2.5-flash",
"vertex_ai/gemini-2.5-pro",
]
SAFETY_MODELS_ENTRIES = list[ProviderModelEntry]()
MODEL_ENTRIES = [ProviderModelEntry(provider_model_id=m) for m in LLM_MODEL_IDS] + SAFETY_MODELS_ENTRIES

View file

@ -0,0 +1,52 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.
from typing import Any
from llama_stack.apis.inference import ChatCompletionRequest
from llama_stack.providers.utils.inference.litellm_openai_mixin import (
LiteLLMOpenAIMixin,
)
from .config import VertexAIConfig
from .models import MODEL_ENTRIES
class VertexAIInferenceAdapter(LiteLLMOpenAIMixin):
def __init__(self, config: VertexAIConfig) -> None:
LiteLLMOpenAIMixin.__init__(
self,
MODEL_ENTRIES,
litellm_provider_name="vertex_ai",
api_key_from_config=None, # Vertex AI uses ADC, not API keys
provider_data_api_key_field="vertex_project", # Use project for validation
)
self.config = config
def get_api_key(self) -> str:
# Vertex AI doesn't use API keys, it uses Application Default Credentials
# Return empty string to let litellm handle authentication via ADC
return ""
async def _get_params(self, request: ChatCompletionRequest) -> dict[str, Any]:
# Get base parameters from parent
params = await super()._get_params(request)
# Add Vertex AI specific parameters
provider_data = self.get_request_provider_data()
if provider_data:
if getattr(provider_data, "vertex_project", None):
params["vertex_project"] = provider_data.vertex_project
if getattr(provider_data, "vertex_location", None):
params["vertex_location"] = provider_data.vertex_location
else:
params["vertex_project"] = self.config.project
params["vertex_location"] = self.config.location
# Remove api_key since Vertex AI uses ADC
params.pop("api_key", None)
return params

View file

@ -175,7 +175,7 @@ const handleSubmitWithContent = async (content: string) => {
return (
<div className="flex flex-col h-full max-w-4xl mx-auto">
<div className="mb-4 flex justify-between items-center">
<h1 className="text-2xl font-bold">Chat Playground</h1>
<h1 className="text-2xl font-bold">Chat Playground (Completions)</h1>
<div className="flex gap-2">
<Select value={selectedModel} onValueChange={setSelectedModel} disabled={isModelsLoading || isGenerating}>
<SelectTrigger className="w-[180px]">

View file

@ -6,6 +6,8 @@ import {
MoveUpRight,
Database,
MessageCircle,
Settings2,
Compass,
} from "lucide-react";
import Link from "next/link";
import { usePathname } from "next/navigation";
@ -22,15 +24,16 @@ import {
SidebarMenuItem,
SidebarHeader,
} from "@/components/ui/sidebar";
// Extracted Chat Playground item
const chatPlaygroundItem = {
title: "Chat Playground",
url: "/chat-playground",
icon: MessageCircle,
};
// Removed Chat Playground from log items
const logItems = [
const createItems = [
{
title: "Chat Playground",
url: "/chat-playground",
icon: MessageCircle,
},
];
const manageItems = [
{
title: "Chat Completions",
url: "/logs/chat-completions",
@ -53,77 +56,96 @@ const logItems = [
},
];
const optimizeItems: { title: string; url: string; icon: React.ElementType }[] = [
{
title: "Evaluations",
url: "",
icon: Compass,
},
{
title: "Fine-tuning",
url: "",
icon: Settings2,
},
];
interface SidebarItem {
title: string;
url: string;
icon: React.ElementType;
}
export function AppSidebar() {
const pathname = usePathname();
return (
<Sidebar>
<SidebarHeader>
<Link href="/">Llama Stack</Link>
</SidebarHeader>
<SidebarContent>
{/* Chat Playground as its own section */}
<SidebarGroup>
<SidebarGroupContent>
<SidebarMenu>
<SidebarMenuItem>
const renderSidebarItems = (items: SidebarItem[]) => {
return items.map((item) => {
const isActive = pathname.startsWith(item.url);
return (
<SidebarMenuItem key={item.title}>
<SidebarMenuButton
asChild
className={cn(
"justify-start",
isActive &&
"bg-gray-200 dark:bg-gray-700 hover:bg-gray-200 dark:hover:bg-gray-700 text-gray-900 dark:text-gray-100",
)}
>
<Link href={item.url}>
<item.icon
className={cn(
isActive && "text-gray-900 dark:text-gray-100",
"mr-2 h-4 w-4",
)}
/>
<span>{item.title}</span>
</Link>
</SidebarMenuButton>
</SidebarMenuItem>
);
});
};
return (
<Sidebar>
<SidebarHeader>
<Link href="/">Llama Stack</Link>
</SidebarHeader>
<SidebarContent>
<SidebarGroup>
<SidebarGroupLabel>Create</SidebarGroupLabel>
<SidebarGroupContent>
<SidebarMenu>{renderSidebarItems(createItems)}</SidebarMenu>
</SidebarGroupContent>
</SidebarGroup>
<SidebarGroup>
<SidebarGroupLabel>Manage</SidebarGroupLabel>
<SidebarGroupContent>
<SidebarMenu>{renderSidebarItems(manageItems)}</SidebarMenu>
</SidebarGroupContent>
</SidebarGroup>
<SidebarGroup>
<SidebarGroupLabel>Optimize</SidebarGroupLabel>
<SidebarGroupContent>
<SidebarMenu>
{optimizeItems.map((item) => (
<SidebarMenuItem key={item.title}>
<SidebarMenuButton
asChild
className={cn(
"justify-start",
pathname.startsWith(chatPlaygroundItem.url) &&
"bg-gray-200 dark:bg-gray-700 hover:bg-gray-200 dark:hover:bg-gray-700 text-gray-900 dark:text-gray-100",
)}
disabled
className="justify-start opacity-60 cursor-not-allowed"
>
<Link href={chatPlaygroundItem.url}>
<chatPlaygroundItem.icon
className={cn(
pathname.startsWith(chatPlaygroundItem.url) && "text-gray-900 dark:text-gray-100",
"mr-2 h-4 w-4",
)}
/>
<span>{chatPlaygroundItem.title}</span>
</Link>
<item.icon className="mr-2 h-4 w-4" />
<span>{item.title}</span>
<span className="ml-2 text-xs text-gray-500">(Coming Soon)</span>
</SidebarMenuButton>
</SidebarMenuItem>
</SidebarMenu>
</SidebarGroupContent>
</SidebarGroup>
{/* Logs section */}
<SidebarGroup>
<SidebarGroupLabel>Logs</SidebarGroupLabel>
<SidebarGroupContent>
<SidebarMenu>
{logItems.map((item) => {
const isActive = pathname.startsWith(item.url);
return (
<SidebarMenuItem key={item.title}>
<SidebarMenuButton
asChild
className={cn(
"justify-start",
isActive &&
"bg-gray-200 dark:bg-gray-700 hover:bg-gray-200 dark:hover:bg-gray-700 text-gray-900 dark:text-gray-100",
)}
>
<Link href={item.url}>
<item.icon
className={cn(
isActive && "text-gray-900 dark:text-gray-100",
"mr-2 h-4 w-4",
)}
/>
<span>{item.title}</span>
</Link>
</SidebarMenuButton>
</SidebarMenuItem>
);
})}
</SidebarMenu>
</SidebarGroupContent>
</SidebarGroup>
</SidebarContent>
</Sidebar>
))}
</SidebarMenu>
</SidebarGroupContent>
</SidebarGroup>
</SidebarContent>
</Sidebar>
);
}

View file

@ -266,7 +266,6 @@ exclude = [
"^llama_stack/providers/inline/post_training/common/validator\\.py$",
"^llama_stack/providers/inline/safety/code_scanner/",
"^llama_stack/providers/inline/safety/llama_guard/",
"^llama_stack/providers/inline/safety/prompt_guard/",
"^llama_stack/providers/inline/scoring/basic/",
"^llama_stack/providers/inline/scoring/braintrust/",
"^llama_stack/providers/inline/scoring/llm_as_judge/",

View file

@ -34,6 +34,7 @@ def skip_if_model_doesnt_support_openai_completion(client_with_models, model_id)
"remote::runpod",
"remote::sambanova",
"remote::tgi",
"remote::vertexai",
):
pytest.skip(f"Model {model_id} hosted by {provider.provider_type} doesn't support OpenAI completions.")

View file

@ -29,6 +29,7 @@ def skip_if_model_doesnt_support_completion(client_with_models, model_id):
"remote::openai",
"remote::anthropic",
"remote::gemini",
"remote::vertexai",
"remote::groq",
"remote::sambanova",
)