Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-17 15:12:35 +00:00)

commit 432ec7d20c (parent cffc4edf47)

    BREAKING CHANGE: Migrate Vector DBs to vector store ID

    Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

49 changed files with 2325 additions and 466 deletions

.github/workflows/pre-commit.yml (vendored, 2 lines changed)

@@ -37,7 +37,7 @@ jobs:
           .pre-commit-config.yaml

       - name: Set up Node.js
-        uses: actions/setup-node@39370e3970a6d050c480ffad4ff0ed4d3fdee5af # v4.1.0
+        uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4.4.0
         with:
           node-version: '20'
           cache: 'npm'

.github/workflows/python-build-test.yml (vendored, 2 lines changed)

@@ -24,7 +24,7 @@ jobs:
         uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

       - name: Install uv
-        uses: astral-sh/setup-uv@d9e0f98d3fc6adb07d1e3d37f3043649ddad06a1 # v6.5.0
+        uses: astral-sh/setup-uv@4959332f0f014c5280e7eac8b70c90cb574c9f9b # v6.6.0
         with:
           python-version: ${{ matrix.python-version }}
           activate-environment: true

.github/workflows/semantic-pr.yml (vendored, 2 lines changed)

@@ -22,6 +22,6 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: Check PR Title's semantic conformance
-        uses: amannn/action-semantic-pull-request@7f33ba792281b034f64e96f4c0b5496782dd3b37 # v6.1.0
+        uses: amannn/action-semantic-pull-request@48f256284bd46cdaab1048c3721360e808335d50 # v6.1.1
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

@@ -33,7 +33,7 @@ The list of open-benchmarks we currently support:
 - [MMMU](https://arxiv.org/abs/2311.16502) (A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI)]: Benchmark designed to evaluate multimodal models.


-You can follow this [contributing guide](https://llama-stack.readthedocs.io/en/latest/references/evals_reference/index.html#open-benchmark-contributing-guide) to add more open-benchmarks to Llama Stack
+You can follow this [contributing guide](../references/evals_reference/index.md#open-benchmark-contributing-guide) to add more open-benchmarks to Llama Stack

 #### Run evaluation on open-benchmarks via CLI

@@ -35,3 +35,6 @@ device: cpu

 ```
+
+[Find more detailed information here!](huggingface.md)
+

@@ -22,3 +22,4 @@ checkpoint_format: meta

 ```
+
+[Find more detailed information here!](torchtune.md)

@@ -88,7 +88,7 @@ Interactive pages for users to play with and explore Llama Stack API capabilitie
 - **API Resources**: Inspect Llama Stack API resources
   - This page allows you to inspect Llama Stack API resources (`models`, `datasets`, `memory_banks`, `benchmarks`, `shields`).
   - Under the hood, it uses Llama Stack's `/<resources>/list` API to get information about each resources.
-  - Please visit [Core Concepts](https://llama-stack.readthedocs.io/en/latest/concepts/index.html) for more details about the resources.
+  - Please visit [Core Concepts](../../concepts/index.md) for more details about the resources.

 ### Starting the Llama Stack Playground

@@ -3,7 +3,7 @@
 Llama Stack (LLS) provides two different APIs for building AI applications with tool calling capabilities: the **Agents API** and the **OpenAI Responses API**. While both enable AI systems to use tools, and maintain full conversation history, they serve different use cases and have distinct characteristics.

 ```{note}
-For simple and basic inferencing, you may want to use the [Chat Completions API](https://llama-stack.readthedocs.io/en/latest/providers/index.html#chat-completions) directly, before progressing to Agents or Responses API.
+**Note:** For simple and basic inferencing, you may want to use the [Chat Completions API](../providers/openai.md#chat-completions) directly, before progressing to Agents or Responses API.
 ```

 ## Overview

@@ -173,7 +173,7 @@ Both APIs demonstrate distinct strengths that make them valuable on their own fo

 ## For More Information

-- **LLS Agents API**: For detailed information on creating and managing agents, see the [Agents documentation](https://llama-stack.readthedocs.io/en/latest/building_applications/agent.html)
+- **LLS Agents API**: For detailed information on creating and managing agents, see the [Agents documentation](agent.md)
 - **OpenAI Responses API**: For information on using the OpenAI-compatible responses API, see the [OpenAI API documentation](https://platform.openai.com/docs/api-reference/responses)
-- **Chat Completions API**: For the default backend API used by Agents, see the [Chat Completions providers documentation](https://llama-stack.readthedocs.io/en/latest/providers/index.html#chat-completions)
-- **Agent Execution Loop**: For understanding how agents process turns and steps in their execution, see the [Agent Execution Loop documentation](https://llama-stack.readthedocs.io/en/latest/building_applications/agent_execution_loop.html)
+- **Chat Completions API**: For the default backend API used by Agents, see the [Chat Completions providers documentation](../providers/openai.md#chat-completions)
+- **Agent Execution Loop**: For understanding how agents process turns and steps in their execution, see the [Agent Execution Loop documentation](agent_execution_loop.md)

@@ -6,4 +6,4 @@ While there is a lot of flexibility to mix-and-match providers, often users will

 **Locally Hosted Distro**: You may want to run Llama Stack on your own hardware. Typically though, you still need to use Inference via an external service. You can use providers like HuggingFace TGI, Fireworks, Together, etc. for this purpose. Or you may have access to GPUs and can run a [vLLM](https://github.com/vllm-project/vllm) or [NVIDIA NIM](https://build.nvidia.com/nim?filters=nimType%3Anim_type_run_anywhere&q=llama) instance. If you "just" have a regular desktop machine, you can use [Ollama](https://ollama.com/) for inference. To provide convenient quick access to these options, we provide a number of such pre-configured locally-hosted Distros.

-**On-device Distro**: To run Llama Stack directly on an edge device (mobile phone or a tablet), we provide Distros for [iOS](https://llama-stack.readthedocs.io/en/latest/distributions/ondevice_distro/ios_sdk.html) and [Android](https://llama-stack.readthedocs.io/en/latest/distributions/ondevice_distro/android_sdk.html)
+**On-device Distro**: To run Llama Stack directly on an edge device (mobile phone or a tablet), we provide Distros for [iOS](../distributions/ondevice_distro/ios_sdk.md) and [Android](../distributions/ondevice_distro/android_sdk.md)

@@ -14,6 +14,13 @@ Here are some example PRs to help you get started:
 - [Nvidia Inference Implementation](https://github.com/meta-llama/llama-stack/pull/355)
 - [Model context protocol Tool Runtime](https://github.com/meta-llama/llama-stack/pull/665)

+## Guidelines for creating Internal or External Providers
+
+|**Type** |Internal (In-tree) |External (out-of-tree)|
+|---------|-------------------|----------------------|
+|**Description** |A provider that is directly in the Llama Stack code|A provider that is outside of the Llama stack core codebase but is still accessible and usable by Llama Stack.|
+|**Benefits** |Ability to interact with the provider with minimal additional configurations or installations|Contributors do not have to add directly to the code to create providers accessible on Llama Stack. Keep provider-specific code separate from the core Llama Stack code.|
+
 ## Inference Provider Patterns

 When implementing Inference providers for OpenAI-compatible APIs, Llama Stack provides several mixin classes to simplify development and ensure consistent behavior across providers.

@@ -27,7 +27,7 @@ Then, you can access the APIs like `models` and `inference` on the client and ca
 response = client.models.list()
 ```

-If you've created a [custom distribution](https://llama-stack.readthedocs.io/en/latest/distributions/building_distro.html), you can also use the run.yaml configuration file directly:
+If you've created a [custom distribution](building_distro.md), you can also use the run.yaml configuration file directly:

 ```python
 client = LlamaStackAsLibraryClient(config_path)

@@ -22,17 +22,17 @@ else
 fi

 if [ -z "${GITHUB_CLIENT_ID:-}" ]; then
-  echo "ERROR: GITHUB_CLIENT_ID not set. You need it for Github login to work. Refer to https://llama-stack.readthedocs.io/en/latest/deploying/index.html#kubernetes-deployment-guide"
+  echo "ERROR: GITHUB_CLIENT_ID not set. You need it for Github login to work. See the Kubernetes Deployment Guide in the Llama Stack documentation."
   exit 1
 fi

 if [ -z "${GITHUB_CLIENT_SECRET:-}" ]; then
-  echo "ERROR: GITHUB_CLIENT_SECRET not set. You need it for Github login to work. Refer to https://llama-stack.readthedocs.io/en/latest/deploying/index.html#kubernetes-deployment-guide"
+  echo "ERROR: GITHUB_CLIENT_SECRET not set. You need it for Github login to work. See the Kubernetes Deployment Guide in the Llama Stack documentation."
   exit 1
 fi

 if [ -z "${LLAMA_STACK_UI_URL:-}" ]; then
-  echo "ERROR: LLAMA_STACK_UI_URL not set. Should be set to the external URL of the UI (excluding port). You need it for Github login to work. Refer to https://llama-stack.readthedocs.io/en/latest/deploying/index.html#kubernetes-deployment-guide"
+  echo "ERROR: LLAMA_STACK_UI_URL not set. Should be set to the external URL of the UI (excluding port). You need it for Github login to work. See the Kubernetes Deployment Guide in the Llama Stack documentation."
   exit 1
 fi

@@ -66,7 +66,7 @@ llama stack run starter --port 5050

 Ensure the Llama Stack server version is the same as the Kotlin SDK Library for maximum compatibility.

-Other inference providers: [Table](https://llama-stack.readthedocs.io/en/latest/index.html#supported-llama-stack-implementations)
+Other inference providers: [Table](../../index.md#supported-llama-stack-implementations)

 How to set remote localhost in Demo App: [Settings](https://github.com/meta-llama/llama-stack-client-kotlin/tree/latest-release/examples/android_app#settings)

@@ -2,7 +2,7 @@
 orphan: true
 ---
 <!-- This file was auto-generated by distro_codegen.py, please edit source -->
-# Meta Reference Distribution
+# Meta Reference GPU Distribution

 ```{toctree}
 :maxdepth: 2

@@ -41,7 +41,7 @@ The following environment variables can be configured:

 ## Prerequisite: Downloading Models

-Please use `llama model list --downloaded` to check that you have llama model checkpoints downloaded in `~/.llama` before proceeding. See [installation guide](https://llama-stack.readthedocs.io/en/latest/references/llama_cli_reference/download_models.html) here to download the models. Run `llama model list` to see the available models to download, and `llama model download` to download the checkpoints.
+Please use `llama model list --downloaded` to check that you have llama model checkpoints downloaded in `~/.llama` before proceeding. See [installation guide](../../references/llama_cli_reference/download_models.md) here to download the models. Run `llama model list` to see the available models to download, and `llama model download` to download the checkpoints.

 ```
 $ llama model list --downloaded

@@ -9,7 +9,6 @@ This section contains documentation for all available providers for the **post_t
 ```{toctree}
 :maxdepth: 1

-inline_huggingface-cpu
 inline_huggingface-gpu
 inline_torchtune-cpu
 inline_torchtune-gpu

@@ -202,7 +202,7 @@ pprint(response)

 Llama Stack offers a library of scoring functions and the `/scoring` API, allowing you to run evaluations on your pre-annotated AI application datasets.

-In this example, we will work with an example RAG dataset you have built previously, label with an annotation, and use LLM-As-Judge with custom judge prompt for scoring. Please checkout our [Llama Stack Playground](https://llama-stack.readthedocs.io/en/latest/playground/index.html) for an interactive interface to upload datasets and run scorings.
+In this example, we will work with an example RAG dataset you have built previously, label with an annotation, and use LLM-As-Judge with custom judge prompt for scoring. Please checkout our [Llama Stack Playground](../../building_applications/playground/index.md) for an interactive interface to upload datasets and run scorings.

 ```python
 judge_model_id = "meta-llama/Llama-3.1-405B-Instruct-FP8"

@@ -80,7 +80,7 @@ def get_provider_dependencies(
     normal_deps = []
     special_deps = []
     for package in deps:
-        if "--no-deps" in package or "--index-url" in package:
+        if any(f in package for f in ["--no-deps", "--index-url", "--extra-index-url"]):
             special_deps.append(package)
         else:
             normal_deps.append(package)
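
As a quick illustration of the new classification rule, here is a minimal standalone sketch (not the actual build code; the sample dependency strings are only examples):

```python
# Sketch of the dependency split: any requirement string carrying a pip flag
# is routed to special_deps so its flags can be passed through to the installer.
SPECIAL_FLAGS = ["--no-deps", "--index-url", "--extra-index-url"]

def split_deps(deps: list[str]) -> tuple[list[str], list[str]]:
    normal: list[str] = []
    special: list[str] = []
    for package in deps:
        (special if any(f in package for f in SPECIAL_FLAGS) else normal).append(package)
    return normal, special

normal, special = split_deps([
    "numpy",
    "torch torchvision torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu",
])
assert normal == ["numpy"] and len(special) == 1
```

The third flag, `--extra-index-url`, is the new addition; the provider registry changes further down now emit requirement strings that carry it.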

@@ -52,7 +52,6 @@ class VectorDBsRoutingTable(CommonRoutingTableImpl, VectorDBs):
         provider_vector_db_id: str | None = None,
         vector_db_name: str | None = None,
     ) -> VectorDB:
-        provider_vector_db_id = provider_vector_db_id or vector_db_id
         if provider_id is None:
             if len(self.impls_by_provider_id) > 0:
                 provider_id = list(self.impls_by_provider_id.keys())[0]

@@ -69,14 +68,33 @@ class VectorDBsRoutingTable(CommonRoutingTableImpl, VectorDBs):
             raise ModelTypeError(embedding_model, model.model_type, ModelType.embedding)
         if "embedding_dimension" not in model.metadata:
             raise ValueError(f"Model {embedding_model} does not have an embedding dimension")

+        provider = self.impls_by_provider_id[provider_id]
+        logger.warning(
+            "VectorDB is being deprecated in future releases in favor of VectorStore. Please migrate your usage accordingly."
+        )
+        vector_store = await provider.openai_create_vector_store(
+            name=vector_db_name or vector_db_id,
+            embedding_model=embedding_model,
+            embedding_dimension=model.metadata["embedding_dimension"],
+            provider_id=provider_id,
+            provider_vector_db_id=provider_vector_db_id,
+        )
+
+        vector_store_id = vector_store.id
+        actual_provider_vector_db_id = provider_vector_db_id or vector_store_id
+        logger.warning(
+            f"Ignoring vector_db_id {vector_db_id} and using vector_store_id {vector_store_id} instead. Setting VectorDB {vector_db_id} to VectorDB.vector_db_name"
+        )
+
         vector_db_data = {
-            "identifier": vector_db_id,
+            "identifier": vector_store_id,
             "type": ResourceType.vector_db.value,
             "provider_id": provider_id,
-            "provider_resource_id": provider_vector_db_id,
+            "provider_resource_id": actual_provider_vector_db_id,
             "embedding_model": embedding_model,
             "embedding_dimension": model.metadata["embedding_dimension"],
-            "vector_db_name": vector_db_name,
+            "vector_db_name": vector_store.name,
         }
         vector_db = TypeAdapter(VectorDBWithOwner).validate_python(vector_db_data)
         await self.register_object(vector_db)
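
This hunk is the heart of the breaking change: `register_vector_db` now delegates creation to the provider's OpenAI-compatible vector-store API, and the registered identifier becomes the store-generated ID rather than the caller-supplied `vector_db_id`. A minimal sketch of the caller-visible effect (assuming the standard llama-stack Python client; the exact ID format shown is an assumption):

```python
from llama_stack_client import LlamaStackClient  # assumed client package

client = LlamaStackClient(base_url="http://localhost:8321")

# Before this commit, the registered identifier echoed the requested ID.
# After it, the identifier is the vector store ID generated by the provider,
# and the requested ID survives only as the store's name (vector_db_name).
vector_db = client.vector_dbs.register(
    vector_db_id="my-documents",        # now effectively just the store's name
    embedding_model="all-MiniLM-L6-v2",
    embedding_dimension=384,
)
print(vector_db.identifier)  # e.g. "vs_..." (format is provider-defined)
```

Callers that persisted the old caller-chosen identifier need to start reading the identifier off the registration response instead.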

@@ -225,7 +225,10 @@ def replace_env_vars(config: Any, path: str = "") -> Any:

         try:
             result = re.sub(pattern, get_env_var, config)
-            return _convert_string_to_proper_type(result)
+            # Only apply type conversion if substitution actually happened
+            if result != config:
+                return _convert_string_to_proper_type(result)
+            return result
         except EnvVarError as e:
             raise EnvVarError(e.var_name, e.path) from None
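
The guard matters because the converter coerces strings like "true" or "123" into booleans and integers; before this fix, literal string config values that merely looked typed were converted even when they contained no `${env.*}` reference. A self-contained sketch of the intended behavior (the converter here is a simplified stand-in, and the real pattern also supports `:=` defaults):

```python
import re

def _convert(s: str):
    # Stand-in for _convert_string_to_proper_type.
    if s in ("true", "false"):
        return s == "true"
    try:
        return int(s)
    except ValueError:
        return s

def replace_env_vars(config: str, env: dict[str, str]):
    pattern = r"\$\{env\.(\w+)\}"
    result = re.sub(pattern, lambda m: env[m.group(1)], config)
    # Only convert when a substitution actually happened.
    return _convert(result) if result != config else result

assert replace_env_vars("${env.PORT}", {"PORT": "8321"}) == 8321  # substituted, so converted
assert replace_env_vars("8321", {}) == "8321"                     # literal, left as a string
```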

@@ -34,7 +34,7 @@ distribution_spec:
   telemetry:
   - provider_type: inline::meta-reference
   post_training:
-  - provider_type: inline::huggingface-cpu
+  - provider_type: inline::torchtune-cpu
  eval:
  - provider_type: inline::meta-reference
  datasetio:

@@ -156,13 +156,10 @@ providers:
       sqlite_db_path: ${env.SQLITE_STORE_DIR:=~/.llama/distributions/ci-tests}/trace_store.db
       otel_exporter_otlp_endpoint: ${env.OTEL_EXPORTER_OTLP_ENDPOINT:=}
   post_training:
-  - provider_id: huggingface-cpu
-    provider_type: inline::huggingface-cpu
+  - provider_id: torchtune-cpu
+    provider_type: inline::torchtune-cpu
     config:
-      checkpoint_format: huggingface
-      distributed_backend: null
-      device: cpu
-      dpo_output_dir: ~/.llama/distributions/ci-tests/dpo_output
+      checkpoint_format: meta
   eval:
   - provider_id: meta-reference
     provider_type: inline::meta-reference

@@ -1,7 +1,7 @@
 ---
 orphan: true
 ---
-# Meta Reference Distribution
+# Meta Reference GPU Distribution

 ```{toctree}
 :maxdepth: 2

@@ -29,7 +29,7 @@ The following environment variables can be configured:

 ## Prerequisite: Downloading Models

-Please use `llama model list --downloaded` to check that you have llama model checkpoints downloaded in `~/.llama` before proceeding. See [installation guide](https://llama-stack.readthedocs.io/en/latest/references/llama_cli_reference/download_models.html) here to download the models. Run `llama model list` to see the available models to download, and `llama model download` to download the checkpoints.
+Please use `llama model list --downloaded` to check that you have llama model checkpoints downloaded in `~/.llama` before proceeding. See [installation guide](../../references/llama_cli_reference/download_models.md) here to download the models. Run `llama model list` to see the available models to download, and `llama model download` to download the checkpoints.

 ```
 $ llama model list --downloaded

@@ -35,7 +35,7 @@ distribution_spec:
   telemetry:
   - provider_type: inline::meta-reference
   post_training:
-  - provider_type: inline::torchtune-gpu
+  - provider_type: inline::huggingface-gpu
  eval:
  - provider_type: inline::meta-reference
  datasetio:

@@ -156,10 +156,13 @@ providers:
       sqlite_db_path: ${env.SQLITE_STORE_DIR:=~/.llama/distributions/starter-gpu}/trace_store.db
       otel_exporter_otlp_endpoint: ${env.OTEL_EXPORTER_OTLP_ENDPOINT:=}
   post_training:
-  - provider_id: torchtune-gpu
-    provider_type: inline::torchtune-gpu
+  - provider_id: huggingface-gpu
+    provider_type: inline::huggingface-gpu
     config:
-      checkpoint_format: meta
+      checkpoint_format: huggingface
+      distributed_backend: null
+      device: cpu
+      dpo_output_dir: ~/.llama/distributions/starter-gpu/dpo_output
   eval:
   - provider_id: meta-reference
     provider_type: inline::meta-reference

@@ -17,6 +17,6 @@ def get_distribution_template() -> DistributionTemplate:
     template.description = "Quick start template for running Llama Stack with several popular providers. This distribution is intended for GPU-enabled environments."

     template.providers["post_training"] = [
-        BuildProvider(provider_type="inline::torchtune-gpu"),
+        BuildProvider(provider_type="inline::huggingface-gpu"),
     ]
     return template

@@ -35,7 +35,7 @@ distribution_spec:
   telemetry:
   - provider_type: inline::meta-reference
   post_training:
-  - provider_type: inline::huggingface-cpu
+  - provider_type: inline::torchtune-cpu
  eval:
  - provider_type: inline::meta-reference
  datasetio:

@@ -156,13 +156,10 @@ providers:
       sqlite_db_path: ${env.SQLITE_STORE_DIR:=~/.llama/distributions/starter}/trace_store.db
       otel_exporter_otlp_endpoint: ${env.OTEL_EXPORTER_OTLP_ENDPOINT:=}
   post_training:
-  - provider_id: huggingface-cpu
-    provider_type: inline::huggingface-cpu
+  - provider_id: torchtune-cpu
+    provider_type: inline::torchtune-cpu
     config:
-      checkpoint_format: huggingface
-      distributed_backend: null
-      device: cpu
-      dpo_output_dir: ~/.llama/distributions/starter/dpo_output
+      checkpoint_format: meta
   eval:
   - provider_id: meta-reference
     provider_type: inline::meta-reference

@@ -120,7 +120,7 @@ def get_distribution_template() -> DistributionTemplate:
         ],
         "agents": [BuildProvider(provider_type="inline::meta-reference")],
         "telemetry": [BuildProvider(provider_type="inline::meta-reference")],
-        "post_training": [BuildProvider(provider_type="inline::huggingface-cpu")],
+        "post_training": [BuildProvider(provider_type="inline::torchtune-cpu")],
         "eval": [BuildProvider(provider_type="inline::meta-reference")],
         "datasetio": [
             BuildProvider(provider_type="remote::huggingface"),

@@ -40,8 +40,9 @@ def available_providers() -> list[ProviderSpec]:
         InlineProviderSpec(
             api=Api.inference,
             provider_type="inline::sentence-transformers",
+            # CrossEncoder depends on torchao.quantization
             pip_packages=[
-                "torch torchvision --index-url https://download.pytorch.org/whl/cpu",
+                "torch torchvision torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu",
                 "sentence-transformers --no-deps",
             ],
             module="llama_stack.providers.inline.inference.sentence_transformers",
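
The switch from `--index-url` to `--extra-index-url` changes pip's resolution semantics: `--index-url` replaces PyPI entirely as the package index, while `--extra-index-url` adds the PyTorch CPU index alongside PyPI, so packages that are not hosted on the CPU index (presumably `torchao` at the required version) can still resolve from PyPI. The `get_provider_dependencies` change above is what routes this new flag into the special-dependency path.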

@@ -13,7 +13,7 @@ from llama_stack.providers.datatypes import AdapterSpec, Api, InlineProviderSpec
 # The CPU version is used for distributions that don't have GPU support -- they result in smaller container images.
 torchtune_def = dict(
     api=Api.post_training,
-    pip_packages=["torchtune==0.5.0", "torchao==0.8.0", "numpy"],
+    pip_packages=["numpy"],
     module="llama_stack.providers.inline.post_training.torchtune",
     config_class="llama_stack.providers.inline.post_training.torchtune.TorchtunePostTrainingConfig",
     api_dependencies=[

@@ -23,9 +23,32 @@ torchtune_def = dict(
     description="TorchTune-based post-training provider for fine-tuning and optimizing models using Meta's TorchTune framework.",
 )

-huggingface_def = dict(
-    api=Api.post_training,
-    pip_packages=["trl", "transformers", "peft", "datasets"],
+
+def available_providers() -> list[ProviderSpec]:
+    return [
+        InlineProviderSpec(
+            **{  # type: ignore
+                **torchtune_def,
+                "provider_type": "inline::torchtune-cpu",
+                "pip_packages": (
+                    cast(list[str], torchtune_def["pip_packages"])
+                    + ["torch torchtune>=0.5.0 torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu"]
+                ),
+            },
+        ),
+        InlineProviderSpec(
+            **{  # type: ignore
+                **torchtune_def,
+                "provider_type": "inline::torchtune-gpu",
+                "pip_packages": (
+                    cast(list[str], torchtune_def["pip_packages"]) + ["torch torchtune>=0.5.0 torchao>=0.12.0"]
+                ),
+            },
+        ),
+        InlineProviderSpec(
+            api=Api.post_training,
+            provider_type="inline::huggingface-gpu",
+            pip_packages=["trl", "transformers", "peft", "datasets", "torch"],
             module="llama_stack.providers.inline.post_training.huggingface",
             config_class="llama_stack.providers.inline.post_training.huggingface.HuggingFacePostTrainingConfig",
             api_dependencies=[

@@ -33,46 +56,6 @@ huggingface_def = dict(
             Api.datasets,
         ],
         description="HuggingFace-based post-training provider for fine-tuning models using the HuggingFace ecosystem.",
-)
-
-
-def available_providers() -> list[ProviderSpec]:
-    return [
-        InlineProviderSpec(
-            **{
-                **torchtune_def,
-                "provider_type": "inline::torchtune-cpu",
-                "pip_packages": (
-                    cast(list[str], torchtune_def["pip_packages"])
-                    + ["torch torchtune==0.5.0 torchao==0.8.0 --index-url https://download.pytorch.org/whl/cpu"]
-                ),
-            },
-        ),
-        InlineProviderSpec(
-            **{
-                **huggingface_def,
-                "provider_type": "inline::huggingface-cpu",
-                "pip_packages": (
-                    cast(list[str], huggingface_def["pip_packages"])
-                    + ["torch --index-url https://download.pytorch.org/whl/cpu"]
-                ),
-            },
-        ),
-        InlineProviderSpec(
-            **{
-                **torchtune_def,
-                "provider_type": "inline::torchtune-gpu",
-                "pip_packages": (
-                    cast(list[str], torchtune_def["pip_packages"]) + ["torch torchtune==0.5.0 torchao==0.8.0"]
-                ),
-            },
-        ),
-        InlineProviderSpec(
-            **{
-                **huggingface_def,
-                "provider_type": "inline::huggingface-gpu",
-                "pip_packages": (cast(list[str], huggingface_def["pip_packages"]) + ["torch"]),
-            },
-        ),
+        ),
         remote_provider_spec(
             api=Api.post_training,
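
The refactor keeps one base definition per framework and derives CPU/GPU variants by spreading the base dict and overriding only `provider_type` and `pip_packages`. A minimal standalone sketch of that pattern (hypothetical fields, not the real ProviderSpec):

```python
from typing import Any, cast

base_def: dict[str, Any] = {
    "api": "post_training",
    "pip_packages": ["numpy"],
    "module": "example.torchtune",
}

def make_variants(base: dict[str, Any]) -> list[dict[str, Any]]:
    # Each variant spreads the base and overrides the differing keys;
    # pip_packages is extended rather than replaced, so shared deps stay in one place.
    return [
        {**base, "provider_type": "inline::torchtune-cpu",
         "pip_packages": cast(list[str], base["pip_packages"])
         + ["torch torchtune>=0.5.0 --extra-index-url https://download.pytorch.org/whl/cpu"]},
        {**base, "provider_type": "inline::torchtune-gpu",
         "pip_packages": cast(list[str], base["pip_packages"]) + ["torch torchtune>=0.5.0"]},
    ]

for spec in make_variants(base_def):
    print(spec["provider_type"], spec["pip_packages"])
```

Note also what the last hunk drops: the `inline::huggingface-cpu` variant disappears entirely (matching the toctree and starter/ci-tests template changes above), and the remaining `inline::huggingface-gpu` spec is now written out directly instead of being derived from a `huggingface_def` base.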

@@ -9,7 +9,6 @@ from __future__ import annotations  # for forward references
 import hashlib
 import json
 import os
-import sqlite3
 from collections.abc import Generator
 from contextlib import contextmanager
 from enum import StrEnum

@@ -125,28 +124,13 @@ class ResponseStorage:
     def __init__(self, test_dir: Path):
         self.test_dir = test_dir
         self.responses_dir = self.test_dir / "responses"
-        self.db_path = self.test_dir / "index.sqlite"

         self._ensure_directories()
-        self._init_database()

     def _ensure_directories(self):
         self.test_dir.mkdir(parents=True, exist_ok=True)
         self.responses_dir.mkdir(exist_ok=True)

-    def _init_database(self):
-        with sqlite3.connect(self.db_path) as conn:
-            conn.execute("""
-                CREATE TABLE IF NOT EXISTS recordings (
-                    request_hash TEXT PRIMARY KEY,
-                    response_file TEXT,
-                    endpoint TEXT,
-                    model TEXT,
-                    timestamp TEXT,
-                    is_streaming BOOLEAN
-                )
-            """)
-
     def store_recording(self, request_hash: str, request: dict[str, Any], response: dict[str, Any]):
         """Store a request/response pair."""
         # Generate unique response filename

@@ -169,34 +153,9 @@ class ResponseStorage:
             f.write("\n")
             f.flush()

-        # Update SQLite index
-        with sqlite3.connect(self.db_path) as conn:
-            conn.execute(
-                """
-                INSERT OR REPLACE INTO recordings
-                (request_hash, response_file, endpoint, model, timestamp, is_streaming)
-                VALUES (?, ?, ?, ?, datetime('now'), ?)
-                """,
-                (
-                    request_hash,
-                    response_file,
-                    request.get("endpoint", ""),
-                    request.get("model", ""),
-                    response.get("is_streaming", False),
-                ),
-            )
-
     def find_recording(self, request_hash: str) -> dict[str, Any] | None:
         """Find a recorded response by request hash."""
-        with sqlite3.connect(self.db_path) as conn:
-            result = conn.execute(
-                "SELECT response_file FROM recordings WHERE request_hash = ?", (request_hash,)
-            ).fetchone()
-
-        if not result:
-            return None
-
-        response_file = result[0]
+        response_file = f"{request_hash[:12]}.json"
         response_path = self.responses_dir / response_file

         if not response_path.exists():
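
With the SQLite index gone, recordings become content-addressed: the response filename is derived deterministically from the request hash, so lookup is a direct file probe instead of a database query. A compact sketch of the idea (standalone, assuming JSON-serializable requests):

```python
import hashlib
import json
from pathlib import Path

def request_hash(request: dict) -> str:
    # Stable hash over a canonical JSON encoding of the request.
    canonical = json.dumps(request, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

def response_path(responses_dir: Path, request: dict) -> Path:
    # The filename is derived from the hash prefix, so no index DB is needed;
    # the existence of the file is itself the index.
    return responses_dir / f"{request_hash(request)[:12]}.json"

p = response_path(Path("responses"), {"endpoint": "/v1/chat/completions", "model": "llama"})
print(p)  # e.g. responses/1a2b3c4d5e6f.json
```

The trade-off is that the metadata columns the index used to carry (endpoint, model, timestamp, streaming flag) must now live inside the JSON files themselves, and hash-prefix collisions are assumed to be negligible at 12 hex characters.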

llama_stack/ui/app/chat-playground/chunk-processor.test.tsx (new file, 610 lines)

@@ -0,0 +1,610 @@
import { describe, test, expect } from "@jest/globals";

// Extract the exact processChunk function implementation for testing
function createProcessChunk() {
  return (chunk: unknown): { text: string | null; isToolCall: boolean } => {
    const chunkObj = chunk as Record<string, unknown>;

    // Helper function to check if content contains function call JSON
    const containsToolCall = (content: string): boolean => {
      return (
        content.includes('"type": "function"') ||
        content.includes('"name": "knowledge_search"') ||
        content.includes('"parameters":') ||
        !!content.match(/\{"type":\s*"function".*?\}/)
      );
    };

    // Check if this chunk contains a tool call (function call)
    let isToolCall = false;

    // Check direct chunk content if it's a string
    if (typeof chunk === "string") {
      isToolCall = containsToolCall(chunk);
    }

    // Check delta structures
    if (
      chunkObj?.delta &&
      typeof chunkObj.delta === "object" &&
      chunkObj.delta !== null
    ) {
      const delta = chunkObj.delta as Record<string, unknown>;
      if ("tool_calls" in delta) {
        isToolCall = true;
      }
      if (typeof delta.text === "string") {
        if (containsToolCall(delta.text)) {
          isToolCall = true;
        }
      }
    }

    // Check event structures
    if (
      chunkObj?.event &&
      typeof chunkObj.event === "object" &&
      chunkObj.event !== null
    ) {
      const event = chunkObj.event as Record<string, unknown>;

      // Check event payload
      if (
        event?.payload &&
        typeof event.payload === "object" &&
        event.payload !== null
      ) {
        const payload = event.payload as Record<string, unknown>;
        if (typeof payload.content === "string") {
          if (containsToolCall(payload.content)) {
            isToolCall = true;
          }
        }

        // Check payload delta
        if (
          payload?.delta &&
          typeof payload.delta === "object" &&
          payload.delta !== null
        ) {
          const delta = payload.delta as Record<string, unknown>;
          if (typeof delta.text === "string") {
            if (containsToolCall(delta.text)) {
              isToolCall = true;
            }
          }
        }
      }

      // Check event delta
      if (
        event?.delta &&
        typeof event.delta === "object" &&
        event.delta !== null
      ) {
        const delta = event.delta as Record<string, unknown>;
        if (typeof delta.text === "string") {
          if (containsToolCall(delta.text)) {
            isToolCall = true;
          }
        }
        if (typeof delta.content === "string") {
          if (containsToolCall(delta.content)) {
            isToolCall = true;
          }
        }
      }
    }

    // if it's a tool call, skip it (don't display in chat)
    if (isToolCall) {
      return { text: null, isToolCall: true };
    }

    // Extract text content from various chunk formats
    let text: string | null = null;

    // Helper function to extract clean text content, filtering out function calls
    const extractCleanText = (content: string): string | null => {
      if (containsToolCall(content)) {
        try {
          // Try to parse and extract non-function call parts
          const jsonMatch = content.match(
            /\{"type":\s*"function"[^}]*\}[^}]*\}/
          );
          if (jsonMatch) {
            const jsonPart = jsonMatch[0];
            const parsedJson = JSON.parse(jsonPart);

            // If it's a function call, extract text after JSON
            if (parsedJson.type === "function") {
              const textAfterJson = content
                .substring(content.indexOf(jsonPart) + jsonPart.length)
                .trim();
              return textAfterJson || null;
            }
          }
          // If we can't parse it properly, skip the whole thing
          return null;
        } catch {
          return null;
        }
      }
      return content;
    };

    // Try direct delta text
    if (
      chunkObj?.delta &&
      typeof chunkObj.delta === "object" &&
      chunkObj.delta !== null
    ) {
      const delta = chunkObj.delta as Record<string, unknown>;
      if (typeof delta.text === "string") {
        text = extractCleanText(delta.text);
      }
    }

    // Try event structures
    if (
      !text &&
      chunkObj?.event &&
      typeof chunkObj.event === "object" &&
      chunkObj.event !== null
    ) {
      const event = chunkObj.event as Record<string, unknown>;

      // Try event payload content
      if (
        event?.payload &&
        typeof event.payload === "object" &&
        event.payload !== null
      ) {
        const payload = event.payload as Record<string, unknown>;

        // Try direct payload content
        if (typeof payload.content === "string") {
          text = extractCleanText(payload.content);
        }

        // Try turn_complete event structure: payload.turn.output_message.content
        if (
          !text &&
          payload?.turn &&
          typeof payload.turn === "object" &&
          payload.turn !== null
        ) {
          const turn = payload.turn as Record<string, unknown>;
          if (
            turn?.output_message &&
            typeof turn.output_message === "object" &&
            turn.output_message !== null
          ) {
            const outputMessage = turn.output_message as Record<
              string,
              unknown
            >;
            if (typeof outputMessage.content === "string") {
              text = extractCleanText(outputMessage.content);
            }
          }

          // Fallback to model_response in steps if no output_message
          if (
            !text &&
            turn?.steps &&
            Array.isArray(turn.steps) &&
            turn.steps.length > 0
          ) {
            for (const step of turn.steps) {
              if (step && typeof step === "object" && step !== null) {
                const stepObj = step as Record<string, unknown>;
                if (
                  stepObj?.model_response &&
                  typeof stepObj.model_response === "object" &&
                  stepObj.model_response !== null
                ) {
                  const modelResponse = stepObj.model_response as Record<
                    string,
                    unknown
                  >;
                  if (typeof modelResponse.content === "string") {
                    text = extractCleanText(modelResponse.content);
                    break;
                  }
                }
              }
            }
          }
        }

        // Try payload delta
        if (
          !text &&
          payload?.delta &&
          typeof payload.delta === "object" &&
          payload.delta !== null
        ) {
          const delta = payload.delta as Record<string, unknown>;
          if (typeof delta.text === "string") {
            text = extractCleanText(delta.text);
          }
        }
      }

      // Try event delta
      if (
        !text &&
        event?.delta &&
        typeof event.delta === "object" &&
        event.delta !== null
      ) {
        const delta = event.delta as Record<string, unknown>;
        if (typeof delta.text === "string") {
          text = extractCleanText(delta.text);
        }
        if (!text && typeof delta.content === "string") {
          text = extractCleanText(delta.content);
        }
      }
    }

    // Try choices structure (ChatML format)
    if (
      !text &&
      chunkObj?.choices &&
      Array.isArray(chunkObj.choices) &&
      chunkObj.choices.length > 0
    ) {
      const choice = chunkObj.choices[0] as Record<string, unknown>;
      if (
        choice?.delta &&
        typeof choice.delta === "object" &&
        choice.delta !== null
      ) {
        const delta = choice.delta as Record<string, unknown>;
        if (typeof delta.content === "string") {
          text = extractCleanText(delta.content);
        }
      }
    }

    // Try direct string content
    if (!text && typeof chunk === "string") {
      text = extractCleanText(chunk);
    }

    return { text, isToolCall: false };
  };
}

describe("Chunk Processor", () => {
  const processChunk = createProcessChunk();

  describe("Real Event Structures", () => {
    test("handles turn_complete event with cancellation policy response", () => {
      const chunk = {
        event: {
          payload: {
            event_type: "turn_complete",
            turn: {
              turn_id: "50a2d6b7-49ed-4d1e-b1c2-6d68b3f726db",
              session_id: "e7f62b8e-518c-4450-82df-e65fe49f27a3",
              input_messages: [
                {
                  role: "user",
                  content: "nice, what's the cancellation policy?",
                  context: null,
                },
              ],
              steps: [
                {
                  turn_id: "50a2d6b7-49ed-4d1e-b1c2-6d68b3f726db",
                  step_id: "54074310-af42-414c-9ffe-fba5b2ead0ad",
                  started_at: "2025-08-27T18:15:25.870703Z",
                  completed_at: "2025-08-27T18:15:51.288993Z",
                  step_type: "inference",
                  model_response: {
                    role: "assistant",
                    content:
                      "According to the search results, the cancellation policy for Red Hat Summit is as follows:\n\n* Cancellations must be received by 5 PM EDT on April 18, 2025 for a 50% refund of the registration fee.\n* No refunds will be given for cancellations received after 5 PM EDT on April 18, 2025.\n* Cancellation of travel reservations and hotel reservations are the responsibility of the registrant.",
                    stop_reason: "end_of_turn",
                    tool_calls: [],
                  },
                },
              ],
              output_message: {
                role: "assistant",
                content:
                  "According to the search results, the cancellation policy for Red Hat Summit is as follows:\n\n* Cancellations must be received by 5 PM EDT on April 18, 2025 for a 50% refund of the registration fee.\n* No refunds will be given for cancellations received after 5 PM EDT on April 18, 2025.\n* Cancellation of travel reservations and hotel reservations are the responsibility of the registrant.",
                stop_reason: "end_of_turn",
                tool_calls: [],
              },
              output_attachments: [],
              started_at: "2025-08-27T18:15:25.868548Z",
              completed_at: "2025-08-27T18:15:51.289262Z",
            },
          },
        },
      };

      const result = processChunk(chunk);
      expect(result.isToolCall).toBe(false);
      expect(result.text).toContain(
        "According to the search results, the cancellation policy for Red Hat Summit is as follows:"
      );
      expect(result.text).toContain("5 PM EDT on April 18, 2025");
    });

    test("handles turn_complete event with address response", () => {
      const chunk = {
        event: {
          payload: {
            event_type: "turn_complete",
            turn: {
              turn_id: "2f4a1520-8ecc-4cb7-bb7b-886939e042b0",
              session_id: "e7f62b8e-518c-4450-82df-e65fe49f27a3",
              input_messages: [
                {
                  role: "user",
                  content: "what's francisco's address",
                  context: null,
                },
              ],
              steps: [
                {
                  turn_id: "2f4a1520-8ecc-4cb7-bb7b-886939e042b0",
                  step_id: "c13dd277-1acb-4419-8fbf-d5e2f45392ea",
                  started_at: "2025-08-27T18:14:52.558761Z",
                  completed_at: "2025-08-27T18:15:11.306032Z",
                  step_type: "inference",
                  model_response: {
                    role: "assistant",
                    content:
                      "Francisco Arceo's address is:\n\nRed Hat\nUnited States\n17 Primrose Ln \nBasking Ridge New Jersey 07920",
                    stop_reason: "end_of_turn",
                    tool_calls: [],
                  },
                },
              ],
              output_message: {
                role: "assistant",
                content:
                  "Francisco Arceo's address is:\n\nRed Hat\nUnited States\n17 Primrose Ln \nBasking Ridge New Jersey 07920",
                stop_reason: "end_of_turn",
                tool_calls: [],
              },
              output_attachments: [],
              started_at: "2025-08-27T18:14:52.553707Z",
              completed_at: "2025-08-27T18:15:11.306729Z",
            },
          },
        },
      };

      const result = processChunk(chunk);
      expect(result.isToolCall).toBe(false);
      expect(result.text).toContain("Francisco Arceo's address is:");
      expect(result.text).toContain("17 Primrose Ln");
      expect(result.text).toContain("Basking Ridge New Jersey 07920");
    });

    test("handles turn_complete event with ticket cost response", () => {
      const chunk = {
        event: {
          payload: {
            event_type: "turn_complete",
            turn: {
              turn_id: "7ef244a3-efee-42ca-a9c8-942865251002",
              session_id: "e7f62b8e-518c-4450-82df-e65fe49f27a3",
              input_messages: [
                {
                  role: "user",
                  content: "what was the ticket cost for summit?",
                  context: null,
                },
              ],
              steps: [
                {
                  turn_id: "7ef244a3-efee-42ca-a9c8-942865251002",
                  step_id: "7651dda0-315a-472d-b1c1-3c2725f55bc5",
                  started_at: "2025-08-27T18:14:21.710611Z",
                  completed_at: "2025-08-27T18:14:39.706452Z",
                  step_type: "inference",
                  model_response: {
                    role: "assistant",
                    content:
                      "The ticket cost for the Red Hat Summit was $999.00 for a conference pass.",
                    stop_reason: "end_of_turn",
                    tool_calls: [],
                  },
                },
              ],
              output_message: {
                role: "assistant",
                content:
                  "The ticket cost for the Red Hat Summit was $999.00 for a conference pass.",
                stop_reason: "end_of_turn",
                tool_calls: [],
              },
              output_attachments: [],
              started_at: "2025-08-27T18:14:21.705289Z",
              completed_at: "2025-08-27T18:14:39.706752Z",
            },
          },
        },
      };

      const result = processChunk(chunk);
      expect(result.isToolCall).toBe(false);
      expect(result.text).toBe(
        "The ticket cost for the Red Hat Summit was $999.00 for a conference pass."
      );
    });
  });

  describe("Function Call Detection", () => {
    test("detects function calls in direct string chunks", () => {
      const chunk =
        '{"type": "function", "name": "knowledge_search", "parameters": {"query": "test"}}';
      const result = processChunk(chunk);
      expect(result.isToolCall).toBe(true);
      expect(result.text).toBe(null);
    });

    test("detects function calls in event payload content", () => {
      const chunk = {
        event: {
          payload: {
            content:
              '{"type": "function", "name": "knowledge_search", "parameters": {"query": "test"}}',
          },
        },
      };
      const result = processChunk(chunk);
      expect(result.isToolCall).toBe(true);
      expect(result.text).toBe(null);
    });

    test("detects tool_calls in delta structure", () => {
      const chunk = {
        delta: {
          tool_calls: [{ function: { name: "knowledge_search" } }],
        },
      };
      const result = processChunk(chunk);
      expect(result.isToolCall).toBe(true);
      expect(result.text).toBe(null);
    });

    test("detects function call in mixed content but skips it", () => {
      const chunk =
        '{"type": "function", "name": "knowledge_search", "parameters": {"query": "test"}} Based on the search results, here is your answer.';
      const result = processChunk(chunk);
      // This is detected as a tool call and skipped entirely - the implementation prioritizes safety
      expect(result.isToolCall).toBe(true);
      expect(result.text).toBe(null);
    });
  });

  describe("Text Extraction", () => {
    test("extracts text from direct string chunks", () => {
      const chunk = "Hello, this is a normal response.";
      const result = processChunk(chunk);
      expect(result.isToolCall).toBe(false);
      expect(result.text).toBe("Hello, this is a normal response.");
    });

    test("extracts text from delta structure", () => {
      const chunk = {
        delta: {
          text: "Hello, this is a normal response.",
        },
      };
      const result = processChunk(chunk);
      expect(result.isToolCall).toBe(false);
      expect(result.text).toBe("Hello, this is a normal response.");
    });

    test("extracts text from choices structure", () => {
      const chunk = {
        choices: [
          {
            delta: {
              content: "Hello, this is a normal response.",
            },
          },
        ],
      };
      const result = processChunk(chunk);
      expect(result.isToolCall).toBe(false);
      expect(result.text).toBe("Hello, this is a normal response.");
    });

    test("prioritizes output_message over model_response in turn structure", () => {
      const chunk = {
        event: {
          payload: {
            turn: {
              steps: [
                {
                  model_response: {
                    content: "Model response content.",
                  },
                },
              ],
              output_message: {
                content: "Final output message content.",
              },
            },
          },
        },
      };
      const result = processChunk(chunk);
      expect(result.isToolCall).toBe(false);
      expect(result.text).toBe("Final output message content.");
    });

    test("falls back to model_response when no output_message", () => {
      const chunk = {
        event: {
          payload: {
            turn: {
              steps: [
                {
                  model_response: {
                    content: "This is from the model response.",
                  },
                },
              ],
            },
          },
        },
      };
      const result = processChunk(chunk);
      expect(result.isToolCall).toBe(false);
      expect(result.text).toBe("This is from the model response.");
    });
  });

  describe("Edge Cases", () => {
    test("handles empty chunks", () => {
      const result = processChunk("");
      expect(result.isToolCall).toBe(false);
      expect(result.text).toBe("");
    });

    test("handles null chunks", () => {
      const result = processChunk(null);
      expect(result.isToolCall).toBe(false);
      expect(result.text).toBe(null);
    });

    test("handles undefined chunks", () => {
      const result = processChunk(undefined);
      expect(result.isToolCall).toBe(false);
      expect(result.text).toBe(null);
    });

    test("handles chunks with no text content", () => {
      const chunk = {
        event: {
          metadata: {
            timestamp: "2024-01-01",
          },
        },
      };
      const result = processChunk(chunk);
      expect(result.isToolCall).toBe(false);
      expect(result.text).toBe(null);
    });

    test("handles malformed JSON in function calls gracefully", () => {
      const chunk =
        '{"type": "function", "name": "knowledge_search"} incomplete json';
      const result = processChunk(chunk);
      expect(result.isToolCall).toBe(true);
      expect(result.text).toBe(null);
    });
  });
});
});
|
||||||
|
|
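The tests above pin down the observable contract of processChunk without showing its body. A minimal sketch that would satisfy them might look like the following; the ChunkResult shape and the helper are assumptions for illustration, not the shipped implementation:

// Hypothetical sketch only; the real processChunk lives in the UI package.
interface ChunkResult {
  isToolCall: boolean;
  text: string | null;
}

function looksLikeToolCall(s: string): boolean {
  // Mirrors the detection heuristic exercised by the tests.
  return s.includes('"type": "function"');
}

function processChunk(chunk: unknown): ChunkResult {
  if (chunk === null || chunk === undefined) {
    return { isToolCall: false, text: null };
  }
  if (typeof chunk === "string") {
    // Function-call JSON, even mixed with prose, is skipped for safety.
    return looksLikeToolCall(chunk)
      ? { isToolCall: true, text: null }
      : { isToolCall: false, text: chunk };
  }
  const c = chunk as Record<string, any>;
  if (c.delta?.tool_calls?.length) {
    return { isToolCall: true, text: null };
  }
  // Extraction order matches the tests: delta, choices, then turn payloads,
  // with output_message taking priority over model_response.
  const text =
    c.delta?.text ??
    c.choices?.[0]?.delta?.content ??
    c.event?.payload?.turn?.output_message?.content ??
    c.event?.payload?.turn?.steps?.at(-1)?.model_response?.content ??
    c.event?.payload?.content ??
    null;
  if (typeof text === "string" && looksLikeToolCall(text)) {
    return { isToolCall: true, text: null };
  }
  return { isToolCall: false, text };
}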
@@ -31,6 +31,9 @@ const mockClient = {
   toolgroups: {
     list: jest.fn(),
   },
+  vectorDBs: {
+    list: jest.fn(),
+  },
 };

 jest.mock("@/hooks/use-auth-client", () => ({
@@ -164,7 +167,7 @@ describe("ChatPlaygroundPage", () => {
       session_name: "Test Session",
       started_at: new Date().toISOString(),
       turns: [],
-    }); // No turns by default
+    });
     mockClient.agents.retrieve.mockResolvedValue({
       agent_id: "test-agent",
       agent_config: {
@@ -417,7 +420,6 @@ describe("ChatPlaygroundPage", () => {
     });

     await waitFor(() => {
-      // first agent should be auto-selected
       expect(mockClient.agents.session.create).toHaveBeenCalledWith(
         "agent_123",
         { session_name: "Default Session" }
@@ -464,7 +466,7 @@ describe("ChatPlaygroundPage", () => {
     });
   });

-  test("hides delete button when only one agent exists", async () => {
+  test("shows delete button even when only one agent exists", async () => {
     mockClient.agents.list.mockResolvedValue({
       data: [mockAgents[0]],
     });
@@ -474,9 +476,7 @@ describe("ChatPlaygroundPage", () => {
     });

     await waitFor(() => {
-      expect(
-        screen.queryByTitle("Delete current agent")
-      ).not.toBeInTheDocument();
+      expect(screen.getByTitle("Delete current agent")).toBeInTheDocument();
     });
   });

@@ -505,7 +505,7 @@ describe("ChatPlaygroundPage", () => {
     await waitFor(() => {
       expect(mockClient.agents.delete).toHaveBeenCalledWith("agent_123");
       expect(global.confirm).toHaveBeenCalledWith(
-        "Are you sure you want to delete this agent? This action cannot be undone and will delete all associated sessions."
+        "Are you sure you want to delete this agent? This action cannot be undone and will delete the agent and all its sessions."
       );
     });

@@ -584,4 +584,207 @@ describe("ChatPlaygroundPage", () => {
       consoleSpy.mockRestore();
     });
   });
+
+  describe("RAG File Upload", () => {
+    let mockFileReader: {
+      readAsDataURL: jest.Mock;
+      readAsText: jest.Mock;
+      result: string | null;
+      onload: (() => void) | null;
+      onerror: (() => void) | null;
+    };
+    let mockRAGTool: {
+      insert: jest.Mock;
+    };
+
+    beforeEach(() => {
+      mockFileReader = {
+        readAsDataURL: jest.fn(),
+        readAsText: jest.fn(),
+        result: null,
+        onload: null,
+        onerror: null,
+      };
+      global.FileReader = jest.fn(() => mockFileReader);
+
+      mockRAGTool = {
+        insert: jest.fn().mockResolvedValue({}),
+      };
+      mockClient.toolRuntime = {
+        ragTool: mockRAGTool,
+      };
+    });
+
+    afterEach(() => {
+      jest.clearAllMocks();
+    });
+
+    test("handles text file upload", async () => {
+      new File(["Hello, world!"], "test.txt", {
+        type: "text/plain",
+      });
+
+      mockClient.agents.retrieve.mockResolvedValue({
+        agent_id: "test-agent",
+        agent_config: {
+          toolgroups: [
+            {
+              name: "builtin::rag/knowledge_search",
+              args: { vector_db_ids: ["test-vector-db"] },
+            },
+          ],
+        },
+      });
+
+      await act(async () => {
+        render(<ChatPlaygroundPage />);
+      });
+
+      await waitFor(() => {
+        expect(screen.getByTestId("chat-component")).toBeInTheDocument();
+      });
+
+      const chatComponent = screen.getByTestId("chat-component");
+      chatComponent.getAttribute("data-onragfileupload");
+
+      // this is a simplified test
+      expect(mockRAGTool.insert).not.toHaveBeenCalled();
+    });
+
+    test("handles PDF file upload with FileReader", async () => {
+      new File([new ArrayBuffer(1000)], "test.pdf", {
+        type: "application/pdf",
+      });
+
+      const mockDataURL = "data:application/pdf;base64,JVBERi0xLjQK";
+      mockFileReader.result = mockDataURL;
+
+      mockClient.agents.retrieve.mockResolvedValue({
+        agent_id: "test-agent",
+        agent_config: {
+          toolgroups: [
+            {
+              name: "builtin::rag/knowledge_search",
+              args: { vector_db_ids: ["test-vector-db"] },
+            },
+          ],
+        },
+      });
+
+      await act(async () => {
+        render(<ChatPlaygroundPage />);
+      });
+
+      await waitFor(() => {
+        expect(screen.getByTestId("chat-component")).toBeInTheDocument();
+      });
+
+      expect(global.FileReader).toBeDefined();
+    });
+
+    test("handles different file types correctly", () => {
+      const getContentType = (filename: string): string => {
+        const ext = filename.toLowerCase().split(".").pop();
+        switch (ext) {
+          case "pdf":
+            return "application/pdf";
+          case "txt":
+            return "text/plain";
+          case "md":
+            return "text/markdown";
+          case "html":
+            return "text/html";
+          case "csv":
+            return "text/csv";
+          case "json":
+            return "application/json";
+          case "docx":
+            return "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
+          case "doc":
+            return "application/msword";
+          default:
+            return "application/octet-stream";
+        }
+      };
+
+      expect(getContentType("test.pdf")).toBe("application/pdf");
+      expect(getContentType("test.txt")).toBe("text/plain");
+      expect(getContentType("test.md")).toBe("text/markdown");
+      expect(getContentType("test.html")).toBe("text/html");
+      expect(getContentType("test.csv")).toBe("text/csv");
+      expect(getContentType("test.json")).toBe("application/json");
+      expect(getContentType("test.docx")).toBe(
+        "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
+      );
+      expect(getContentType("test.doc")).toBe("application/msword");
+      expect(getContentType("test.unknown")).toBe("application/octet-stream");
+    });
+
+    test("determines text vs binary file types correctly", () => {
+      const isTextFile = (mimeType: string): boolean => {
+        return (
+          mimeType.startsWith("text/") ||
+          mimeType === "application/json" ||
+          mimeType === "text/markdown" ||
+          mimeType === "text/html" ||
+          mimeType === "text/csv"
+        );
+      };
+
+      expect(isTextFile("text/plain")).toBe(true);
+      expect(isTextFile("text/markdown")).toBe(true);
+      expect(isTextFile("text/html")).toBe(true);
+      expect(isTextFile("text/csv")).toBe(true);
+      expect(isTextFile("application/json")).toBe(true);
+
+      expect(isTextFile("application/pdf")).toBe(false);
+      expect(isTextFile("application/msword")).toBe(false);
+      expect(
+        isTextFile(
+          "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
+        )
+      ).toBe(false);
+      expect(isTextFile("application/octet-stream")).toBe(false);
+    });
+
+    test("handles FileReader error gracefully", async () => {
+      const pdfFile = new File([new ArrayBuffer(1000)], "test.pdf", {
+        type: "application/pdf",
+      });
+
+      mockFileReader.onerror = jest.fn();
+      const mockError = new Error("FileReader failed");
+
+      const fileReaderPromise = new Promise<string>((resolve, reject) => {
+        const reader = new FileReader();
+        reader.onload = () => resolve(reader.result as string);
+        reader.onerror = () => reject(reader.error || mockError);
+        reader.readAsDataURL(pdfFile);
+
+        setTimeout(() => {
+          reader.onerror?.(new ProgressEvent("error"));
+        }, 0);
+      });
+
+      await expect(fileReaderPromise).rejects.toBeDefined();
+    });
+
+    test("handles large file upload with FileReader approach", () => {
+      // create a large file
+      const largeFile = new File(
+        [new ArrayBuffer(10 * 1024 * 1024)],
+        "large.pdf",
+        {
+          type: "application/pdf",
+        }
+      );
+
+      expect(largeFile.size).toBe(10 * 1024 * 1024); // 10MB
+
+      expect(global.FileReader).toBeDefined();
+
+      const reader = new FileReader();
+      expect(reader.readAsDataURL).toBeDefined();
+    });
+  });
 });
File diff suppressed because it is too large
@@ -35,6 +35,7 @@ interface ChatPropsBase {
   ) => void;
   setMessages?: (messages: Message[]) => void;
   transcribeAudio?: (blob: Blob) => Promise<string>;
+  onRAGFileUpload?: (file: File) => Promise<void>;
 }

 interface ChatPropsWithoutSuggestions extends ChatPropsBase {
@@ -62,6 +63,7 @@ export function Chat({
   onRateResponse,
   setMessages,
   transcribeAudio,
+  onRAGFileUpload,
 }: ChatProps) {
   const lastMessage = messages.at(-1);
   const isEmpty = messages.length === 0;
@@ -226,16 +228,17 @@ export function Chat({
         isPending={isGenerating || isTyping}
         handleSubmit={handleSubmit}
       >
-        {({ files, setFiles }) => (
+        {() => (
           <MessageInput
             value={input}
             onChange={handleInputChange}
-            allowAttachments
-            files={files}
-            setFiles={setFiles}
+            allowAttachments={true}
+            files={null}
+            setFiles={() => {}}
             stop={handleStop}
             isGenerating={isGenerating}
             transcribeAudio={transcribeAudio}
+            onRAGFileUpload={onRAGFileUpload}
           />
         )}
       </ChatForm>
@@ -14,6 +14,7 @@ import { Card } from "@/components/ui/card";
 import { Trash2 } from "lucide-react";
 import type { Message } from "@/components/chat-playground/chat-message";
 import { useAuthClient } from "@/hooks/use-auth-client";
+import { cleanMessageContent } from "@/lib/message-content-utils";
 import type {
   Session,
   SessionCreateParams,
@@ -219,10 +220,7 @@ export function Conversations({
         messages.push({
           id: `${turn.turn_id}-assistant-${messages.length}`,
           role: "assistant",
-          content:
-            typeof turn.output_message.content === "string"
-              ? turn.output_message.content
-              : JSON.stringify(turn.output_message.content),
+          content: cleanMessageContent(turn.output_message.content),
           createdAt: new Date(
             turn.completed_at || turn.started_at || Date.now()
           ),
@@ -271,7 +269,7 @@ export function Conversations({
   );

   const deleteSession = async (sessionId: string) => {
-    if (sessions.length <= 1 || !selectedAgentId) {
+    if (!selectedAgentId) {
       return;
     }

@@ -324,7 +322,6 @@ export function Conversations({
     }
   }, [currentSession]);

-  // Don't render if no agent is selected
   if (!selectedAgentId) {
     return null;
   }

@@ -357,7 +354,7 @@ export function Conversations({
           + New
         </Button>

-        {currentSession && sessions.length > 1 && (
+        {currentSession && (
           <Button
             onClick={() => deleteSession(currentSession.id)}
             variant="outline"
@@ -21,6 +21,7 @@ interface MessageInputBaseProps
   isGenerating: boolean;
   enableInterrupt?: boolean;
   transcribeAudio?: (blob: Blob) => Promise<string>;
+  onRAGFileUpload?: (file: File) => Promise<void>;
 }

 interface MessageInputWithoutAttachmentProps extends MessageInputBaseProps {
@@ -213,8 +214,13 @@ export function MessageInput({
           className
         )}
         {...(props.allowAttachments
-          ? omit(props, ["allowAttachments", "files", "setFiles"])
-          : omit(props, ["allowAttachments"]))}
+          ? omit(props, [
+              "allowAttachments",
+              "files",
+              "setFiles",
+              "onRAGFileUpload",
+            ])
+          : omit(props, ["allowAttachments", "onRAGFileUpload"]))}
       />

       {props.allowAttachments && (
@@ -254,11 +260,19 @@ export function MessageInput({
             size="icon"
             variant="outline"
             className="h-8 w-8"
-            aria-label="Attach a file"
-            disabled={true}
+            aria-label="Upload file to RAG"
+            disabled={false}
             onClick={async () => {
-              const files = await showFileUploadDialog();
-              addFiles(files);
+              const input = document.createElement("input");
+              input.type = "file";
+              input.accept = ".pdf,.txt,.md,.html,.csv,.json";
+              input.onchange = async e => {
+                const file = (e.target as HTMLInputElement).files?.[0];
+                if (file && props.onRAGFileUpload) {
+                  await props.onRAGFileUpload(file);
+                }
+              };
+              input.click();
             }}
           >
             <Paperclip className="h-4 w-4" />
@@ -337,28 +351,6 @@ function FileUploadOverlay({ isDragging }: FileUploadOverlayProps) {
   );
 }

-function showFileUploadDialog() {
-  const input = document.createElement("input");
-
-  input.type = "file";
-  input.multiple = true;
-  input.accept = "*/*";
-  input.click();
-
-  return new Promise<File[] | null>(resolve => {
-    input.onchange = e => {
-      const files = (e.currentTarget as HTMLInputElement).files;
-
-      if (files) {
-        resolve(Array.from(files));
-        return;
-      }
-
-      resolve(null);
-    };
-  });
-}
-
 function TranscribingOverlay() {
   return (
     <motion.div
243
llama_stack/ui/components/chat-playground/vector-db-creator.tsx
Normal file

@@ -0,0 +1,243 @@
"use client";

import { useState, useEffect } from "react";
import { Button } from "@/components/ui/button";
import { Input } from "@/components/ui/input";
import { Card } from "@/components/ui/card";
import {
  Select,
  SelectContent,
  SelectItem,
  SelectTrigger,
  SelectValue,
} from "@/components/ui/select";
import { useAuthClient } from "@/hooks/use-auth-client";
import type { Model } from "llama-stack-client/resources/models";

interface VectorDBCreatorProps {
  models: Model[];
  onVectorDBCreated?: (vectorDbId: string) => void;
  onCancel?: () => void;
}

interface VectorDBProvider {
  api: string;
  provider_id: string;
  provider_type: string;
}

export function VectorDBCreator({
  models,
  onVectorDBCreated,
  onCancel,
}: VectorDBCreatorProps) {
  const [vectorDbName, setVectorDbName] = useState("");
  const [selectedEmbeddingModel, setSelectedEmbeddingModel] = useState("");
  const [selectedProvider, setSelectedProvider] = useState("faiss");
  const [availableProviders, setAvailableProviders] = useState<
    VectorDBProvider[]
  >([]);
  const [isCreating, setIsCreating] = useState(false);
  const [isLoadingProviders, setIsLoadingProviders] = useState(false);
  const [error, setError] = useState<string | null>(null);
  const client = useAuthClient();

  const embeddingModels = models.filter(
    model => model.model_type === "embedding"
  );

  useEffect(() => {
    const fetchProviders = async () => {
      setIsLoadingProviders(true);
      try {
        const providersResponse = await client.providers.list();

        const vectorIoProviders = providersResponse.filter(
          (provider: VectorDBProvider) => provider.api === "vector_io"
        );

        setAvailableProviders(vectorIoProviders);

        if (vectorIoProviders.length > 0) {
          const faissProvider = vectorIoProviders.find(
            (p: VectorDBProvider) => p.provider_id === "faiss"
          );
          setSelectedProvider(
            faissProvider?.provider_id || vectorIoProviders[0].provider_id
          );
        }
      } catch (err) {
        console.error("Error fetching providers:", err);
        setAvailableProviders([
          {
            api: "vector_io",
            provider_id: "faiss",
            provider_type: "inline::faiss",
          },
        ]);
      } finally {
        setIsLoadingProviders(false);
      }
    };

    fetchProviders();
  }, [client]);

  const handleCreate = async () => {
    if (!vectorDbName.trim() || !selectedEmbeddingModel) {
      setError("Please provide a name and select an embedding model");
      return;
    }

    setIsCreating(true);
    setError(null);

    try {
      const embeddingModel = embeddingModels.find(
        m => m.identifier === selectedEmbeddingModel
      );

      if (!embeddingModel) {
        throw new Error("Selected embedding model not found");
      }

      const embeddingDimension = embeddingModel.metadata
        ?.embedding_dimension as number;

      if (!embeddingDimension) {
        throw new Error("Embedding dimension not available for selected model");
      }

      const vectorDbId = vectorDbName.trim() || `vector_db_${Date.now()}`;

      const response = await client.vectorDBs.register({
        vector_db_id: vectorDbId,
        embedding_model: selectedEmbeddingModel,
        embedding_dimension: embeddingDimension,
        provider_id: selectedProvider,
      });

      onVectorDBCreated?.(response.identifier || vectorDbId);
    } catch (err) {
      console.error("Error creating vector DB:", err);
      setError(
        err instanceof Error ? err.message : "Failed to create vector DB"
      );
    } finally {
      setIsCreating(false);
    }
  };

  return (
    <Card className="p-6 space-y-4">
      <h3 className="text-lg font-semibold">Create Vector Database</h3>

      <div className="space-y-4">
        <div>
          <label className="text-sm font-medium block mb-2">
            Vector DB Name
          </label>
          <Input
            value={vectorDbName}
            onChange={e => setVectorDbName(e.target.value)}
            placeholder="My Vector Database"
          />
        </div>

        <div>
          <label className="text-sm font-medium block mb-2">
            Embedding Model
          </label>
          <Select
            value={selectedEmbeddingModel}
            onValueChange={setSelectedEmbeddingModel}
          >
            <SelectTrigger>
              <SelectValue placeholder="Select Embedding Model" />
            </SelectTrigger>
            <SelectContent>
              {embeddingModels.map(model => (
                <SelectItem key={model.identifier} value={model.identifier}>
                  {model.identifier}
                </SelectItem>
              ))}
            </SelectContent>
          </Select>
          {selectedEmbeddingModel && (
            <p className="text-xs text-muted-foreground mt-1">
              Dimension:{" "}
              {embeddingModels.find(
                m => m.identifier === selectedEmbeddingModel
              )?.metadata?.embedding_dimension || "Unknown"}
            </p>
          )}
        </div>

        <div>
          <label className="text-sm font-medium block mb-2">
            Vector Database Provider
          </label>
          <Select
            value={selectedProvider}
            onValueChange={setSelectedProvider}
            disabled={isLoadingProviders}
          >
            <SelectTrigger>
              <SelectValue
                placeholder={
                  isLoadingProviders
                    ? "Loading providers..."
                    : "Select Provider"
                }
              />
            </SelectTrigger>
            <SelectContent>
              {availableProviders.map(provider => (
                <SelectItem
                  key={provider.provider_id}
                  value={provider.provider_id}
                >
                  {provider.provider_id}
                </SelectItem>
              ))}
            </SelectContent>
          </Select>
          {selectedProvider && (
            <p className="text-xs text-muted-foreground mt-1">
              Selected provider: {selectedProvider}
            </p>
          )}
        </div>

        {error && (
          <div className="text-destructive text-sm bg-destructive/10 p-2 rounded">
            {error}
          </div>
        )}

        <div className="flex gap-2 pt-2">
          <Button
            onClick={handleCreate}
            disabled={
              isCreating || !vectorDbName.trim() || !selectedEmbeddingModel
            }
            className="flex-1"
          >
            {isCreating ? "Creating..." : "Create Vector DB"}
          </Button>
          {onCancel && (
            <Button variant="outline" onClick={onCancel} className="flex-1">
              Cancel
            </Button>
          )}
        </div>
      </div>

      <div className="text-xs text-muted-foreground bg-muted/50 p-3 rounded">
        <strong>Note:</strong> This will create a new vector database that can
        be used with RAG tools. After creation, you'll be able to upload
        documents and use it for knowledge search in your agent conversations.
      </div>
    </Card>
  );
}
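A sketch of how the new component might be mounted by a parent in the playground; the handler and state setters here are assumed names, since the actual integration lives in the page whose diff is suppressed above:

// Hypothetical parent usage of VectorDBCreator (names are assumptions).
<VectorDBCreator
  models={models}
  onVectorDBCreated={vectorDbId => {
    // The callback receives the server-assigned vector store ID ("vs_..."),
    // not the display name the user typed.
    setSelectedVectorDbId(vectorDbId);
    setShowCreator(false);
  }}
  onCancel={() => setShowCreator(false)}
/>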
51
llama_stack/ui/lib/message-content-utils.ts
Normal file

@@ -0,0 +1,51 @@
// check if content contains function call JSON
export const containsToolCall = (content: string): boolean => {
  return (
    content.includes('"type": "function"') ||
    content.includes('"name": "knowledge_search"') ||
    content.includes('"parameters":') ||
    !!content.match(/\{"type":\s*"function".*?\}/)
  );
};

export const extractCleanText = (content: string): string | null => {
  if (containsToolCall(content)) {
    try {
      // parse and extract non-function call parts
      const jsonMatch = content.match(/\{"type":\s*"function"[^}]*\}[^}]*\}/);
      if (jsonMatch) {
        const jsonPart = jsonMatch[0];
        const parsedJson = JSON.parse(jsonPart);

        // if function call, extract text after JSON
        if (parsedJson.type === "function") {
          const textAfterJson = content
            .substring(content.indexOf(jsonPart) + jsonPart.length)
            .trim();
          return textAfterJson || null;
        }
      }
      return null;
    } catch {
      return null;
    }
  }
  return content;
};

// removes function call JSON handling different content types
export const cleanMessageContent = (
  content: string | unknown[] | unknown
): string => {
  if (typeof content === "string") {
    const cleaned = extractCleanText(content);
    return cleaned || "";
  } else if (Array.isArray(content)) {
    return content
      .filter((item: { type: string }) => item.type === "text")
      .map((item: { text: string }) => item.text)
      .join("");
  } else {
    return JSON.stringify(content);
  }
};
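To make the intended behavior concrete, here is how these helpers would act on two assumed inputs (illustrative, not taken from the test suite):

// cleanMessageContent strips a leading function-call JSON and keeps the prose:
cleanMessageContent(
  '{"type": "function", "name": "knowledge_search", "parameters": {"query": "x"}} The answer is 42.'
);
// => "The answer is 42."

// For array content it keeps only the text items:
cleanMessageContent([
  { type: "text", text: "Hello" },
  { type: "image", url: "https://example.com/a.png" },
]);
// => "Hello"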
83
llama_stack/ui/package-lock.json
generated

@@ -18,7 +18,7 @@
     "class-variance-authority": "^0.7.1",
     "clsx": "^2.1.1",
     "framer-motion": "^11.18.2",
-    "llama-stack-client": "^0.2.18",
+    "llama-stack-client": "^0.2.19",
     "lucide-react": "^0.510.0",
     "next": "15.3.3",
     "next-auth": "^4.24.11",
@@ -27,7 +27,7 @@
     "react-dom": "^19.0.0",
     "react-markdown": "^10.1.0",
     "remark-gfm": "^4.0.1",
-    "remeda": "^2.26.1",
+    "remeda": "^2.30.0",
     "shiki": "^1.29.2",
     "sonner": "^2.0.6",
     "tailwind-merge": "^3.3.1"
@@ -35,8 +35,8 @@
   "devDependencies": {
     "@eslint/eslintrc": "^3",
     "@tailwindcss/postcss": "^4",
-    "@testing-library/dom": "^10.4.0",
-    "@testing-library/jest-dom": "^6.6.3",
+    "@testing-library/dom": "^10.4.1",
+    "@testing-library/jest-dom": "^6.8.0",
     "@testing-library/react": "^16.3.0",
     "@types/jest": "^29.5.14",
     "@types/node": "^20",
@@ -45,7 +45,7 @@
     "eslint": "^9",
     "eslint-config-next": "15.3.2",
     "eslint-config-prettier": "^10.1.8",
-    "eslint-plugin-prettier": "^5.4.0",
+    "eslint-plugin-prettier": "^5.5.4",
     "jest": "^29.7.0",
     "jest-environment-jsdom": "^29.7.0",
     "prettier": "3.5.3",
@@ -2041,9 +2041,9 @@
       }
     },
     "node_modules/@pkgr/core": {
-      "version": "0.2.4",
-      "resolved": "https://registry.npmjs.org/@pkgr/core/-/core-0.2.4.tgz",
-      "integrity": "sha512-ROFF39F6ZrnzSUEmQQZUar0Jt4xVoP9WnDRdWwF4NNcXs3xBTLgBUDoOwW141y1jP+S8nahIbdxbFC7IShw9Iw==",
+      "version": "0.2.9",
+      "resolved": "https://registry.npmjs.org/@pkgr/core/-/core-0.2.9.tgz",
+      "integrity": "sha512-QNqXyfVS2wm9hweSYD2O7F0G06uurj9kZ96TRQE5Y9hU7+tgdZwIkbAKc5Ocy1HxEY2kuDQa6cQ1WRs/O5LFKA==",
       "dev": true,
       "license": "MIT",
       "engines": {
@@ -3567,9 +3567,9 @@
       }
     },
     "node_modules/@testing-library/dom": {
-      "version": "10.4.0",
-      "resolved": "https://registry.npmjs.org/@testing-library/dom/-/dom-10.4.0.tgz",
-      "integrity": "sha512-pemlzrSESWbdAloYml3bAJMEfNh1Z7EduzqPKprCH5S341frlpYnUEW0H72dLxa6IsYr+mPno20GiSm+h9dEdQ==",
+      "version": "10.4.1",
+      "resolved": "https://registry.npmjs.org/@testing-library/dom/-/dom-10.4.1.tgz",
+      "integrity": "sha512-o4PXJQidqJl82ckFaXUeoAW+XysPLauYI43Abki5hABd853iMhitooc6znOnczgbTYmEP6U6/y1ZyKAIsvMKGg==",
       "dev": true,
       "license": "MIT",
       "dependencies": {
@@ -3577,9 +3577,9 @@
         "@babel/runtime": "^7.12.5",
         "@types/aria-query": "^5.0.1",
         "aria-query": "5.3.0",
-        "chalk": "^4.1.0",
         "dom-accessibility-api": "^0.5.9",
         "lz-string": "^1.5.0",
+        "picocolors": "1.1.1",
         "pretty-format": "^27.0.2"
       },
       "engines": {
@@ -3597,18 +3597,17 @@
       }
     },
     "node_modules/@testing-library/jest-dom": {
-      "version": "6.6.3",
-      "resolved": "https://registry.npmjs.org/@testing-library/jest-dom/-/jest-dom-6.6.3.tgz",
-      "integrity": "sha512-IteBhl4XqYNkM54f4ejhLRJiZNqcSCoXUOG2CPK7qbD322KjQozM4kHQOfkG2oln9b9HTYqs+Sae8vBATubxxA==",
+      "version": "6.8.0",
+      "resolved": "https://registry.npmjs.org/@testing-library/jest-dom/-/jest-dom-6.8.0.tgz",
+      "integrity": "sha512-WgXcWzVM6idy5JaftTVC8Vs83NKRmGJz4Hqs4oyOuO2J4r/y79vvKZsb+CaGyCSEbUPI6OsewfPd0G1A0/TUZQ==",
       "dev": true,
       "license": "MIT",
       "dependencies": {
         "@adobe/css-tools": "^4.4.0",
         "aria-query": "^5.0.0",
-        "chalk": "^3.0.0",
         "css.escape": "^1.5.1",
         "dom-accessibility-api": "^0.6.3",
-        "lodash": "^4.17.21",
+        "picocolors": "^1.1.1",
         "redent": "^3.0.0"
       },
       "engines": {
@@ -3617,20 +3616,6 @@
         "yarn": ">=1"
       }
     },
-    "node_modules/@testing-library/jest-dom/node_modules/chalk": {
-      "version": "3.0.0",
-      "resolved": "https://registry.npmjs.org/chalk/-/chalk-3.0.0.tgz",
-      "integrity": "sha512-4D3B6Wf41KOYRFdszmDqMCGq5VV/uMAB273JILmO+3jAlh8X4qDtdtgCR3fxtbLEMzSx22QdhnDcJvu2u1fVwg==",
-      "dev": true,
-      "license": "MIT",
-      "dependencies": {
-        "ansi-styles": "^4.1.0",
-        "supports-color": "^7.1.0"
-      },
-      "engines": {
-        "node": ">=8"
-      }
-    },
     "node_modules/@testing-library/jest-dom/node_modules/dom-accessibility-api": {
       "version": "0.6.3",
       "resolved": "https://registry.npmjs.org/dom-accessibility-api/-/dom-accessibility-api-0.6.3.tgz",
@@ -6661,14 +6646,14 @@
       }
     },
     "node_modules/eslint-plugin-prettier": {
-      "version": "5.4.0",
-      "resolved": "https://registry.npmjs.org/eslint-plugin-prettier/-/eslint-plugin-prettier-5.4.0.tgz",
-      "integrity": "sha512-BvQOvUhkVQM1i63iMETK9Hjud9QhqBnbtT1Zc642p9ynzBuCe5pybkOnvqZIBypXmMlsGcnU4HZ8sCTPfpAexA==",
+      "version": "5.5.4",
+      "resolved": "https://registry.npmjs.org/eslint-plugin-prettier/-/eslint-plugin-prettier-5.5.4.tgz",
+      "integrity": "sha512-swNtI95SToIz05YINMA6Ox5R057IMAmWZ26GqPxusAp1TZzj+IdY9tXNWWD3vkF/wEqydCONcwjTFpxybBqZsg==",
       "dev": true,
       "license": "MIT",
       "dependencies": {
         "prettier-linter-helpers": "^1.0.0",
-        "synckit": "^0.11.0"
+        "synckit": "^0.11.7"
       },
       "engines": {
         "node": "^14.18.0 || >=16.0.0"
@@ -10021,9 +10006,9 @@
       "license": "MIT"
     },
     "node_modules/llama-stack-client": {
-      "version": "0.2.18",
-      "resolved": "https://registry.npmjs.org/llama-stack-client/-/llama-stack-client-0.2.18.tgz",
-      "integrity": "sha512-k+xQOz/TIU0cINP4Aih8q6xs7f/6qs0fLDMXTTKQr5C0F1jtCjRiwsas7bTsDfpKfYhg/7Xy/wPw/uZgi6aIVg==",
+      "version": "0.2.19",
+      "resolved": "https://registry.npmjs.org/llama-stack-client/-/llama-stack-client-0.2.19.tgz",
+      "integrity": "sha512-sDuAhUdEGlERZ3jlMUzPXcQTgMv/pGbDrPX0ifbE5S+gr7Q+7ohuQYrIXe+hXgIipFjq+y4b2c5laZ76tmAyEA==",
       "license": "MIT",
       "dependencies": {
         "@types/node": "^18.11.18",
@@ -10066,13 +10051,6 @@
         "url": "https://github.com/sponsors/sindresorhus"
       }
     },
-    "node_modules/lodash": {
-      "version": "4.17.21",
-      "resolved": "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz",
-      "integrity": "sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg==",
-      "dev": true,
-      "license": "MIT"
-    },
     "node_modules/lodash.merge": {
       "version": "4.6.2",
       "resolved": "https://registry.npmjs.org/lodash.merge/-/lodash.merge-4.6.2.tgz",
@@ -12602,9 +12580,9 @@
       }
     },
     "node_modules/remeda": {
-      "version": "2.26.1",
-      "resolved": "https://registry.npmjs.org/remeda/-/remeda-2.26.1.tgz",
-      "integrity": "sha512-hpiLfhUwkJhiMS3Z7dRrygcRdkMRZASw5qUdNdi33x1/Y9y/J5q5TyLyf8btDoVLIcsg/4fzPdaGXDTbnl+ixw==",
+      "version": "2.30.0",
+      "resolved": "https://registry.npmjs.org/remeda/-/remeda-2.30.0.tgz",
+      "integrity": "sha512-TcRpI1ecqnMer3jHhFtMerGvHFCDlCHljUp0/9A4HxHOh5bSY3kP1l8nQDFMnWYJKl3MSarDNY1tb0Bs/bCmvw==",
       "license": "MIT",
       "dependencies": {
         "type-fest": "^4.41.0"
@@ -13567,14 +13545,13 @@
       "license": "MIT"
     },
     "node_modules/synckit": {
-      "version": "0.11.5",
-      "resolved": "https://registry.npmjs.org/synckit/-/synckit-0.11.5.tgz",
-      "integrity": "sha512-frqvfWyDA5VPVdrWfH24uM6SI/O8NLpVbIIJxb8t/a3YGsp4AW9CYgSKC0OaSEfexnp7Y1pVh2Y6IHO8ggGDmA==",
+      "version": "0.11.11",
+      "resolved": "https://registry.npmjs.org/synckit/-/synckit-0.11.11.tgz",
+      "integrity": "sha512-MeQTA1r0litLUf0Rp/iisCaL8761lKAZHaimlbGK4j0HysC4PLfqygQj9srcs0m2RdtDYnF8UuYyKpbjHYp7Jw==",
       "dev": true,
       "license": "MIT",
       "dependencies": {
-        "@pkgr/core": "^0.2.4",
-        "tslib": "^2.8.1"
+        "@pkgr/core": "^0.2.9"
       },
       "engines": {
         "node": "^14.18.0 || >=16.0.0"
@@ -23,7 +23,7 @@
     "class-variance-authority": "^0.7.1",
     "clsx": "^2.1.1",
     "framer-motion": "^11.18.2",
-    "llama-stack-client": "^0.2.18",
+    "llama-stack-client": "^0.2.19",
     "lucide-react": "^0.510.0",
     "next": "15.3.3",
    "next-auth": "^4.24.11",
@@ -32,7 +32,7 @@
     "react-dom": "^19.0.0",
     "react-markdown": "^10.1.0",
     "remark-gfm": "^4.0.1",
-    "remeda": "^2.26.1",
+    "remeda": "^2.30.0",
     "shiki": "^1.29.2",
     "sonner": "^2.0.6",
     "tailwind-merge": "^3.3.1"
@@ -40,8 +40,8 @@
   "devDependencies": {
     "@eslint/eslintrc": "^3",
     "@tailwindcss/postcss": "^4",
-    "@testing-library/dom": "^10.4.0",
-    "@testing-library/jest-dom": "^6.6.3",
+    "@testing-library/dom": "^10.4.1",
+    "@testing-library/jest-dom": "^6.8.0",
     "@testing-library/react": "^16.3.0",
     "@types/jest": "^29.5.14",
     "@types/node": "^20",
@@ -50,7 +50,7 @@
     "eslint": "^9",
     "eslint-config-next": "15.3.2",
     "eslint-config-prettier": "^10.1.8",
-    "eslint-plugin-prettier": "^5.4.0",
+    "eslint-plugin-prettier": "^5.5.4",
     "jest": "^29.7.0",
     "jest-environment-jsdom": "^29.7.0",
     "prettier": "3.5.3",
@@ -7,7 +7,7 @@ required-version = ">=0.7.0"

 [project]
 name = "llama_stack"
-version = "0.2.18"
+version = "0.2.19"
 authors = [{ name = "Meta Llama", email = "llama-oss@meta.com" }]
 description = "Llama Stack"
 readme = "README.md"
@@ -31,7 +31,7 @@ dependencies = [
     "huggingface-hub>=0.34.0,<1.0",
     "jinja2>=3.1.6",
     "jsonschema",
-    "llama-stack-client>=0.2.18",
+    "llama-stack-client>=0.2.19",
     "llama-api-client>=0.1.2",
     "openai>=1.99.6,<1.100.0",
     "prompt-toolkit",
@@ -56,7 +56,7 @@ dependencies = [
 ui = [
     "streamlit",
     "pandas",
-    "llama-stack-client>=0.2.18",
+    "llama-stack-client>=0.2.19",
     "streamlit-option-menu",
 ]
Binary file not shown.
@@ -47,34 +47,45 @@ def client_with_empty_registry(client_with_models):


 def test_vector_db_retrieve(client_with_empty_registry, embedding_model_id, embedding_dimension):
-    # Register a memory bank first
-    vector_db_id = "test_vector_db"
-    client_with_empty_registry.vector_dbs.register(
-        vector_db_id=vector_db_id,
+    vector_db_name = "test_vector_db"
+    register_response = client_with_empty_registry.vector_dbs.register(
+        vector_db_id=vector_db_name,
         embedding_model=embedding_model_id,
         embedding_dimension=embedding_dimension,
     )

+    actual_vector_db_id = register_response.identifier
+
     # Retrieve the memory bank and validate its properties
-    response = client_with_empty_registry.vector_dbs.retrieve(vector_db_id=vector_db_id)
+    response = client_with_empty_registry.vector_dbs.retrieve(vector_db_id=actual_vector_db_id)
     assert response is not None
-    assert response.identifier == vector_db_id
+    assert response.identifier == actual_vector_db_id
     assert response.embedding_model == embedding_model_id
-    assert response.provider_resource_id == vector_db_id
+    assert response.identifier.startswith("vs_")


 def test_vector_db_register(client_with_empty_registry, embedding_model_id, embedding_dimension):
-    vector_db_id = "test_vector_db"
-    client_with_empty_registry.vector_dbs.register(
-        vector_db_id=vector_db_id,
+    vector_db_name = "test_vector_db"
+    response = client_with_empty_registry.vector_dbs.register(
+        vector_db_id=vector_db_name,
         embedding_model=embedding_model_id,
         embedding_dimension=embedding_dimension,
     )

-    vector_dbs_after_register = [vector_db.identifier for vector_db in client_with_empty_registry.vector_dbs.list()]
-    assert vector_dbs_after_register == [vector_db_id]
+    actual_vector_db_id = response.identifier
+    assert actual_vector_db_id.startswith("vs_")
+    assert actual_vector_db_id != vector_db_name

-    client_with_empty_registry.vector_dbs.unregister(vector_db_id=vector_db_id)
+    vector_dbs_after_register = [vector_db.identifier for vector_db in client_with_empty_registry.vector_dbs.list()]
+    assert vector_dbs_after_register == [actual_vector_db_id]
+
+    vector_stores = client_with_empty_registry.vector_stores.list()
+    assert len(vector_stores.data) == 1
+    vector_store = vector_stores.data[0]
+    assert vector_store.id == actual_vector_db_id
+    assert vector_store.name == vector_db_name
+
+    client_with_empty_registry.vector_dbs.unregister(vector_db_id=actual_vector_db_id)

     vector_dbs = [vector_db.identifier for vector_db in client_with_empty_registry.vector_dbs.list()]
     assert len(vector_dbs) == 0
@@ -91,20 +102,22 @@ def test_vector_db_register(client_with_empty_registry, embedding_model_id, embe
     ],
 )
 def test_insert_chunks(client_with_empty_registry, embedding_model_id, embedding_dimension, sample_chunks, test_case):
-    vector_db_id = "test_vector_db"
-    client_with_empty_registry.vector_dbs.register(
-        vector_db_id=vector_db_id,
+    vector_db_name = "test_vector_db"
+    register_response = client_with_empty_registry.vector_dbs.register(
+        vector_db_id=vector_db_name,
         embedding_model=embedding_model_id,
         embedding_dimension=embedding_dimension,
     )

+    actual_vector_db_id = register_response.identifier
+
     client_with_empty_registry.vector_io.insert(
-        vector_db_id=vector_db_id,
+        vector_db_id=actual_vector_db_id,
         chunks=sample_chunks,
     )

     response = client_with_empty_registry.vector_io.query(
-        vector_db_id=vector_db_id,
+        vector_db_id=actual_vector_db_id,
         query="What is the capital of France?",
     )
     assert response is not None
@@ -113,7 +126,7 @@ def test_insert_chunks(client_with_empty_registry, embedding_model_id, embedding

     query, expected_doc_id = test_case
     response = client_with_empty_registry.vector_io.query(
-        vector_db_id=vector_db_id,
+        vector_db_id=actual_vector_db_id,
         query=query,
     )
     assert response is not None
@@ -128,13 +141,15 @@ def test_insert_chunks_with_precomputed_embeddings(client_with_empty_registry, e
         "remote::qdrant": {"score_threshold": -1.0},
         "inline::qdrant": {"score_threshold": -1.0},
     }
-    vector_db_id = "test_precomputed_embeddings_db"
-    client_with_empty_registry.vector_dbs.register(
-        vector_db_id=vector_db_id,
+    vector_db_name = "test_precomputed_embeddings_db"
+    register_response = client_with_empty_registry.vector_dbs.register(
+        vector_db_id=vector_db_name,
         embedding_model=embedding_model_id,
         embedding_dimension=embedding_dimension,
     )

+    actual_vector_db_id = register_response.identifier
+
     chunks_with_embeddings = [
         Chunk(
             content="This is a test chunk with precomputed embedding.",
@@ -144,13 +159,13 @@ def test_insert_chunks_with_precomputed_embeddings(client_with_empty_registry, e
     ]

     client_with_empty_registry.vector_io.insert(
-        vector_db_id=vector_db_id,
+        vector_db_id=actual_vector_db_id,
         chunks=chunks_with_embeddings,
     )

     provider = [p.provider_id for p in client_with_empty_registry.providers.list() if p.api == "vector_io"][0]
     response = client_with_empty_registry.vector_io.query(
-        vector_db_id=vector_db_id,
+        vector_db_id=actual_vector_db_id,
         query="precomputed embedding test",
         params=vector_io_provider_params_dict.get(provider, None),
     )
@@ -173,13 +188,15 @@ def test_query_returns_valid_object_when_identical_to_embedding_in_vdb(
         "remote::qdrant": {"score_threshold": 0.0},
         "inline::qdrant": {"score_threshold": 0.0},
     }
-    vector_db_id = "test_precomputed_embeddings_db"
-    client_with_empty_registry.vector_dbs.register(
-        vector_db_id=vector_db_id,
+    vector_db_name = "test_precomputed_embeddings_db"
+    register_response = client_with_empty_registry.vector_dbs.register(
+        vector_db_id=vector_db_name,
         embedding_model=embedding_model_id,
         embedding_dimension=embedding_dimension,
     )

+    actual_vector_db_id = register_response.identifier
+
     chunks_with_embeddings = [
         Chunk(
             content="duplicate",
@@ -189,13 +206,13 @@ def test_query_returns_valid_object_when_identical_to_embedding_in_vdb(
     ]

     client_with_empty_registry.vector_io.insert(
-        vector_db_id=vector_db_id,
+        vector_db_id=actual_vector_db_id,
         chunks=chunks_with_embeddings,
     )

     provider = [p.provider_id for p in client_with_empty_registry.providers.list() if p.api == "vector_io"][0]
     response = client_with_empty_registry.vector_io.query(
-        vector_db_id=vector_db_id,
+        vector_db_id=actual_vector_db_id,
         query="duplicate",
         params=vector_io_provider_params_dict.get(provider, None),
     )
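For client code, the practical consequence these tests pin down is that the registered identifier can no longer be assumed to equal the requested name. A minimal TypeScript sketch of the new flow; the model ID, dimension, documents variable, and the insert call shape (inferred from the mocked toolRuntime.ragTool in the UI tests above) are assumptions:

// Keep the server-assigned identifier; do not reuse the name you passed in.
const registered = await client.vectorDBs.register({
  vector_db_id: "my_docs", // now treated as a display name
  embedding_model: "all-MiniLM-L6-v2", // placeholder model ID
  embedding_dimension: 384, // placeholder dimension
});
const vectorDbId = registered.identifier; // e.g. "vs_1234..."

// Subsequent calls must use the assigned ID:
await client.toolRuntime.ragTool.insert({
  documents, // assumed to be prepared elsewhere
  vector_db_id: vectorDbId,
  chunk_size_in_tokens: 512,
});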
@ -146,6 +146,20 @@ class VectorDBImpl(Impl):
|
||||||
async def unregister_vector_db(self, vector_db_id: str):
|
async def unregister_vector_db(self, vector_db_id: str):
|
||||||
return vector_db_id
|
return vector_db_id
|
||||||
|
|
||||||
|
async def openai_create_vector_store(self, **kwargs):
|
||||||
|
import time
|
||||||
|
import uuid
|
||||||
|
|
||||||
|
from llama_stack.apis.vector_io.vector_io import VectorStoreFileCounts, VectorStoreObject
|
||||||
|
|
||||||
|
vector_store_id = kwargs.get("provider_vector_db_id") or f"vs_{uuid.uuid4()}"
|
||||||
|
return VectorStoreObject(
|
||||||
|
id=vector_store_id,
|
||||||
|
name=kwargs.get("name", vector_store_id),
|
||||||
|
created_at=int(time.time()),
|
||||||
|
file_counts=VectorStoreFileCounts(completed=0, cancelled=0, failed=0, in_progress=0, total=0),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
async def test_models_routing_table(cached_disk_dist_registry):
|
async def test_models_routing_table(cached_disk_dist_registry):
|
||||||
table = ModelsRoutingTable({"test_provider": InferenceImpl()}, cached_disk_dist_registry, {})
|
table = ModelsRoutingTable({"test_provider": InferenceImpl()}, cached_disk_dist_registry, {})
|
||||||
|
|
@@ -247,17 +261,21 @@ async def test_vectordbs_routing_table(cached_disk_dist_registry):
     )

     # Register multiple vector databases and verify listing
-    await table.register_vector_db(vector_db_id="test-vectordb", embedding_model="test_provider/test-model")
-    await table.register_vector_db(vector_db_id="test-vectordb-2", embedding_model="test_provider/test-model")
+    vdb1 = await table.register_vector_db(vector_db_id="test-vectordb", embedding_model="test_provider/test-model")
+    vdb2 = await table.register_vector_db(vector_db_id="test-vectordb-2", embedding_model="test_provider/test-model")
     vector_dbs = await table.list_vector_dbs()

     assert len(vector_dbs.data) == 2
     vector_db_ids = {v.identifier for v in vector_dbs.data}
-    assert "test-vectordb" in vector_db_ids
-    assert "test-vectordb-2" in vector_db_ids
+    assert vdb1.identifier in vector_db_ids
+    assert vdb2.identifier in vector_db_ids

-    await table.unregister_vector_db(vector_db_id="test-vectordb")
-    await table.unregister_vector_db(vector_db_id="test-vectordb-2")
+    # Verify they have UUID-based identifiers
+    assert vdb1.identifier.startswith("vs_")
+    assert vdb2.identifier.startswith("vs_")
+
+    await table.unregister_vector_db(vector_db_id=vdb1.identifier)
+    await table.unregister_vector_db(vector_db_id=vdb2.identifier)

     vector_dbs = await table.list_vector_dbs()
     assert len(vector_dbs.data) == 0
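The `startswith("vs_")` assertions pin down the new identifier format. Both mock implementations in this diff generate IDs the same way; a minimal sketch of that rule (the helper name is ours, not the diff's):

```python
import uuid


def new_vector_store_id(provider_vector_db_id: str | None = None) -> str:
    # An explicitly provided provider ID wins; otherwise mint an OpenAI-style
    # "vs_"-prefixed UUID4, which is what the startswith("vs_") asserts rely on.
    return provider_vector_db_id or f"vs_{uuid.uuid4()}"


assert new_vector_store_id().startswith("vs_")
assert new_vector_store_id("vs_fixed") == "vs_fixed"
```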
@@ -7,6 +7,7 @@
 # Unit tests for the routing tables vector_dbs

 import time
+import uuid
 from unittest.mock import AsyncMock

 import pytest
@@ -34,6 +35,7 @@ from tests.unit.distribution.routers.test_routing_tables import Impl, InferenceImpl
 class VectorDBImpl(Impl):
     def __init__(self):
         super().__init__(Api.vector_io)
+        self.vector_stores = {}

     async def register_vector_db(self, vector_db: VectorDB):
         return vector_db
@@ -114,8 +116,35 @@ class VectorDBImpl(Impl):
     async def openai_delete_vector_store_file(self, vector_store_id, file_id):
         return VectorStoreFileDeleteResponse(id=file_id, deleted=True)

+    async def openai_create_vector_store(
+        self,
+        name=None,
+        embedding_model=None,
+        embedding_dimension=None,
+        provider_id=None,
+        provider_vector_db_id=None,
+        **kwargs,
+    ):
+        vector_store_id = provider_vector_db_id or f"vs_{uuid.uuid4()}"
+        vector_store = VectorStoreObject(
+            id=vector_store_id,
+            name=name or vector_store_id,
+            created_at=int(time.time()),
+            file_counts=VectorStoreFileCounts(completed=0, cancelled=0, failed=0, in_progress=0, total=0),
+        )
+        self.vector_stores[vector_store_id] = vector_store
+        return vector_store
+
+    async def openai_list_vector_stores(self, **kwargs):
+        from llama_stack.apis.vector_io.vector_io import VectorStoreListResponse
+
+        return VectorStoreListResponse(
+            data=list(self.vector_stores.values()), has_more=False, first_id=None, last_id=None
+        )
+

 async def test_vectordbs_routing_table(cached_disk_dist_registry):
+    n = 10
     table = VectorDBsRoutingTable({"test_provider": VectorDBImpl()}, cached_disk_dist_registry, {})
     await table.initialize()
@@ -129,22 +158,98 @@ async def test_vectordbs_routing_table(cached_disk_dist_registry):
     )

     # Register multiple vector databases and verify listing
-    await table.register_vector_db(vector_db_id="test-vectordb", embedding_model="test-model")
-    await table.register_vector_db(vector_db_id="test-vectordb-2", embedding_model="test-model")
+    vdb_dict = {}
+    for i in range(n):
+        vdb_dict[i] = await table.register_vector_db(vector_db_id=f"test-vectordb-{i}", embedding_model="test-model")

     vector_dbs = await table.list_vector_dbs()

-    assert len(vector_dbs.data) == 2
+    assert len(vector_dbs.data) == len(vdb_dict)
     vector_db_ids = {v.identifier for v in vector_dbs.data}
-    assert "test-vectordb" in vector_db_ids
-    assert "test-vectordb-2" in vector_db_ids
-
-    await table.unregister_vector_db(vector_db_id="test-vectordb")
-    await table.unregister_vector_db(vector_db_id="test-vectordb-2")
+    for k in vdb_dict:
+        assert vdb_dict[k].identifier in vector_db_ids
+    for k in vdb_dict:
+        await table.unregister_vector_db(vector_db_id=vdb_dict[k].identifier)

     vector_dbs = await table.list_vector_dbs()
     assert len(vector_dbs.data) == 0


+async def test_vector_db_and_vector_store_id_mapping(cached_disk_dist_registry):
+    n = 10
+    impl = VectorDBImpl()
+    table = VectorDBsRoutingTable({"test_provider": impl}, cached_disk_dist_registry, {})
+    await table.initialize()
+
+    m_table = ModelsRoutingTable({"test_provider": InferenceImpl()}, cached_disk_dist_registry, {})
+    await m_table.initialize()
+    await m_table.register_model(
+        model_id="test-model",
+        provider_id="test_provider",
+        metadata={"embedding_dimension": 128},
+        model_type=ModelType.embedding,
+    )
+
+    vdb_dict = {}
+    for i in range(n):
+        vdb_dict[i] = await table.register_vector_db(vector_db_id=f"test-vectordb-{i}", embedding_model="test-model")
+
+    vector_dbs = await table.list_vector_dbs()
+    vector_db_ids = {v.identifier for v in vector_dbs.data}
+
+    vector_stores = await impl.openai_list_vector_stores()
+    vector_store_ids = {v.id for v in vector_stores.data}
+
+    assert vector_db_ids == vector_store_ids, (
+        f"Vector DB IDs {vector_db_ids} don't match vector store IDs {vector_store_ids}"
+    )
+
+    for vector_store in vector_stores.data:
+        vector_db = await table.get_vector_db(vector_store.id)
+        assert vector_store.name == vector_db.vector_db_name, (
+            f"Vector store name {vector_store.name} doesn't match vector store ID {vector_store.id}"
+        )
+
+    for vector_db_id in vector_db_ids:
+        await table.unregister_vector_db(vector_db_id)
+
+    assert len((await table.list_vector_dbs()).data) == 0
+
+
+async def test_vector_db_id_becomes_vector_store_name(cached_disk_dist_registry):
+    impl = VectorDBImpl()
+    table = VectorDBsRoutingTable({"test_provider": impl}, cached_disk_dist_registry, {})
+    await table.initialize()
+
+    m_table = ModelsRoutingTable({"test_provider": InferenceImpl()}, cached_disk_dist_registry, {})
+    await m_table.initialize()
+    await m_table.register_model(
+        model_id="test-model",
+        provider_id="test_provider",
+        metadata={"embedding_dimension": 128},
+        model_type=ModelType.embedding,
+    )
+
+    user_provided_id = "my-custom-vector-db"
+    await table.register_vector_db(vector_db_id=user_provided_id, embedding_model="test-model")
+
+    vector_stores = await impl.openai_list_vector_stores()
+    assert len(vector_stores.data) == 1
+
+    vector_store = vector_stores.data[0]
+
+    assert vector_store.name == user_provided_id
+
+    assert vector_store.id.startswith("vs_")
+    assert vector_store.id != user_provided_id
+
+    vector_dbs = await table.list_vector_dbs()
+    assert len(vector_dbs.data) == 1
+    assert vector_dbs.data[0].identifier == vector_store.id
+
+    await table.unregister_vector_db(vector_store.id)
+
+
 async def test_openai_vector_stores_routing_table_roles(cached_disk_dist_registry):
     impl = VectorDBImpl()
     impl.openai_retrieve_vector_store = AsyncMock(return_value="OK")
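Taken together, the two new tests encode the naming contract of this breaking change: the caller-supplied `vector_db_id` is demoted to the vector store's display name, while the routing identifier becomes a generated `vs_` ID. A self-contained model of that contract (plain Python, illustrative only; `IdMapping` is not the llama-stack implementation):

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class IdMapping:
    # identifier -> user-provided name, mirroring what the tests assert
    stores: dict = field(default_factory=dict)

    def register(self, vector_db_id: str) -> str:
        identifier = f"vs_{uuid.uuid4()}"  # user-supplied ID becomes the name
        self.stores[identifier] = vector_db_id
        return identifier


mapping = IdMapping()
vs_id = mapping.register("my-custom-vector-db")
assert vs_id.startswith("vs_") and vs_id != "my-custom-vector-db"
assert mapping.stores[vs_id] == "my-custom-vector-db"
```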
@@ -164,7 +269,8 @@ async def test_openai_vector_stores_routing_table_roles(cached_disk_dist_registry):

     authorized_user = User(principal="alice", attributes={"roles": [authorized_team]})
     with request_provider_data_context({}, authorized_user):
-        _ = await table.register_vector_db(vector_db_id="vs1", embedding_model="test-model")
+        registered_vdb = await table.register_vector_db(vector_db_id="vs1", embedding_model="test-model")
+        authorized_table = registered_vdb.identifier  # Use the actual generated ID

     # Authorized reader
     with request_provider_data_context({}, authorized_user):
@@ -227,7 +333,8 @@ async def test_openai_vector_stores_routing_table_actions(cached_disk_dist_registry):
     )

     with request_provider_data_context({}, admin_user):
-        await table.register_vector_db(vector_db_id=vector_db_id, embedding_model="test-model")
+        registered_vdb = await table.register_vector_db(vector_db_id=vector_db_id, embedding_model="test-model")
+        vector_db_id = registered_vdb.identifier  # Use the actual generated ID

     read_methods = [
         (table.openai_retrieve_vector_store, (vector_db_id,), {}),
@@ -4,7 +4,6 @@
 # This source code is licensed under the terms described in the LICENSE file in
 # the root directory of this source tree.

-import sqlite3
 import tempfile
 from pathlib import Path
 from unittest.mock import patch
@@ -133,7 +132,6 @@ class TestInferenceRecording:
         # Test directory creation
         assert storage.test_dir.exists()
         assert storage.responses_dir.exists()
-        assert storage.db_path.exists()

         # Test storing and retrieving a recording
         request_hash = "test_hash_123"
@@ -147,15 +145,6 @@ class TestInferenceRecording:

         storage.store_recording(request_hash, request_data, response_data)

-        # Verify SQLite record
-        with sqlite3.connect(storage.db_path) as conn:
-            result = conn.execute("SELECT * FROM recordings WHERE request_hash = ?", (request_hash,)).fetchone()
-
-        assert result is not None
-        assert result[0] == request_hash  # request_hash
-        assert result[2] == "/v1/chat/completions"  # endpoint
-        assert result[3] == "llama3.2:3b"  # model
-
         # Verify file storage and retrieval
         retrieved = storage.find_recording(request_hash)
         assert retrieved is not None
@@ -185,10 +174,7 @@ class TestInferenceRecording:

         # Verify recording was stored
         storage = ResponseStorage(temp_storage_dir)
-        with sqlite3.connect(storage.db_path) as conn:
-            recordings = conn.execute("SELECT COUNT(*) FROM recordings").fetchone()[0]
-
-            assert recordings == 1
+        assert storage.responses_dir.exists()

     async def test_replay_mode(self, temp_storage_dir, real_openai_chat_response):
         """Test that replay mode returns stored responses without making real calls."""
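These hunks remove the SQLite index from the recording path: the JSON documents under `responses_dir` become the sole source of truth, so the tests stop asserting on `db_path` and on rows in a `recordings` table. For illustration, the shape of a purely file-backed store (our sketch; `FileRecordingStore` is not a class from this diff):

```python
import json
from pathlib import Path


class FileRecordingStore:
    """Illustrative file-backed recording store: one JSON file per request hash."""

    def __init__(self, base_dir: Path):
        self.responses_dir = Path(base_dir) / "responses"
        self.responses_dir.mkdir(parents=True, exist_ok=True)

    def store_recording(self, request_hash: str, record: dict) -> None:
        # The file on disk is the record; no side index to keep in sync.
        (self.responses_dir / f"{request_hash}.json").write_text(json.dumps(record))

    def find_recording(self, request_hash: str) -> dict | None:
        path = self.responses_dir / f"{request_hash}.json"
        return json.loads(path.read_text()) if path.exists() else None
```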
@@ -88,3 +88,10 @@ def test_nested_structures(setup_env_vars):
     }
     expected = {"key1": "test_value", "key2": ["default", "conditional"], "key3": {"nested": None}}
     assert replace_env_vars(data) == expected
+
+
+def test_explicit_strings_preserved(setup_env_vars):
+    # Explicit strings that look like numbers/booleans should remain strings
+    data = {"port": "8080", "enabled": "true", "count": "123", "ratio": "3.14"}
+    expected = {"port": "8080", "enabled": "true", "count": "123", "ratio": "3.14"}
+    assert replace_env_vars(data) == expected
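The new test pins a subtle substitution rule: values that merely look like numbers or booleans must come back from `replace_env_vars` unchanged and still typed `str`. Roughly (the import path is our assumption; the diff only shows the test body):

```python
# Assumed import path for illustration; adjust to where replace_env_vars lives.
from llama_stack.core.stack import replace_env_vars

data = {"port": "8080", "enabled": "true", "count": "123", "ratio": "3.14"}
assert replace_env_vars(data) == data  # explicit strings preserved verbatim
```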
68 uv.lock generated
@@ -1128,6 +1128,9 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/4f/72/dcbc6dbf838549b7b0c2c18c1365d2580eb7456939e4b608c3ab213fce78/geventhttpclient-2.3.4-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:9ac30c38d86d888b42bb2ab2738ab9881199609e9fa9a153eb0c66fc9188c6cb", size = 71984, upload-time = "2025-06-11T13:17:09.126Z" },
     { url = "https://files.pythonhosted.org/packages/4c/f9/74aa8c556364ad39b238919c954a0da01a6154ad5e85a1d1ab5f9f5ac186/geventhttpclient-2.3.4-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:4b802000a4fad80fa57e895009671d6e8af56777e3adf0d8aee0807e96188fd9", size = 52631, upload-time = "2025-06-11T13:17:10.061Z" },
     { url = "https://files.pythonhosted.org/packages/11/1a/bc4b70cba8b46be8b2c6ca5b8067c4f086f8c90915eb68086ab40ff6243d/geventhttpclient-2.3.4-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:461e4d9f4caee481788ec95ac64e0a4a087c1964ddbfae9b6f2dc51715ba706c", size = 51991, upload-time = "2025-06-11T13:17:11.049Z" },
+    { url = "https://files.pythonhosted.org/packages/03/3f/5ce6e003b3b24f7caf3207285831afd1a4f857ce98ac45e1fb7a6815bd58/geventhttpclient-2.3.4-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:b7e41687c74e8fbe6a665458bbaea0c5a75342a95e2583738364a73bcbf1671b", size = 114982, upload-time = "2025-08-24T12:16:50.76Z" },
+    { url = "https://files.pythonhosted.org/packages/60/16/6f9dad141b7c6dd7ee831fbcd72dd02535c57bc1ec3c3282f07e72c31344/geventhttpclient-2.3.4-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c3ea5da20f4023cf40207ce15f5f4028377ffffdba3adfb60b4c8f34925fce79", size = 115654, upload-time = "2025-08-24T12:16:52.072Z" },
+    { url = "https://files.pythonhosted.org/packages/ba/52/9b516a2ff423d8bd64c319e1950a165ceebb552781c5a88c1e94e93e8713/geventhttpclient-2.3.4-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:91f19a8a6899c27867dbdace9500f337d3e891a610708e86078915f1d779bf53", size = 121672, upload-time = "2025-08-24T12:16:53.361Z" },
     { url = "https://files.pythonhosted.org/packages/b0/f5/8d0f1e998f6d933c251b51ef92d11f7eb5211e3cd579018973a2b455f7c5/geventhttpclient-2.3.4-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:41f2dcc0805551ea9d49f9392c3b9296505a89b9387417b148655d0d8251b36e", size = 119012, upload-time = "2025-06-11T13:17:11.956Z" },
     { url = "https://files.pythonhosted.org/packages/ea/0e/59e4ab506b3c19fc72e88ca344d150a9028a00c400b1099637100bec26fc/geventhttpclient-2.3.4-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:62f3a29bf242ecca6360d497304900683fd8f42cbf1de8d0546c871819251dad", size = 124565, upload-time = "2025-06-11T13:17:12.896Z" },
     { url = "https://files.pythonhosted.org/packages/39/5d/dcbd34dfcda0c016b4970bd583cb260cc5ebfc35b33d0ec9ccdb2293587a/geventhttpclient-2.3.4-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:8714a3f2c093aeda3ffdb14c03571d349cb3ed1b8b461d9f321890659f4a5dbf", size = 115573, upload-time = "2025-06-11T13:17:13.937Z" },
@@ -1141,6 +1144,9 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/ff/ad/132fddde6e2dca46d6a86316962437acd2bfaeb264db4e0fae83c529eb04/geventhttpclient-2.3.4-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:be64c5583884c407fc748dedbcb083475d5b138afb23c6bc0836cbad228402cc", size = 71967, upload-time = "2025-06-11T13:17:22.121Z" },
     { url = "https://files.pythonhosted.org/packages/f4/34/5e77d9a31d93409a8519cf573843288565272ae5a016be9c9293f56c50a1/geventhttpclient-2.3.4-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:15b2567137734183efda18e4d6245b18772e648b6a25adea0eba8b3a8b0d17e8", size = 52632, upload-time = "2025-06-11T13:17:23.016Z" },
     { url = "https://files.pythonhosted.org/packages/47/d2/cf0dbc333304700e68cee9347f654b56e8b0f93a341b8b0d027ee96800d6/geventhttpclient-2.3.4-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:a4bca1151b8cd207eef6d5cb3c720c562b2aa7293cf113a68874e235cfa19c31", size = 51980, upload-time = "2025-06-11T13:17:23.933Z" },
+    { url = "https://files.pythonhosted.org/packages/27/6e/049e685fc43e2e966c83f24b3187f6a6736103f0fc51118140f4ca1793d4/geventhttpclient-2.3.4-cp313-cp313-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:8a681433e2f3d4b326d8b36b3e05b787b2c6dd2a5660a4a12527622278bf02ed", size = 114998, upload-time = "2025-08-24T12:16:54.72Z" },
+    { url = "https://files.pythonhosted.org/packages/24/13/1d08cf0400bf0fe0bb21e70f3f5fab2130aecef962b4362b7a1eba3cd738/geventhttpclient-2.3.4-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:736aa8e9609e4da40aeff0dbc02fea69021a034f4ed1e99bf93fc2ca83027b64", size = 115690, upload-time = "2025-08-24T12:16:56.328Z" },
+    { url = "https://files.pythonhosted.org/packages/fd/bc/15d22882983cac573859d274783c5b0a95881e553fc312e7b646be432668/geventhttpclient-2.3.4-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:9d477ae1f5d42e1ee6abbe520a2e9c7f369781c3b8ca111d1f5283c1453bc825", size = 121681, upload-time = "2025-08-24T12:16:58.344Z" },
     { url = "https://files.pythonhosted.org/packages/ec/5b/c0c30ccd9d06c603add3f2d6abd68bd98430ee9730dc5478815759cf07f7/geventhttpclient-2.3.4-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9b50d9daded5d36193d67e2fc30e59752262fcbbdc86e8222c7df6b93af0346a", size = 118987, upload-time = "2025-06-11T13:17:24.97Z" },
     { url = "https://files.pythonhosted.org/packages/4f/56/095a46af86476372064128162eccbd2ba4a7721503759890d32ea701d5fd/geventhttpclient-2.3.4-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:fe705e7656bc6982a463a4ed7f9b1db8c78c08323f1d45d0d1d77063efa0ce96", size = 124519, upload-time = "2025-06-11T13:17:25.933Z" },
     { url = "https://files.pythonhosted.org/packages/ae/12/7c9ba94b58f7954a83d33183152ce6bf5bda10c08ebe47d79a314cd33e29/geventhttpclient-2.3.4-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:69668589359db4cbb9efa327dda5735d1e74145e6f0a9ffa50236d15cf904053", size = 115574, upload-time = "2025-06-11T13:17:27.331Z" },
@@ -1151,6 +1157,24 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/ca/36/9065bb51f261950c42eddf8718e01a9ff344d8082e31317a8b6677be9bd6/geventhttpclient-2.3.4-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:8d1d0db89c1c8f3282eac9a22fda2b4082e1ed62a2107f70e3f1de1872c7919f", size = 112245, upload-time = "2025-06-11T13:17:32.331Z" },
     { url = "https://files.pythonhosted.org/packages/21/7e/08a615bec095c288f997951e42e48b262d43c6081bef33cfbfad96ab9658/geventhttpclient-2.3.4-cp313-cp313-win32.whl", hash = "sha256:4e492b9ab880f98f8a9cc143b96ea72e860946eae8ad5fb2837cede2a8f45154", size = 48360, upload-time = "2025-06-11T13:17:33.349Z" },
     { url = "https://files.pythonhosted.org/packages/ec/19/ef3cb21e7e95b14cfcd21e3ba7fe3d696e171682dfa43ab8c0a727cac601/geventhttpclient-2.3.4-cp313-cp313-win_amd64.whl", hash = "sha256:72575c5b502bf26ececccb905e4e028bb922f542946be701923e726acf305eb6", size = 48956, upload-time = "2025-06-11T13:17:34.956Z" },
+    { url = "https://files.pythonhosted.org/packages/06/45/c41697c7d0cae17075ba535fb901985c2873461a9012e536de679525e28d/geventhttpclient-2.3.4-cp314-cp314-macosx_10_13_universal2.whl", hash = "sha256:503db5dd0aa94d899c853b37e1853390c48c7035132f39a0bab44cbf95d29101", size = 71999, upload-time = "2025-08-24T12:17:00.419Z" },
+    { url = "https://files.pythonhosted.org/packages/5d/f7/1d953cafecf8f1681691977d9da9b647d2e02996c2431fb9b718cfdd3013/geventhttpclient-2.3.4-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:389d3f83316220cfa2010f41401c140215a58ddba548222e7122b2161e25e391", size = 52656, upload-time = "2025-08-24T12:17:01.337Z" },
+    { url = "https://files.pythonhosted.org/packages/5c/ca/4bd19040905e911dd8771a4ab74630eadc9ee9072b01ab504332dada2619/geventhttpclient-2.3.4-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:20c65d404fa42c95f6682831465467dff317004e53602c01f01fbd5ba1e56628", size = 51978, upload-time = "2025-08-24T12:17:02.282Z" },
+    { url = "https://files.pythonhosted.org/packages/11/01/c457257ee41236347dac027e63289fa3f92f164779458bd244b376122bf6/geventhttpclient-2.3.4-cp314-cp314-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:2574ee47ff6f379e9ef124e2355b23060b81629f1866013aa975ba35df0ed60b", size = 115033, upload-time = "2025-08-24T12:17:03.272Z" },
+    { url = "https://files.pythonhosted.org/packages/cc/c1/ef3ddc24b402eb3caa19dacbcd08d7129302a53d9b9109c84af1ea74e31a/geventhttpclient-2.3.4-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:fecf1b735591fb21ea124a374c207104a491ad0d772709845a10d5faa07fa833", size = 115762, upload-time = "2025-08-24T12:17:04.288Z" },
+    { url = "https://files.pythonhosted.org/packages/a9/97/8dca246262e9a1ebd639120151db00e34b7d10f60bdbca8481878b91801a/geventhttpclient-2.3.4-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:44e9ba810c28f9635e5c4c9cf98fc6470bad5a3620d8045d08693f7489493a3c", size = 121757, upload-time = "2025-08-24T12:17:05.273Z" },
+    { url = "https://files.pythonhosted.org/packages/10/7b/41bff3cbdeff3d06d45df3c61fa39cd25e60fa9d21c709ec6aeb58e9b58f/geventhttpclient-2.3.4-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:501d5c69adecd5eaee3c22302006f6c16aa114139640873b72732aa17dab9ee7", size = 111747, upload-time = "2025-08-24T12:17:06.585Z" },
+    { url = "https://files.pythonhosted.org/packages/64/e6/3732132fda94082ec8793e3ae0d4d7fff6c1cb8e358e9664d1589499f4b1/geventhttpclient-2.3.4-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:709f557138fb84ed32703d42da68f786459dab77ff2c23524538f2e26878d154", size = 118487, upload-time = "2025-08-24T12:17:07.816Z" },
+    { url = "https://files.pythonhosted.org/packages/93/29/d48d119dee6c42e066330860186df56a80d4e76d2821a6c706ead49006d7/geventhttpclient-2.3.4-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:b8b86815a30e026c6677b89a5a21ba5fd7b69accf8f0e9b83bac123e4e9f3b31", size = 112198, upload-time = "2025-08-24T12:17:08.867Z" },
+    { url = "https://files.pythonhosted.org/packages/56/48/556adff8de1bd3469b58394f441733bb3c76cb22c2600cf2ee753e73d47f/geventhttpclient-2.3.4-cp314-cp314t-macosx_10_13_universal2.whl", hash = "sha256:4371b1b1afc072ad2b0ff5a8929d73ffd86d582908d3e9e8d7911dc027b1b3a6", size = 72354, upload-time = "2025-08-24T12:17:10.671Z" },
+    { url = "https://files.pythonhosted.org/packages/7c/77/f1b32a91350382978cde0ddfee4089b94e006eb0f3e7297196d9d5451217/geventhttpclient-2.3.4-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:6409fcda1f40d66eab48afc218b4c41e45a95c173738d10c50bc69c7de4261b9", size = 52835, upload-time = "2025-08-24T12:17:12.164Z" },
+    { url = "https://files.pythonhosted.org/packages/d3/06/124f95556e0d5b4c417ec01fc30d91a3e4fe4524a44d2f629a1b1a721984/geventhttpclient-2.3.4-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:142870c2efb6bd0a593dcd75b83defb58aeb72ceaec4c23186785790bd44a311", size = 52165, upload-time = "2025-08-24T12:17:13.465Z" },
+    { url = "https://files.pythonhosted.org/packages/76/9c/0850256e4461b0a90f2cf5c8156ea8f97e93a826aa76d7be70c9c6d4ba0f/geventhttpclient-2.3.4-cp314-cp314t-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:3a74f7b926badb3b1d47ea987779cb83523a406e89203070b58b20cf95d6f535", size = 117929, upload-time = "2025-08-24T12:17:14.477Z" },
+    { url = "https://files.pythonhosted.org/packages/ca/55/3b54d0c0859efac95ba2649aeb9079a3523cdd7e691549ead2862907dc7d/geventhttpclient-2.3.4-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:2a8cde016e5ea6eb289c039b6af8dcef6c3ee77f5d753e57b48fe2555cdeacca", size = 119584, upload-time = "2025-08-24T12:17:15.709Z" },
+    { url = "https://files.pythonhosted.org/packages/84/df/84ce132a0eb2b6d4f86e68a828e3118419cb0411cae101e4bad256c3f321/geventhttpclient-2.3.4-cp314-cp314t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:5aa16f2939a508667093b18e47919376f7db9a9acbe858343173c5a58e347869", size = 125388, upload-time = "2025-08-24T12:17:16.915Z" },
+    { url = "https://files.pythonhosted.org/packages/e8/4f/8156b9f6e25e4f18a60149bd2925f56f1ed7a1f8d520acb5a803536adadd/geventhttpclient-2.3.4-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:ffe87eb7f1956357c2144a56814b5ffc927cbb8932f143a0351c78b93129ebbc", size = 115214, upload-time = "2025-08-24T12:17:17.945Z" },
+    { url = "https://files.pythonhosted.org/packages/f6/5a/b01657605c16ac4555b70339628a33fc7ca41ace58da167637ef72ad0a8e/geventhttpclient-2.3.4-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:5ee758e37215da9519cea53105b2a078d8bc0a32603eef2a1f9ab551e3767dee", size = 121862, upload-time = "2025-08-24T12:17:18.97Z" },
+    { url = "https://files.pythonhosted.org/packages/84/ca/c4e36a9b1bcce9958d8886aa4f7b262c8e9a7c43a284f2d79abfc9ba715d/geventhttpclient-2.3.4-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:416cc70adb3d34759e782d2e120b4432752399b85ac9758932ecd12274a104c3", size = 114999, upload-time = "2025-08-24T12:17:19.978Z" },
 ]

 [[package]]
@@ -1743,7 +1767,7 @@ wheels = [

 [[package]]
 name = "llama-stack"
-version = "0.2.18"
+version = "0.2.19"
 source = { editable = "." }
 dependencies = [
     { name = "aiohttp" },
@@ -1881,8 +1905,8 @@ requires-dist = [
     { name = "jinja2", specifier = ">=3.1.6" },
     { name = "jsonschema" },
     { name = "llama-api-client", specifier = ">=0.1.2" },
-    { name = "llama-stack-client", specifier = ">=0.2.18" },
-    { name = "llama-stack-client", marker = "extra == 'ui'", specifier = ">=0.2.18" },
+    { name = "llama-stack-client", specifier = ">=0.2.19" },
+    { name = "llama-stack-client", marker = "extra == 'ui'", specifier = ">=0.2.19" },
     { name = "openai", specifier = ">=1.99.6,<1.100.0" },
     { name = "opentelemetry-exporter-otlp-proto-http", specifier = ">=1.30.0" },
     { name = "opentelemetry-sdk", specifier = ">=1.30.0" },
@@ -1989,7 +2013,7 @@ unit = [

 [[package]]
 name = "llama-stack-client"
-version = "0.2.18"
+version = "0.2.19"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
     { name = "anyio" },
@@ -2008,9 +2032,9 @@ dependencies = [
     { name = "tqdm" },
     { name = "typing-extensions" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/69/da/5e5a745495f8a2b8ef24fc4d01fe9031aa2277c36447cb22192ec8c8cc1e/llama_stack_client-0.2.18.tar.gz", hash = "sha256:860c885c9e549445178ac55cc9422e6e2a91215ac7aff5aaccfb42f3ce07e79e", size = 277284, upload-time = "2025-08-19T22:12:09.106Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/14/e4/72683c10188ae93e97551ab6eeac725e46f13ec215618532505a7d91bf2b/llama_stack_client-0.2.19.tar.gz", hash = "sha256:6c857e528b83af7821120002ebe4d3db072fd9f7bf867a152a34c70fe606833f", size = 318325, upload-time = "2025-08-26T21:54:20.592Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/0a/e4/e97f8fdd8a07aa1efc7f7e37b5657d84357b664bf70dd1885a437edc0699/llama_stack_client-0.2.18-py3-none-any.whl", hash = "sha256:90f827d5476f7fc15fd993f1863af6a6e72bd064646bf6a99435eb43a1327f70", size = 367586, upload-time = "2025-08-19T22:12:07.899Z" },
+    { url = "https://files.pythonhosted.org/packages/51/51/c8dde9fae58193a539eac700502876d8edde8be354c2784ff7b707a47432/llama_stack_client-0.2.19-py3-none-any.whl", hash = "sha256:478565a54541ca03ca9f8fe2019f4136f93ab6afe9591bdd44bc6dde6ddddbd9", size = 369905, upload-time = "2025-08-26T21:54:18.929Z" },
 ]

 [[package]]
@@ -4713,9 +4737,9 @@ dependencies = [
     { name = "typing-extensions", marker = "sys_platform == 'darwin'" },
 ]
 wheels = [
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:a47b7986bee3f61ad217d8a8ce24605809ab425baf349f97de758815edd2ef54" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0-cp313-cp313t-macosx_14_0_arm64.whl", hash = "sha256:fbe2e149c5174ef90d29a5f84a554dfaf28e003cb4f61fa2c8c024c17ec7ca58" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0-cp313-none-macosx_11_0_arm64.whl", hash = "sha256:057efd30a6778d2ee5e2374cd63a63f63311aa6f33321e627c655df60abdd390" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0-cp312-none-macosx_11_0_arm64.whl" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0-cp313-cp313t-macosx_14_0_arm64.whl" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0-cp313-none-macosx_11_0_arm64.whl" },
 ]

 [[package]]
@@ -4738,19 +4762,19 @@ dependencies = [
     { name = "typing-extensions", marker = "sys_platform != 'darwin'" },
 ]
 wheels = [
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp312-cp312-linux_s390x.whl", hash = "sha256:0e34e276722ab7dd0dffa9e12fe2135a9b34a0e300c456ed7ad6430229404eb5" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp312-cp312-linux_s390x.whl" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:610f600c102386e581327d5efc18c0d6edecb9820b4140d26163354a99cd800d" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp312-cp312-manylinux_2_28_aarch64.whl" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:cb9a8ba8137ab24e36bf1742cb79a1294bd374db570f09fc15a5e1318160db4e" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp312-cp312-manylinux_2_28_x86_64.whl" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp312-cp312-win_amd64.whl", hash = "sha256:2be20b2c05a0cce10430cc25f32b689259640d273232b2de357c35729132256d" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp312-cp312-win_amd64.whl" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp312-cp312-win_arm64.whl", hash = "sha256:99fc421a5d234580e45957a7b02effbf3e1c884a5dd077afc85352c77bf41434" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp312-cp312-win_arm64.whl" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-linux_s390x.whl", hash = "sha256:8b5882276633cf91fe3d2d7246c743b94d44a7e660b27f1308007fdb1bb89f7d" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-linux_s390x.whl" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:a5064b5e23772c8d164068cc7c12e01a75faf7b948ecd95a0d4007d7487e5f25" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-manylinux_2_28_aarch64.whl" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:8f81dedb4c6076ec325acc3b47525f9c550e5284a18eae1d9061c543f7b6e7de" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-manylinux_2_28_x86_64.whl" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-win_amd64.whl", hash = "sha256:e1ee1b2346ade3ea90306dfbec7e8ff17bc220d344109d189ae09078333b0856" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-win_amd64.whl" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-win_arm64.whl", hash = "sha256:64c187345509f2b1bb334feed4666e2c781ca381874bde589182f81247e61f88" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-win_arm64.whl" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:af81283ac671f434b1b25c95ba295f270e72db1fad48831eb5e4748ff9840041" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313t-manylinux_2_28_aarch64.whl" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:a9dbb6f64f63258bc811e2c0c99640a81e5af93c531ad96e95c5ec777ea46dab" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313t-manylinux_2_28_x86_64.whl" },
-    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313t-win_amd64.whl", hash = "sha256:6d93a7165419bc4b2b907e859ccab0dea5deeab261448ae9a5ec5431f14c0e64" },
+    { url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313t-win_amd64.whl" },
 ]

 [[package]]