mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-14 14:27:58 +00:00
# What does this PR do?

Completes the refactoring started in the previous commit by:

1. **Fix library client** (critical): add logic to detect Pydantic model parameters and construct them properly from request bodies. The key fix is to NOT exclude any params when converting the body for Pydantic models; we need all fields to pass to the Pydantic constructor.
   - Before: `_convert_body` excluded all params, leaving the body empty for Pydantic construction.
   - After: check for Pydantic params first, skip the exclusion, and construct the model with the full body.
2. **Update remaining providers** to use the new Pydantic-based signatures:
   - `litellm_openai_mixin`: extract extra fields via `__pydantic_extra__`
   - `databricks`: use a `TYPE_CHECKING` import for the params type
   - `llama_openai_compat`: use a `TYPE_CHECKING` import for the params type
   - `sentence_transformers`: update method signatures to use `params`
3. **Update unit tests** to use the new Pydantic signature:
   - `test_openai_mixin.py`: use `OpenAIChatCompletionRequestParams`

This fixes test failures where the library client was trying to construct Pydantic models from empty dictionaries. The previous fix had a bug: it called `_convert_body()`, which only keeps fields that match function parameter names. For Pydantic methods with the signature `openai_chat_completion(params: OpenAIChatCompletionRequestParams)`, the signature only has `params`, but the body has `model`, `messages`, etc., so `_convert_body()` returned an empty dict.

Fix: skip `_convert_body()` entirely for Pydantic params and use the raw body directly to construct the Pydantic model (after stripping NOT_GIVENs). This properly fixes the `ValidationError` where required fields were missing.

The streaming code path (`_call_streaming`) had the same issue as the non-streaming path: it called `_convert_body()`, which returned an empty dict for Pydantic params.
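The detect-then-construct logic described above can be sketched roughly as follows. All names here (`ChatCompletionParams`, `construct_call_kwargs`, the `NOT_GIVEN` sentinel) are illustrative stand-ins, not the actual llama-stack identifiers:

```python
# Sketch (illustrative names): detect a single Pydantic model parameter on
# the target function and build it from the FULL raw body, instead of
# filtering body keys by parameter name (which would yield an empty dict).
import inspect

from pydantic import BaseModel

NOT_GIVEN = object()  # sentinel standing in for "field was omitted"


class ChatCompletionParams(BaseModel):
    """Hypothetical request-params model."""
    model: str
    messages: list[dict]
    stream: bool = False


def endpoint(params: ChatCompletionParams) -> ChatCompletionParams:
    return params


def construct_call_kwargs(func, body: dict) -> dict:
    """If func takes a Pydantic model parameter, construct the model from
    the raw body (minus NOT_GIVEN sentinels) rather than converting it."""
    sig = inspect.signature(func)
    for name, param in sig.parameters.items():
        ann = param.annotation
        if inspect.isclass(ann) and issubclass(ann, BaseModel):
            clean = {k: v for k, v in body.items() if v is not NOT_GIVEN}
            return {name: ann(**clean)}  # pass ALL fields to the constructor
    # Fallback: the old behavior, keep only keys matching parameter names.
    return {k: v for k, v in body.items() if k in sig.parameters}


body = {
    "model": "llama3.2:3b",
    "messages": [{"role": "user", "content": "hi"}],
    "stream": NOT_GIVEN,
}
kwargs = construct_call_kwargs(endpoint, body)
result = endpoint(**kwargs)
print(result.model)  # llama3.2:3b
```

The key point is the early return: once a Pydantic parameter is found, the name-based filtering that emptied the body is never applied.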
Applied the same fix as commit 7476c0ae:

- Detect Pydantic model parameters before body conversion
- Skip `_convert_body()` for Pydantic params
- Construct the Pydantic model directly from the raw body (after stripping NOT_GIVENs)

This fixes streaming endpoints like `openai_chat_completion` with `stream=True`.
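For the `__pydantic_extra__` extraction mentioned for `litellm_openai_mixin`, a minimal illustration is below, assuming a hypothetical `RequestParams` model configured with `extra="allow"` (which is what makes Pydantic v2 retain unknown fields):

```python
# Sketch (illustrative model): with extra="allow", unknown fields survive
# validation and land in __pydantic_extra__, where a provider can pull
# them out and forward them as provider-specific options.
from pydantic import BaseModel, ConfigDict


class RequestParams(BaseModel):
    model_config = ConfigDict(extra="allow")  # keep unrecognized fields
    model: str
    messages: list[dict]


p = RequestParams(model="m", messages=[], top_k=40, repeat_penalty=1.1)
extra = dict(p.__pydantic_extra__ or {})
print(extra)  # {'top_k': 40, 'repeat_penalty': 1.1}
```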
59 lines
1.8 KiB
JSON
{
  "test_id": "tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_empty_list_error[llama_stack_client-emb=ollama/all-minilm:l6-v2]",
  "request": {
    "method": "POST",
    "url": "http://localhost:11434/api/ps",
    "headers": {},
    "body": {},
    "endpoint": "/api/ps",
    "model": ""
  },
  "response": {
    "body": {
      "__type__": "ollama._types.ProcessResponse",
      "__data__": {
        "models": [
          {
            "model": "all-minilm:l6-v2",
            "name": "all-minilm:l6-v2",
            "digest": "1b226e2802dbb772b5fc32a58f103ca1804ef7501331012de126ab22f67475ef",
            "expires_at": "2025-10-09T11:53:04.860262-07:00",
            "size": 585846784,
            "size_vram": 585846784,
            "details": {
              "parent_model": "",
              "format": "gguf",
              "family": "bert",
              "families": [
                "bert"
              ],
              "parameter_size": "23M",
              "quantization_level": "F16"
            },
            "context_length": 256
          },
          {
            "model": "llama3.2:3b-instruct-fp16",
            "name": "llama3.2:3b-instruct-fp16",
            "digest": "195a8c01d91ec3cb1e0aad4624a51f2602c51fa7d96110f8ab5a20c84081804d",
            "expires_at": "2025-10-09T11:53:03.729916-07:00",
            "size": 7919570944,
            "size_vram": 7919570944,
            "details": {
              "parent_model": "",
              "format": "gguf",
              "family": "llama",
              "families": [
                "llama"
              ],
              "parameter_size": "3.2B",
              "quantization_level": "F16"
            },
            "context_length": 4096
          }
        ]
      }
    },
    "is_streaming": false
  }
}