tests: adapt openai test for watsonx

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-12 20:12:33 +00:00

The tests in tests/integration/inference/test_openai_completion.py fail
against watsonx in a few scenarios, such as:

tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n

FAILED tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_02] - AssertionError: assert 1 == 2
 +  where 1 = len({0: 'thethenamenameofofthetheususcapitalcapitalisiswashingtonwashington,,dd.c.c..'})

test_openai_completion_logprobs
E   openai.BadRequestError: Error code: 400 - {'error': {'detail': {'errors': [{'loc': ['body', 'logprobs'], 'msg': 'Input should be a valid boolean, unable to interpret input', 'type': 'bool_parsing'}]}}}

test_openai_completion_stop_sequence
E   openai.BadRequestError: Error code: 400 - {'detail': 'litellm.BadRequestError: OpenAIException - {"errors":[{"code":"json_type_error","message":"Json field type error: CommonTextChatParameters.stop must be an array, and the element must be of type string","more_info":"https://cloud.ibm.com/apidocs/watsonx-ai#text-chat"}],"trace":"f758b3bbd4f357aa9b16f3dc5ee1170e","status_code":400}'}
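
The `assert 1 == 2` failure in the first case is easier to read with a small simulation. The sketch below is not the test's actual code: it assumes a consumer that concatenates streamed text per choice index, and the `Chunk` type and token list are made up for illustration. If both choices of an `n=2` request arrive under index 0, which is what the error output suggests, the result is a single key holding doubled tokens.

```python
from collections import defaultdict
from typing import NamedTuple


class Chunk(NamedTuple):
    """Hypothetical stand-in for one streamed delta: (choice index, text)."""

    index: int
    text: str


def group_by_choice(chunks):
    """Concatenate streamed text per choice index."""
    grouped = defaultdict(str)
    for chunk in chunks:
        grouped[chunk.index] += chunk.text
    return dict(grouped)


tokens = ["the", "name", "of"]

# Expected behaviour with n=2: each choice streams under its own index.
well_behaved = [Chunk(i, t) for t in tokens for i in (0, 1)]

# Behaviour the failure suggests: both choices arrive under index 0.
collapsed = [Chunk(0, t) for t in tokens for _ in range(2)]

print(group_by_choice(well_behaved))  # two keys, one per choice
print(group_by_choice(collapsed))     # one key, every token doubled
```

With the collapsed stream, `len(...)` of the grouped result is 1 rather than 2, matching the assertion in the failure.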

So skip these specific cases for watsonx; the rest of the suite still
gives us some OpenAI coverage through litellm.
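
The skip helpers in the test file each hard-code a tuple of provider types. As a rough sketch of the same idea gathered in one place, the snippet below keeps a feature-to-providers table and returns a skip reason. The `UNSUPPORTED` and `MESSAGES` tables, the feature keys, and `skip_reason` are hypothetical names, not part of the test file; the provider types are only those visible in this commit's diff, and the messages are copied from the test output.

```python
# Hypothetical table: feature key -> provider types that should be skipped.
# Only providers visible in this commit's diff are listed; keys are invented.
UNSUPPORTED = {
    "completions_logprobs": {"remote::ollama", "remote::watsonx"},
    "n_param": {"remote::cerebras", "remote::databricks", "remote::watsonx"},
    "completions_stop_sequence": {"remote::watsonx"},
}

# Message suffixes copied from the skip reasons in the test output.
MESSAGES = {
    "completions_logprobs": "doesn't support /v1/completions logprobs.",
    "n_param": "doesn't support n param.",
    "completions_stop_sequence": "doesn't support /v1/completions stop sequence.",
}


def skip_reason(feature, provider_type, model_id):
    """Return a skip message, or None if the test should run."""
    if provider_type in UNSUPPORTED.get(feature, set()):
        return f"Model {model_id} hosted by {provider_type} {MESSAGES[feature]}"
    return None


print(skip_reason("n_param", "remote::watsonx", "meta-llama/llama-3-3-70b-instruct"))
```

In a test, a non-`None` result would be passed to `pytest.skip(...)`; returning the reason instead of skipping directly keeps the lookup easy to check on its own.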

Now tests pass:

```
INFO     2025-10-14 14:20:17,115 tests.integration.conftest:50 tests: Test stack config type: library_client
         (stack_config=None)
======================================================== test session starts =========================================================
platform darwin -- Python 3.12.8, pytest-8.4.2, pluggy-1.6.0 -- /Users/leseb/Documents/AI/llama-stack/.venv/bin/python3
cachedir: .pytest_cache
metadata: {'Python': '3.12.8', 'Platform': 'macOS-26.0.1-arm64-arm-64bit', 'Packages': {'pytest': '8.4.2', 'pluggy': '1.6.0'}, 'Plugins': {'anyio': '4.9.0', 'html': '4.1.1', 'socket': '0.7.0', 'asyncio': '1.1.0', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'cov': '6.2.1', 'nbval': '0.11.0'}}
rootdir: /Users/leseb/Documents/AI/llama-stack
configfile: pyproject.toml
plugins: anyio-4.9.0, html-4.1.1, socket-0.7.0, asyncio-1.1.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, cov-6.2.1, nbval-0.11.0
asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 32 items

tests/integration/inference/test_openai_completion.py::test_openai_completion_non_streaming[txt=meta-llama/llama-3-3-70b-instruct-inference:completion:sanity] PASSED [  3%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_non_streaming_suffix[txt=meta-llama/llama-3-3-70b-instruct-inference:completion:suffix] SKIPPED [  6%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_streaming[txt=meta-llama/llama-3-3-70b-instruct-inference:completion:sanity] PASSED [  9%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=meta-llama/llama-3-3-70b-instruct] SKIPPED [ 12%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[openai_client-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:non_streaming_01] PASSED [ 15%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[openai_client-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_01] PASSED [ 18%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[openai_client-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_01] SKIPPED [ 21%]
tests/integration/inference/test_openai_completion.py::test_inference_store[openai_client-txt=meta-llama/llama-3-3-70b-instruct-True] PASSED [ 25%]
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=meta-llama/llama-3-3-70b-instruct-True] PASSED [ 28%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming_with_file[txt=meta-llama/llama-3-3-70b-instruct] SKIPPED [ 31%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_stop_sequence[txt=meta-llama/llama-3-3-70b-instruct-inference:completion:stop_sequence] SKIPPED [ 34%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_logprobs[txt=meta-llama/llama-3-3-70b-instruct-inference:completion:log_probs] SKIPPED [ 37%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_logprobs_streaming[txt=meta-llama/llama-3-3-70b-instruct-inference:completion:log_probs] SKIPPED [ 40%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_with_tools[txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling] PASSED [ 43%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_with_tools_and_streaming[txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling] PASSED [ 46%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_with_tool_choice_none[txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling] PASSED [ 50%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_structured_output[txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:structured_output] PASSED [ 53%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[openai_client-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:non_streaming_02] PASSED [ 56%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[openai_client-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_02] PASSED [ 59%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[openai_client-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_02] SKIPPED [ 62%]
tests/integration/inference/test_openai_completion.py::test_inference_store[openai_client-txt=meta-llama/llama-3-3-70b-instruct-False] PASSED [ 65%]
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=meta-llama/llama-3-3-70b-instruct-False] PASSED [ 68%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:non_streaming_01] PASSED [ 71%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_01] PASSED [ 75%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_01] SKIPPED [ 78%]
tests/integration/inference/test_openai_completion.py::test_inference_store[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-True] PASSED [ 81%]
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-True] PASSED [ 84%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:non_streaming_02] PASSED [ 87%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_02] PASSED [ 90%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_02] SKIPPED [ 93%]
tests/integration/inference/test_openai_completion.py::test_inference_store[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-False] PASSED [ 96%]
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-False] PASSED [100%]

======================================================== slowest 10 durations ========================================================
5.97s call     tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_with_tool_choice_none[txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling]
3.39s call     tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:non_streaming_02]
3.26s call     tests/integration/inference/test_openai_completion.py::test_inference_store[openai_client-txt=meta-llama/llama-3-3-70b-instruct-True]
2.64s call     tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_with_tools_and_streaming[txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling]
1.78s call     tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_structured_output[txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:structured_output]
1.73s call     tests/integration/inference/test_openai_completion.py::test_openai_completion_streaming[txt=meta-llama/llama-3-3-70b-instruct-inference:completion:sanity]
1.58s call     tests/integration/inference/test_openai_completion.py::test_inference_store[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-True]
1.51s call     tests/integration/inference/test_openai_completion.py::test_openai_completion_non_streaming[txt=meta-llama/llama-3-3-70b-instruct-inference:completion:sanity]
1.41s call     tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[client_with_models-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_02]
1.20s call     tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[openai_client-txt=meta-llama/llama-3-3-70b-instruct-inference:chat_completion:non_streaming_02]
====================================================== short test summary info =======================================================
SKIPPED [1] tests/integration/inference/test_openai_completion.py:85: Suffix is not supported for the model: meta-llama/llama-3-3-70b-instruct.
SKIPPED [1] tests/integration/inference/test_openai_completion.py:135: Model meta-llama/llama-3-3-70b-instruct hosted by remote::watsonx doesn't support vllm extra_body parameters.
SKIPPED [4] tests/integration/inference/test_openai_completion.py:115: Model meta-llama/llama-3-3-70b-instruct hosted by remote::watsonx doesn't support n param.
SKIPPED [1] tests/integration/inference/test_openai_completion.py:141: Model meta-llama/llama-3-3-70b-instruct hosted by remote::watsonx doesn't support chat completion calls with base64 encoded files.
SKIPPED [1] tests/integration/inference/test_openai_completion.py:514: Model meta-llama/llama-3-3-70b-instruct hosted by remote::watsonx doesn't support /v1/completions stop sequence.
SKIPPED [2] tests/integration/inference/test_openai_completion.py:72: Model meta-llama/llama-3-3-70b-instruct hosted by remote::watsonx doesn't support /v1/completions logprobs.
============================================ 22 passed, 10 skipped, 2 warnings in 35.11s =============================================
```

Signed-off-by: Sébastien Han <seb@redhat.com>

Sébastien Han

2025-10-14 14:06:51 +02:00

parent effe7609a9

commit 53eda78993

1 changed file with 9 additions and 2 deletions

tests/integration/inference/test_openai_completion.py

@@ -58,7 +58,6 @@ def skip_if_model_doesnt_support_openai_completion(client_with_models, model_id)
         #  does not work with the specified model, gpt-5-mini. Please choose different model and try
         #  again. You can learn more about which models can be used with each operation here:
         #  https://go.microsoft.com/fwlink/?linkid=2197993.'}}"}
-        "remote::watsonx",  # return 404 when hitting the /openai/v1 endpoint
         "remote::llama-openai-compat",
     ):
         pytest.skip(f"Model {model_id} hosted by {provider.provider_type} doesn't support OpenAI completions.")

@@ -68,6 +67,7 @@ def skip_if_doesnt_support_completions_logprobs(client_with_models, model_id):
     provider_type = provider_from_model(client_with_models, model_id).provider_type
     if provider_type in (
         "remote::ollama",  # logprobs is ignored
+        "remote::watsonx",
     ):
         pytest.skip(f"Model {model_id} hosted by {provider_type} doesn't support /v1/completions logprobs.")

@@ -110,6 +110,7 @@ def skip_if_doesnt_support_n(client_with_models, model_id):
         # Error code 400 - {'message': '"n" > 1 is not currently supported', 'type': 'invalid_request_error', 'param': 'n', 'code': 'wrong_api_format'}
         "remote::cerebras",
         "remote::databricks",  # Bad request: parameter "n" must be equal to 1 for streaming mode
+        "remote::watsonx",
     ):
         pytest.skip(f"Model {model_id} hosted by {provider.provider_type} doesn't support n param.")

@@ -124,7 +125,6 @@ def skip_if_model_doesnt_support_openai_chat_completion(client_with_models, mode
         "remote::databricks",
         "remote::cerebras",
         "remote::runpod",
-        "remote::watsonx",  # watsonx returns 404 when hitting the /openai/v1 endpoint
     ):
         pytest.skip(f"Model {model_id} hosted by {provider.provider_type} doesn't support OpenAI chat completions.")

@@ -508,6 +508,12 @@ def test_openai_chat_completion_non_streaming_with_file(openai_client, client_wi
     assert "hello world" in normalized_content
+def skip_if_doesnt_support_completions_stop_sequence(client_with_models, model_id):
+    provider_type = provider_from_model(client_with_models, model_id).provider_type
+    if provider_type in ("remote::watsonx",):  # openai.BadRequestError: Error code: 400
+        pytest.skip(f"Model {model_id} hosted by {provider_type} doesn't support /v1/completions stop sequence.")
 @pytest.mark.parametrize(
     "test_case",
     [

@@ -516,6 +522,7 @@ def test_openai_chat_completion_non_streaming_with_file(openai_client, client_wi
 )
 def test_openai_completion_stop_sequence(client_with_models, openai_client, text_model_id, test_case):
     skip_if_model_doesnt_support_openai_completion(client_with_models, text_model_id)
+    skip_if_doesnt_support_completions_stop_sequence(client_with_models, text_model_id)
     tc = TestCase(test_case)
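
The stop-sequence skip exists because watsonx rejected the request with `CommonTextChatParameters.stop must be an array`, whereas the OpenAI API also accepts a single string for `stop`. One possible alternative to skipping, shown here purely as a sketch and not something this commit does, would be to coerce `stop` into a list before the request reaches the provider. The function name `normalize_stop` is made up for illustration, and where such a step would live in the adapter is left open.

```python
from typing import Optional, Sequence, Union


def normalize_stop(stop: Union[str, Sequence[str], None]) -> Optional[list[str]]:
    """Coerce an OpenAI-style `stop` value into a list of strings."""
    if stop is None:
        return None
    if isinstance(stop, str):
        # A bare string is a single stop sequence, not a sequence of characters.
        return [stop]
    return list(stop)


print(normalize_stop("1963"))
print(normalize_stop(["a", "b"]))
```

The `isinstance(stop, str)` check has to come before `list(stop)`, since calling `list` on a string would split it into characters.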
