refactor: move OpenAI compat utilities from nvidia to openai_compat (#1258)

forked from phoenix-oss/llama-stack-mirror

# What does this PR do?

This PR:
- refactors code which converts between Llama Stack <> OpenAI compat
servers which was used by the nvidia implementation to be used more
broadly. Next PRs in the stack will show usage.
- adds incremental tool call parsing (when tool calls are streamed
incrementally, not just whole-sale)

## Test Plan

Run 

```bash
pytest -s -v -k nvidia llama_stack/providers/tests/inference/ --env NVIDIA_API_KEY=....
```

Text model tests pass (albeit without completions tests)
```
test_text_inference.py::TestInference::test_model_list[-nvidia] PASSED
test_text_inference.py::TestInference::test_text_completion_non_streaming[-nvidia-inference:completion:non_streaming] FAILED
test_text_inference.py::TestInference::test_text_completion_streaming[-nvidia-inference:completion:streaming] FAILED
test_text_inference.py::TestInference::test_text_completion_logprobs_non_streaming[-nvidia-inference:completion:logprobs_non_streaming] FAILED
test_text_inference.py::TestInference::test_text_completion_logprobs_streaming[-nvidia-inference:completion:logprobs_streaming] FAILED
test_text_inference.py::TestInference::test_text_completion_structured_output[-nvidia-inference:completion:structured_output] FAILED
test_text_inference.py::TestInference::test_text_chat_completion_non_streaming[-nvidia-inference:chat_completion:sample_messages] PASSED
test_text_inference.py::TestInference::test_text_chat_completion_structured_output[-nvidia-inference:chat_completion:structured_output] PASSED
test_text_inference.py::TestInference::test_text_chat_completion_streaming[-nvidia-inference:chat_completion:sample_messages] PASSED
test_text_inference.py::TestInference::test_text_chat_completion_with_tool_calling[-nvidia-inference:chat_completion:sample_messages_tool_calling] PASSED
test_text_inference.py::TestInference::test_text_chat_completion_with_tool_calling_streaming[-nvidia-inference:chat_completion:sample_messages_tool_calling] PASSED
```

Vision model tests don't:
```
FAILED test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_non_streaming[-nvidia-image0-expected_strings0] - openai.BadRequestError: Error code: 400 - {'type': 'about:blank', 'status': 400, 'title': 'Bad Request', 'detail': 'Inference error'}
FAILED test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_non_streaming[-nvidia-image1-expected_strings1] - openai.BadRequestError: Error code: 400 - {'type': 'about:blank', 'status': 400, 'title': 'Bad Request', 'detail': 'Inference error'}
FAILED test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_streaming[-nvidia] - openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': "[{'type': 'string_type', 'loc': ('body', 'messages', 1, 'content'), 'msg': 'Input should be a valid string', 'input': [{'image_url': {'url': 'https://raw.githubusercontent.com/meta-llama/llam...
```

This commit is contained in:

Ashwin Bharambe

2025-02-25 22:02:11 -08:00

• committed by

GitHub

parent 82799a55bb

commit b0310af177

No known key found for this signature in database

GPG key ID: B5690EEEBB952194

5 changed files with 538 additions and 459 deletions

									
										3

llama_stack/providers/tests/inference/conftest.py
									
										View file
										
				@ -46,9 +46,10 @@ def pytest_generate_tests(metafunc):

				            if ("Vision" in cls_name and "Vision" in model) or ("Vision" not in cls_name and "Vision" not in model):

				                params.append(pytest.param(model, id=model))

				        print(f"params: {params}")

				        if not params:

				            model = metafunc.config.getoption("--inference-model")

				            params = [pytest.param(model, id="")]

				            params = [pytest.param(model, id=model)]

				        metafunc.parametrize(

				            "inference_model",

Rows
Columns

refactor: move OpenAI compat utilities from nvidia to openai_compat (#1258)

3 llama_stack/providers/tests/inference/conftest.py Unescape Escape View file

3

llama_stack/providers/tests/inference/conftest.py

View file