llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-05 20:27:35 +00:00

Ashwin Bharambe b0310af177 refactor: move OpenAI compat utilities from nvidia to openai_compat (#1258)	2025-02-25 22:02:11 -08:00

# What does this PR do?

This PR:
- refactors code which converts between Llama Stack <> OpenAI compat servers which was used by the nvidia implementation to be used more broadly. Next PRs in the stack will show usage.
- adds incremental tool call parsing (when tool calls are streamed incrementally, not just whole-sale)

## Test Plan

Run
```bash
pytest -s -v -k nvidia llama_stack/providers/tests/inference/ --env NVIDIA_API_KEY=....
```

Text model tests pass (albeit without completions tests)
```
test_text_inference.py::TestInference::test_model_list[-nvidia] PASSED
test_text_inference.py::TestInference::test_text_completion_non_streaming[-nvidia-inference:completion:non_streaming] FAILED
test_text_inference.py::TestInference::test_text_completion_streaming[-nvidia-inference:completion:streaming] FAILED
test_text_inference.py::TestInference::test_text_completion_logprobs_non_streaming[-nvidia-inference:completion:logprobs_non_streaming] FAILED
test_text_inference.py::TestInference::test_text_completion_logprobs_streaming[-nvidia-inference:completion:logprobs_streaming] FAILED
test_text_inference.py::TestInference::test_text_completion_structured_output[-nvidia-inference:completion:structured_output] FAILED
test_text_inference.py::TestInference::test_text_chat_completion_non_streaming[-nvidia-inference:chat_completion:sample_messages] PASSED
test_text_inference.py::TestInference::test_text_chat_completion_structured_output[-nvidia-inference:chat_completion:structured_output] PASSED
test_text_inference.py::TestInference::test_text_chat_completion_streaming[-nvidia-inference:chat_completion:sample_messages] PASSED
test_text_inference.py::TestInference::test_text_chat_completion_with_tool_calling[-nvidia-inference:chat_completion:sample_messages_tool_calling] PASSED
test_text_inference.py::TestInference::test_text_chat_completion_with_tool_calling_streaming[-nvidia-inference:chat_completion:sample_messages_tool_calling] PASSED
```

Vision model tests don't:
```
FAILED test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_non_streaming[-nvidia-image0-expected_strings0] - openai.BadRequestError: Error code: 400 - {'type': 'about:blank', 'status': 400, 'title': 'Bad Request', 'detail': 'Inference error'}
FAILED test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_non_streaming[-nvidia-image1-expected_strings1] - openai.BadRequestError: Error code: 400 - {'type': 'about:blank', 'status': 400, 'title': 'Bad Request', 'detail': 'Inference error'}
FAILED test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_streaming[-nvidia] - openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': "[{'type': 'string_type', 'loc': ('body', 'messages', 1, 'content'), 'msg': 'Input should be a valid string', 'input': [{'image_url': {'url': 'https://raw.githubusercontent.com/meta-llama/llam...
```

groq	chore: move all Llama Stack types from llama-models to llama-stack (#1098)	2025-02-14 09:10:59 -08:00
__init__.py	Remove "routing_table" and "routing_key" concepts for the user (#201)	2024-10-10 10:24:13 -07:00
conftest.py	refactor: move OpenAI compat utilities from nvidia to openai_compat (#1258)	2025-02-25 22:02:11 -08:00
fixtures.py	refactor: move OpenAI compat utilities from nvidia to openai_compat (#1258)	2025-02-25 22:02:11 -08:00
pasta.jpeg	Enable vision models for (Together, Fireworks, Meta-Reference, Ollama) (#376)	2024-11-05 16:22:33 -08:00
test_embeddings.py	Fix precommit check after moving to ruff (#927)	2025-02-02 06:46:45 -08:00
test_model_registration.py	test: skip model registration for unsupported providers (#1030)	2025-02-19 22:39:13 -08:00
test_prompt_adapter.py	fix: pass tool_prompt_format to chat_formatter (#1198)	2025-02-20 21:38:35 -08:00
test_text_inference.py	feat: completing text /chat-completion and /completion tests (#1223)	2025-02-25 11:37:04 -08:00
test_vision_inference.py	build: configure ruff from pyproject.toml (#1100)	2025-02-14 09:01:57 -08:00
utils.py	Fix precommit check after moving to ruff (#927)	2025-02-02 06:46:45 -08:00