llama-stack-mirror/llama_stack/providers/remote/inference/nvidia
Ben Browning a4b573d750 Fix OpenAI API response format handling
This fixes the schema of OpenAI API chat completion response formats,
including how those response formats (and other nested parameters in
the chat completion request) get translated into parameters for calls
to the backend OpenAI-compatible providers.

Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-04-13 13:39:56 -04:00
__init__.py add NVIDIA NIM inference adapter (#355) 2024-11-23 15:59:00 -08:00
config.py chore: move all Llama Stack types from llama-models to llama-stack (#1098) 2025-02-14 09:10:59 -08:00
models.py refactor: move all llama code to models/llama out of meta reference (#1887) 2025-04-07 15:03:58 -07:00
nvidia.py Fix OpenAI API response format handling 2025-04-13 13:39:56 -04:00
openai_utils.py refactor: move all llama code to models/llama out of meta reference (#1887) 2025-04-07 15:03:58 -07:00
utils.py style: remove prints in codebase (#1146) 2025-02-18 19:41:37 -08:00