llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-31 07:00:02 +00:00

Ben Browning a5827f7cb3 Nvidia provider support for OpenAI API endpoints	2025-04-10 13:43:28 -04:00

This wires up the openai_completion and openai_chat_completion API methods for the remote Nvidia inference provider, and adds it to the chat completions part of the OpenAI test suite.

The hosted Nvidia service doesn't actually host any Llama models with functioning completions and chat completions endpoints, so for now the test suite only activates the nvidia provider for chat completions.

Signed-off-by: Ben Browning <bbrownin@redhat.com>

agents	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
datasetio	refactor: extract pagination logic into shared helper function (#1770)	2025-03-31 13:08:29 -07:00
inference	Nvidia provider support for OpenAI API endpoints	2025-04-10 13:43:28 -04:00
post_training	refactor: move all llama code to models/llama out of meta reference (#1887)	2025-04-07 15:03:58 -07:00
safety	feat: added nvidia as safety provider (#1248)	2025-03-17 14:39:23 -07:00
tool_runtime	fix(api): don't return list for runtime tools (#1686)	2025-04-01 09:53:11 +02:00
vector_io	chore: Updating Milvus Client calls to be non-blocking (#1830)	2025-03-28 22:14:07 -04:00
__init__.py	`impls` -> `inline`, `adapters` -> `remote` (#381)	2024-11-06 14:54:05 -08:00