llama-stack-mirror/llama_stack/providers/remote
Ben Browning 8a1c0a1008 Improve groq OpenAI API compatibility
This doesn't get Groq to 100% on the OpenAI API verification tests,
but it does get it to 88.2% when Llama Stack is in the middle,
compared to the 61.8% results for using an OpenAI client against Groq
directly.

The groq provider doesn't use litellm under the covers in its
openai_chat_completion endpoint, and instead directly uses an
AsyncOpenAI client with some special handling to improve conformance
of responses for response_format usage and tool calling.

Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-04-13 13:41:52 -04:00
..
agents test: add unit test to ensure all config types are instantiable (#1601) 2025-03-12 22:29:58 -07:00
datasetio refactor: extract pagination logic into shared helper function (#1770) 2025-03-31 13:08:29 -07:00
inference Improve groq OpenAI API compatibility 2025-04-13 13:41:52 -04:00
post_training fix: remove extra sft args in NvidiaPostTrainingAdapter (#1939) 2025-04-11 10:17:57 -07:00
safety feat: Add unit tests for NVIDIA safety (#1897) 2025-04-11 11:49:55 -07:00
tool_runtime fix(api): don't return list for runtime tools (#1686) 2025-04-01 09:53:11 +02:00
vector_io chore: Updating Milvus Client calls to be non-blocking (#1830) 2025-03-28 22:14:07 -04:00
__init__.py impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00