This doesn't get Groq to 100% on the OpenAI API verification tests, but it does get it to 88.2% when Llama Stack is in the middle, compared to 61.8% when using an OpenAI client against Groq directly. The groq provider doesn't use litellm under the covers in its openai_chat_completion endpoint; instead, it directly uses an AsyncOpenAI client with some special handling to improve conformance of responses for response_format usage and tool calling.

Signed-off-by: Ben Browning <bbrownin@redhat.com>
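For context, here is a minimal sketch of the pattern the commit describes: pointing an AsyncOpenAI client directly at Groq's OpenAI-compatible endpoint rather than going through litellm, with a hook where conformance fixes would be applied. The base URL, environment variable, and the _fix_conformance helper are illustrative assumptions, not the actual llama-stack code.

```python
# Sketch of the pattern described above: an AsyncOpenAI client aimed
# directly at Groq's OpenAI-compatible endpoint, with a post-processing
# hook where response_format and tool-calling fixes would live.
# GROQ_BASE_URL, GROQ_API_KEY, and _fix_conformance are assumptions for
# illustration, not the actual llama-stack implementation.
import os

from openai import AsyncOpenAI

GROQ_BASE_URL = "https://api.groq.com/openai/v1"  # Groq's OpenAI-compatible API

client = AsyncOpenAI(
    base_url=GROQ_BASE_URL,
    api_key=os.environ["GROQ_API_KEY"],
)


def _fix_conformance(response):
    # Placeholder for the kind of special handling the commit mentions,
    # e.g. normalizing tool_calls or response_format behavior.
    return response


async def openai_chat_completion(**params):
    # Forward the request straight to Groq, then post-process the reply.
    response = await client.chat.completions.create(**params)
    return _fix_conformance(response)
```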
YAML · 9 lines · 217 B
base_url: http://localhost:8321/v1/openai/v1
api_key_var: OPENAI_API_KEY
models:
- openai/gpt-4o
- openai/gpt-4o-mini
model_display_names:
  openai/gpt-4o: gpt-4o
  openai/gpt-4o-mini: gpt-4o-mini
test_exclusions: {}
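This config points a verification run at Llama Stack's OpenAI-compatible endpoint. Below is a minimal sketch of how such a file could be consumed, assuming a PyYAML-based loader; the filename and the harness loop are hypothetical, while the field names (base_url, api_key_var, models, model_display_names) come from the YAML itself.

```python
# Hypothetical loader for a verification config like the one above.
# Only the field names are taken from the YAML; the filename and the
# iteration logic are illustrative assumptions.
import os

import yaml
from openai import OpenAI

with open("openai_api_verification.yaml") as f:  # hypothetical filename
    cfg = yaml.safe_load(f)

# Build a client against the configured endpoint, resolving the API key
# from the environment variable the config names.
client = OpenAI(
    base_url=cfg["base_url"],
    api_key=os.environ[cfg["api_key_var"]],
)

for model in cfg["models"]:
    display_name = cfg.get("model_display_names", {}).get(model, model)
    print(f"running verification tests for {model} (reported as {display_name})")
```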