mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-30 16:53:12 +00:00
This doesn't get Groq to 100% on the OpenAI API verification tests, but it does get it to 88.2% when Llama Stack is in the middle, compared to 61.8% when using an OpenAI client against Groq directly. The groq provider doesn't use litellm under the covers in its openai_chat_completion endpoint; instead it uses an AsyncOpenAI client directly, with some special handling to improve the conformance of responses for response_format usage and tool calling.

Signed-off-by: Ben Browning <bbrownin@redhat.com>
14 lines
515 B
YAML
base_url: https://api.groq.com/openai/v1
api_key_var: GROQ_API_KEY
models:
- llama-3.3-70b-versatile
- meta-llama/llama-4-scout-17b-16e-instruct
- meta-llama/llama-4-maverick-17b-128e-instruct
model_display_names:
  llama-3.3-70b-versatile: Llama-3.3-70B-Instruct
  meta-llama/llama-4-scout-17b-16e-instruct: Llama-4-Scout-Instruct
  meta-llama/llama-4-maverick-17b-128e-instruct: Llama-4-Maverick-Instruct
test_exclusions:
  llama-3.3-70b-versatile:
  - test_chat_non_streaming_image
  - test_chat_streaming_image
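
A minimal sketch of how a test harness might consume this config, assuming `test_exclusions` maps a model ID to test names that should be skipped for that model and `model_display_names` is used only for reporting. The helper names (`is_excluded`, `display_name`, `planned_runs`) and the test name `test_chat_basic` are hypothetical and are not taken from the Llama Stack repository; the config is inlined as a dict so the example has no dependencies.

```python
# Hypothetical in-memory copy of the YAML config; a real harness would
# presumably parse the file rather than hard-code these values.
CONFIG = {
    "base_url": "https://api.groq.com/openai/v1",
    "api_key_var": "GROQ_API_KEY",
    "models": [
        "llama-3.3-70b-versatile",
        "meta-llama/llama-4-scout-17b-16e-instruct",
        "meta-llama/llama-4-maverick-17b-128e-instruct",
    ],
    "model_display_names": {
        "llama-3.3-70b-versatile": "Llama-3.3-70B-Instruct",
        "meta-llama/llama-4-scout-17b-16e-instruct": "Llama-4-Scout-Instruct",
        "meta-llama/llama-4-maverick-17b-128e-instruct": "Llama-4-Maverick-Instruct",
    },
    "test_exclusions": {
        "llama-3.3-70b-versatile": [
            "test_chat_non_streaming_image",
            "test_chat_streaming_image",
        ],
    },
}


def is_excluded(config, model, test_name):
    """Return True if test_name is listed under test_exclusions for model."""
    return test_name in config.get("test_exclusions", {}).get(model, [])


def display_name(config, model):
    """Return the display name for model, falling back to the raw model ID."""
    return config.get("model_display_names", {}).get(model, model)


def planned_runs(config, test_names):
    """List the (model, test) pairs that remain after applying exclusions."""
    return [
        (model, test)
        for model in config["models"]
        for test in test_names
        if not is_excluded(config, model, test)
    ]


if __name__ == "__main__":
    # "test_chat_basic" is a made-up test name used only for illustration.
    for model, test in planned_runs(CONFIG, ["test_chat_streaming_image", "test_chat_basic"]):
        print(f"{display_name(CONFIG, model)}: {test}")
```

Under these assumptions, the two image tests are skipped only for `llama-3.3-70b-versatile`; the two Llama 4 models have no exclusions, so they would run every test.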