llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-08 04:54:38 +00:00

Ben Browning 825ce39879 fix: Together provider shutdown and default to non-streaming (#2001)	2025-04-22 17:47:53 +02:00

# What does this PR do?

The together inference provider was throwing a stack trace every time it shut down, as it was trying to call a non-existent `close` method on the AsyncTogether client. While fixing that, I also adjusted its shutdown logic to close the OpenAI client if we've created one of those, as that client does have a `close` method.

In testing that, I also realized we were defaulting to treating all requests as streaming requests instead of defaulting to non-streaming. So, this flips that default to non-streaming to match how the other providers work.

## Test Plan

I tested this by ensuring the together inference provider no longer spits out a long stack trace when shutting it down and by running the OpenAI API chat completion verification suite to ensure the change in default streaming logic didn't mess anything else up.

Signed-off-by: Ben Browning <bbrownin@redhat.com>
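
The two behaviors described in the commit message above (closing only a client that actually has a `close` method, and treating a missing `stream` flag as non-streaming) can be illustrated with a minimal sketch. This is not the code from the PR: `FakeTogetherClient`, `FakeOpenAIClient`, `SketchAdapter`, and all of their methods are hypothetical stand-ins invented for this example.

```python
import asyncio
from typing import Optional


class FakeTogetherClient:
    """Hypothetical stand-in for a client that has no close() method."""


class FakeOpenAIClient:
    """Hypothetical stand-in for a client that exposes an async close()."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class SketchAdapter:
    """Illustrative adapter; names and structure are assumptions."""

    def __init__(self) -> None:
        self._together_client: Optional[FakeTogetherClient] = None
        self._openai_client: Optional[FakeOpenAIClient] = None

    def get_openai_client(self) -> FakeOpenAIClient:
        # Create the OpenAI-style client lazily, only when it is needed.
        if self._openai_client is None:
            self._openai_client = FakeOpenAIClient()
        return self._openai_client

    async def shutdown(self) -> None:
        # The Together-style client has no close(), so only drop the reference.
        self._together_client = None
        # Close the OpenAI-style client only if one was ever created.
        if self._openai_client is not None:
            await self._openai_client.close()
            self._openai_client = None

    def is_streaming(self, stream: Optional[bool] = None) -> bool:
        # An omitted (None) flag is treated as non-streaming.
        return bool(stream)
```

In this sketch, calling `shutdown()` on an adapter that never created an OpenAI-style client is a no-op, which is the case that would otherwise raise if `close` were called unconditionally.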
anthropic	feat(providers): Groq now uses LiteLLM openai-compat (#1303)	2025-02-27 13:16:50 -08:00
bedrock	fix: 100% OpenAI API verification for together and fireworks (#1946)	2025-04-14 08:56:29 -07:00
cerebras	fix: 100% OpenAI API verification for together and fireworks (#1946)	2025-04-14 08:56:29 -07:00
cerebras_openai_compat	test: verification on provider's OAI endpoints (#1893)	2025-04-07 23:06:28 -07:00
databricks	fix: 100% OpenAI API verification for together and fireworks (#1946)	2025-04-14 08:56:29 -07:00
fireworks	fix: OpenAI Completions API and Fireworks (#1997)	2025-04-21 11:49:12 -07:00
fireworks_openai_compat	test: verification on provider's OAI endpoints (#1893)	2025-04-07 23:06:28 -07:00
gemini	feat(providers): Groq now uses LiteLLM openai-compat (#1303)	2025-02-27 13:16:50 -08:00
groq	fix: 100% OpenAI API verification for together and fireworks (#1946)	2025-04-14 08:56:29 -07:00
groq_openai_compat	test: verification on provider's OAI endpoints (#1893)	2025-04-07 23:06:28 -07:00
nvidia	feat: update nvidia inference provider to use model_store (#1988)	2025-04-18 10:16:43 +02:00
ollama	feat: allow ollama to use 'latest' if available but not specified (#1903)	2025-04-14 09:03:54 -07:00
openai	feat(providers): Groq now uses LiteLLM openai-compat (#1303)	2025-02-27 13:16:50 -08:00
passthrough	fix: 100% OpenAI API verification for together and fireworks (#1946)	2025-04-14 08:56:29 -07:00
runpod	fix: 100% OpenAI API verification for together and fireworks (#1946)	2025-04-14 08:56:29 -07:00
sambanova	fix: 100% OpenAI API verification for together and fireworks (#1946)	2025-04-14 08:56:29 -07:00
sambanova_openai_compat	test: verification on provider's OAI endpoints (#1893)	2025-04-07 23:06:28 -07:00
tgi	fix: 100% OpenAI API verification for together and fireworks (#1946)	2025-04-14 08:56:29 -07:00
together	fix: Together provider shutdown and default to non-streaming (#2001)	2025-04-22 17:47:53 +02:00
together_openai_compat	test: verification on provider's OAI endpoints (#1893)	2025-04-07 23:06:28 -07:00
vllm	fix: Do not send an empty 'tools' list to remote vllm (#1957)	2025-04-15 20:31:12 -04:00
__init__.py	`impls` -> `inline`, `adapters` -> `remote` (#381)	2024-11-06 14:54:05 -08:00