llama-stack-mirror/llama_stack/providers/remote/inference
2025-09-23 16:21:31 -04:00
Name | Last commit message | Last commit date
anthropic | chore: remove duplicate AnthropicProviderDataValidator (#3512) | 2025-09-20 16:09:27 -07:00
azure | feat: add Azure OpenAI inference provider support (#3396) | 2025-09-11 13:48:38 +02:00
bedrock | fix: AWS Bedrock inference profile ID conversion for region-specific endpoints (#3386) | 2025-09-11 11:41:53 +02:00
cerebras | feat: update Cerebras inference provider to support dynamic model listing | 2025-09-18 06:54:05 -04:00
databricks | feat: (re-)enable Databricks inference adapter (#3500) | 2025-09-23 15:37:23 -04:00
fireworks | chore: Refactor fireworks to use OpenAIMixin (#3480) | 2025-09-22 13:19:36 -04:00
gemini | chore: remove duplicate OpenAI and Gemini data validators (#3513) | 2025-09-22 13:53:17 +02:00
groq | chore: update the groq inference impl to use openai-python for openai-compat functions (#3348) | 2025-09-06 15:36:27 -07:00
llama_openai_compat | chore: indicate to mypy that InferenceProvider.rerank is concrete (#3238) | 2025-08-22 12:02:13 -07:00
nvidia | docs: add VLM NIM example (#3277) | 2025-08-29 16:23:52 -07:00
ollama | chore: update the ollama inference impl to use OpenAIMixin for openai-compat functions (#3395) | 2025-09-18 13:09:57 +02:00
openai | chore: remove duplicate OpenAI and Gemini data validators (#3513) | 2025-09-22 13:53:17 +02:00
passthrough | chore(rename): move llama_stack.distribution to llama_stack.core (#2975) | 2025-07-30 23:30:53 -07:00
runpod | ci: test safety with starter (#2628) | 2025-07-09 16:53:50 +02:00
sambanova | chore: update the sambanova inference impl to use openai-python for openai-compat functions (#3345) | 2025-09-06 12:25:13 -07:00
tgi | feat: add dynamic model registration support to TGI inference (#3417) | 2025-09-15 15:52:40 -04:00
together | feat: add embedding and dynamic model support to Together inference adapter (#3458) | 2025-09-16 11:53:41 -07:00
vertexai | ci: Re-enable pre-commit to fail (#3399) | 2025-09-10 10:00:46 -04:00
vllm | fix(dev): fix vllm inference recording (await models.list) (#3524) | 2025-09-23 12:56:33 -04:00
watsonx | chore: various watsonx fixes (#3428) | 2025-09-16 13:55:10 +02:00
__init__.py | impls -> inline, adapters -> remote (#381) | 2024-11-06 14:54:05 -08:00