Inference Providers
This section documents all providers available for the inference API.
- inline::meta-reference
- inline::sentence-transformers
- remote::anthropic
- remote::bedrock
- remote::cerebras
- remote::databricks
- remote::fireworks
- remote::gemini
- remote::groq
- remote::hf::endpoint
- remote::hf::serverless
- remote::llama-openai-compat
- remote::nvidia
- remote::ollama
- remote::openai
- remote::passthrough
- remote::runpod
- remote::sambanova
- remote::tgi
- remote::together
- remote::vertexai
- remote::vllm
- remote::watsonx
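
Each entry above is a `provider_type` that can be wired into a distribution's run configuration. As a rough sketch (the exact config keys differ per provider, and the `url` value below is an illustrative assumption, not a required default), a `remote::ollama` provider might be registered like this:

```yaml
# Hypothetical run.yaml fragment: enabling the inference API
# with a single remote provider. Keys under `config` vary per
# provider; consult each provider's page for its actual schema.
apis:
- inference
providers:
  inference:
  - provider_id: ollama          # local name for this provider instance
    provider_type: remote::ollama  # one of the types listed above
    config:
      url: http://localhost:11434  # example endpoint; adjust for your setup
```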