Inference Providers
This section documents all providers available for the inference API.
- inline::meta-reference
- inline::sentence-transformers
- remote::anthropic
- remote::bedrock
- remote::cerebras
- remote::databricks
- remote::fireworks
- remote::gemini
- remote::groq
- remote::hf::endpoint
- remote::hf::serverless
- remote::llama-openai-compat
- remote::nvidia
- remote::ollama
- remote::openai
- remote::passthrough
- remote::runpod
- remote::sambanova
- remote::tgi
- remote::together
- remote::vertexai
- remote::vllm
- remote::watsonx
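
Each entry above is a `provider_type` that can be wired into a distribution's run configuration. As a rough sketch (the exact config keys differ per provider, and the `url` value below is an illustrative assumption, not a required default), a `remote::ollama` provider might be registered like this:

```yaml
# Hypothetical run.yaml fragment: enabling the inference API
# with a single remote provider. Keys under `config` vary per
# provider; consult each provider's page for its actual schema.
apis:
- inference
providers:
  inference:
  - provider_id: ollama          # local name for this provider instance
    provider_type: remote::ollama  # one of the types listed above
    config:
      url: http://localhost:11434  # example endpoint; adjust for your setup
```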