remote::vertexai
Description
The Google Vertex AI inference provider enables you to use Google's Gemini models through Google Cloud's Vertex AI platform, which provides several advantages:
- Enterprise-grade security: uses Google Cloud's security controls and IAM
- Better integration: seamless integration with other Google Cloud services
- Advanced features: access to additional Vertex AI features such as model tuning and monitoring
- Authentication: uses Google Cloud Application Default Credentials (ADC) instead of API keys
Configuration:
- Set the VERTEX_AI_PROJECT environment variable (required)
- Set the VERTEX_AI_LOCATION environment variable (optional; defaults to us-central1)
- Use Google Cloud Application Default Credentials or a service account key
Authentication Setup:
- Option 1 (recommended): run `gcloud auth application-default login`
- Option 2: set GOOGLE_APPLICATION_CREDENTIALS to the path of a service account key
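
To confirm that ADC is discoverable before starting the stack, here is a minimal sketch using the `google-auth` Python package (an assumption: it is Google's standard Python library for ADC, not something this provider requires you to call directly):

```python
# Minimal ADC sanity check. Assumes the google-auth package is installed
# (pip install google-auth); it is the standard Python entry point for
# Application Default Credentials, not a llama-stack requirement.
import google.auth
from google.auth.exceptions import DefaultCredentialsError

try:
    # google.auth.default() walks the ADC search order: the
    # GOOGLE_APPLICATION_CREDENTIALS key file, then gcloud user
    # credentials, then the metadata server when running on GCP.
    credentials, project_id = google.auth.default()
    print(f"ADC found; default project: {project_id}")
except DefaultCredentialsError:
    print("No ADC found; run `gcloud auth application-default login` "
          "or set GOOGLE_APPLICATION_CREDENTIALS.")
```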
Available Models:
- vertex_ai/gemini-2.0-flash
- vertex_ai/gemini-2.5-flash
- vertex_ai/gemini-2.5-pro
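
Once the stack is running with this provider enabled, these models are invoked like any other llama-stack inference model. Below is a hedged sketch using the `llama-stack-client` Python SDK; the base URL, port, and response shape are assumptions that depend on your deployment:

```python
# Sketch of calling a Vertex AI Gemini model through a running llama-stack
# server. Assumes llama-stack-client is installed and a server is listening
# on localhost:8321 (adjust the base URL for your deployment).
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

response = client.inference.chat_completion(
    model_id="vertex_ai/gemini-2.0-flash",
    messages=[{"role": "user", "content": "Say hello from Vertex AI."}],
)
print(response.completion_message.content)
```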
Configuration
Field | Type | Required | Default | Description
---|---|---|---|---
project | str | Yes | (none) | Google Cloud project ID for Vertex AI
location | str | No | us-central1 | Google Cloud location for Vertex AI
Sample Configuration
```yaml
project: ${env.VERTEX_AI_PROJECT}
location: ${env.VERTEX_AI_LOCATION:=us-central1}
```