Commit graph

4 commits

Author SHA1 Message Date
Eran Cohen
1f421238b8 feat: Add Google Vertex AI inference provider support
- Add new Vertex AI remote inference provider with litellm integration
- Support for Gemini models through Google Cloud Vertex AI platform
- Uses Google Cloud Application Default Credentials (ADC) for authentication
- Added VertexAI models: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash.
- Updated provider registry to include vertexai provider
- Updated starter template to support Vertex AI configuration
- Added comprehensive documentation and sample configuration

Signed-off-by: Eran Cohen <eranco@redhat.com>
2025-07-24 09:49:23 +03:00
ehhuang
8e1a2b4703
chore: remove *_openai_compat providers (#2849)
# What does this PR do?
These are no longer needed as llama-stack-evals can run against OAI
endpoints directly.

## Test Plan
2025-07-22 10:25:36 -07:00
Ashwin Bharambe
ade075152e
chore: kill inline::vllm (#2824)
Inline _inference_ providers haven't proved to be very useful -- they
are rarely used. And for good reason -- it is almost never a good idea
to include a complex (distributed) inference engine bundled into a
distributed stateful front-end server serving many other things.
Responsibility should be split properly.

See Discord discussion:
1395849853
2025-07-18 15:52:18 -07:00
Sébastien Han
c9a49a80e8
docs: auto generated documentation for providers (#2543)
# What does this PR do?

Simple approach to get some provider pages in the docs.

Add or update description fields in the provider configuration class
using Pydantic’s Field, ensuring these descriptions are clear and
complete, as they will be used to auto-generate provider documentation
via ./scripts/distro_codegen.py instead of editing the docs manually.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-06-30 15:13:20 +02:00