llama-stack-mirror/llama_stack/providers/remote/inference
mergify[bot] 2d5ed5d0f5
fix: update hard-coded google model names (backport #4212) (#4229)
# What does this PR do?
When we send model names to Google's OpenAI-compatible API, we must use
the "google" name prefix. Google does not recognize the "vertexai" model
names.

Closes #4211
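
As a rough illustration of the naming rule above, the following sketch assumes the stack-facing identifier has the form `<provider_id>/<provider model name>`, as the chat completion request in the test plan suggests. The helper names are hypothetical and are not taken from the llama-stack source.

```python
# Hypothetical helpers for illustration; not the llama-stack implementation.
PROVIDER_ID = "vertexai"  # provider_id as it appears in the test plan


def to_provider_model_name(stack_model_id: str, provider_id: str = PROVIDER_ID) -> str:
    """Strip the leading '<provider_id>/' so the remainder keeps its 'google/' prefix."""
    prefix = f"{provider_id}/"
    if not stack_model_id.startswith(prefix):
        raise ValueError(f"{stack_model_id!r} is not namespaced under {provider_id!r}")
    return stack_model_id[len(prefix):]


def has_google_prefix(provider_model_name: str) -> bool:
    """Check that the name uses the 'google/' prefix this PR says Google expects."""
    return provider_model_name.startswith("google/")


provider_name = to_provider_model_name("vertexai/google/gemini-2.5-flash")
```

Under these assumptions, `provider_name` is the string that would be sent to Google, and `has_google_prefix(provider_name)` holds.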

## Test Plan
```bash
uv venv --python python312
. .venv/bin/activate
llama stack list-deps starter | xargs -L1 uv pip install
llama stack run starter
```

Verify that this lists the Gemini models with their correct names:
```bash
curl http://127.0.0.1:8321/v1/models | jq '.data | map(select(.custom_metadata.provider_id == "vertexai"))'
```
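
For readers without `jq`, the same filter can be approximated in Python. This is a minimal sketch that assumes the response has the shape the `jq` expression implies (a top-level `data` list whose items carry `custom_metadata.provider_id`). The sample payload is hypothetical and is not real server output.

```python
from typing import Any


def models_for_provider(response: dict[str, Any], provider_id: str) -> list[dict[str, Any]]:
    """Approximate: .data | map(select(.custom_metadata.provider_id == provider_id))."""
    return [
        model
        for model in response.get("data", [])
        # Treat a missing or null custom_metadata as "no provider_id".
        if (model.get("custom_metadata") or {}).get("provider_id") == provider_id
    ]


# Hypothetical sample payload, for illustration only.
sample = {
    "data": [
        {"id": "vertexai/google/gemini-2.5-flash", "custom_metadata": {"provider_id": "vertexai"}},
        {"id": "example/other-model", "custom_metadata": {"provider_id": "example"}},
    ]
}
vertexai_ids = [model["id"] for model in models_for_provider(sample, "vertexai")]
```

Unlike `jq`, this sketch returns an empty list when `data` is absent instead of raising an error.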

Test that this chat completion works:
```bash
curl -X POST \
  -H "Content-Type: application/json" \
  "http://127.0.0.1:8321/v1/chat/completions" \
  -d '{
        "model": "vertexai/google/gemini-2.5-flash",
        "messages": [
          {
            "role": "system",
            "content": "You are a helpful assistant."
          },
          {
            "role": "user",
            "content": "Hello! Can you tell me a joke?"
          }
        ],
        "temperature": 1.0,
        "max_tokens": 256
      }'
```

This is an automatic backport of pull request #4212 done by
[Mergify](https://mergify.com).
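
The chat completion request from the test plan can also be built with Python's standard library. This is a sketch only: it reuses the URL and payload from the `curl` command, the function name `build_chat_request` is hypothetical, and the request is constructed but not sent, so it does not need a running server.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8321"  # address used by the curl commands in the test plan


def build_chat_request(model: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 1.0,
        "max_tokens": 256,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("vertexai/google/gemini-2.5-flash", "Hello! Can you tell me a joke?")
# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```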

Signed-off-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Ken Dreyer <kdreyer@redhat.com>
2025-11-24 11:32:14 -08:00

| Name | Last commit | Date |
| --- | --- | --- |
| anthropic | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| azure | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| bedrock | feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) | 2025-10-12 19:01:52 -07:00 |
| cerebras | feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) | 2025-10-12 19:01:52 -07:00 |
| databricks | feat(api)!: BREAKING CHANGE: support passing extra_body through to providers (#3777) | 2025-10-10 16:21:44 -07:00 |
| fireworks | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| gemini | feat(gemini): Support gemini-embedding-001 and fix models/ prefix in metadata keys (#3813) | 2025-10-15 12:22:10 -04:00 |
| groq | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| llama_openai_compat | feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) | 2025-10-12 19:01:52 -07:00 |
| nvidia | chore: remove build.py (#3869) | 2025-10-20 16:28:15 -07:00 |
| ollama | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| openai | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| passthrough | feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) | 2025-10-12 19:01:52 -07:00 |
| runpod | feat(api)!: BREAKING CHANGE: support passing extra_body through to providers (#3777) | 2025-10-10 16:21:44 -07:00 |
| sambanova | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| tgi | feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) | 2025-10-12 19:01:52 -07:00 |
| together | feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) | 2025-10-12 19:01:52 -07:00 |
| vertexai | fix: update hard-coded google model names (backport #4212) (#4229) | 2025-11-24 11:32:14 -08:00 |
| vllm | feat(api)!: BREAKING CHANGE: support passing extra_body through to providers (#3777) | 2025-10-10 16:21:44 -07:00 |
| watsonx | fix: Fixed WatsonX remote inference provider (#3801) | 2025-10-14 14:52:32 +02:00 |
| \_\_init\_\_.py | impls -> inline, adapters -> remote (#381) | 2024-11-06 14:54:05 -08:00 |