llama-stack-mirror/llama_stack/providers/remote/inference
Ken Dreyer 085126e530 fix: update hard-coded google model names (#4212)
When we send model names to Google's OpenAI-compatible API, we must use
the "google" name prefix. Google does not recognize the "vertexai" model
names.

Closes #4211
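
As a rough illustration of the naming described above: the curl examples in this message address the model as `vertexai/google/gemini-2.5-flash`. The sketch below assumes (this is inferred from those examples, not confirmed by the provider code) that the first path segment is the llama-stack provider id and the remainder is the name sent on to Google. `split_model_id` is a hypothetical helper, not part of llama-stack.

```python
def split_model_id(identifier: str) -> tuple[str, str]:
    """Split '<provider_id>/<provider model name>' at the first slash.

    Assumption: the stack-level identifier is the provider id followed by
    the provider's own model name, as in the curl examples.
    """
    provider_id, sep, provider_model = identifier.partition("/")
    if not sep or not provider_id or not provider_model:
        raise ValueError(f"expected '<provider_id>/<model>', got {identifier!r}")
    return provider_id, provider_model


# The provider-side name keeps its "google/" prefix.
provider_id, provider_model = split_model_id("vertexai/google/gemini-2.5-flash")
```

Under that assumption, `provider_model` is the string Google needs to recognize, which is why the hard-coded names must carry the "google" prefix.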

```bash
uv venv --python python312
. .venv/bin/activate
llama stack list-deps starter | xargs -L1 uv pip install
llama stack run starter
```

Check that this lists the Gemini models with their correct names:
```bash
curl http://127.0.0.1:8321/v1/models | jq '.data | map(select(.custom_metadata.provider_id == "vertexai"))'
```
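The same check can be scripted without jq. This is a minimal sketch using only the standard library; it assumes the response has the shape the jq filter implies (a top-level `data` list whose entries carry `custom_metadata.provider_id`). The function names are illustrative, not llama-stack APIs.

```python
import json
import urllib.request


def models_for_provider(payload: dict, provider_id: str) -> list:
    """Keep only the models whose custom_metadata.provider_id matches."""
    return [
        model
        for model in payload.get("data", [])
        # custom_metadata may be missing or null; treat both as "no match".
        if (model.get("custom_metadata") or {}).get("provider_id") == provider_id
    ]


def fetch_models(base_url: str = "http://127.0.0.1:8321") -> dict:
    """Fetch /v1/models from a running stack (requires the server to be up)."""
    with urllib.request.urlopen(f"{base_url}/v1/models") as resp:
        return json.load(resp)
```

With the server running, `models_for_provider(fetch_models(), "vertexai")` should return the same entries as the jq filter.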

Test that this chat completion works:
```bash
curl -X POST   -H "Content-Type: application/json"   "http://127.0.0.1:8321/v1/chat/completions"   -d '{
        "model": "vertexai/google/gemini-2.5-flash",
        "messages": [
          {
            "role": "system",
            "content": "You are a helpful assistant."
          },
          {
            "role": "user",
            "content": "Hello! Can you tell me a joke?"
          }
        ],
        "temperature": 1.0,
        "max_tokens": 256
      }'
```
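For scripted testing, the same request can be built in Python. The sketch below mirrors the curl payload using only the standard library; `build_chat_request` and `send` are hypothetical helper names, and the response is returned as parsed JSON without assuming its fields.

```python
import json
import urllib.request


def build_chat_request(
    model: str,
    user_text: str,
    base_url: str = "http://127.0.0.1:8321",
) -> urllib.request.Request:
    """Build the POST request for /v1/chat/completions without sending it."""
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
        ],
        "temperature": 1.0,
        "max_tokens": 256,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request to a running stack and return the parsed JSON reply."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

With the server running, `send(build_chat_request("vertexai/google/gemini-2.5-flash", "Hello! Can you tell me a joke?"))` issues the same call as the curl command.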

(cherry picked from commit dabebdd230)
Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-11-24 14:17:12 -05:00

| Name | Last commit | Date |
| --- | --- | --- |
| anthropic | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| azure | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| bedrock | feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) | 2025-10-12 19:01:52 -07:00 |
| cerebras | feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) | 2025-10-12 19:01:52 -07:00 |
| databricks | feat(api)!: BREAKING CHANGE: support passing extra_body through to providers (#3777) | 2025-10-10 16:21:44 -07:00 |
| fireworks | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| gemini | feat(gemini): Support gemini-embedding-001 and fix models/ prefix in metadata keys (#3813) | 2025-10-15 12:22:10 -04:00 |
| groq | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| llama_openai_compat | feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) | 2025-10-12 19:01:52 -07:00 |
| nvidia | chore: remove build.py (#3869) | 2025-10-20 16:28:15 -07:00 |
| ollama | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| openai | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| passthrough | feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) | 2025-10-12 19:01:52 -07:00 |
| runpod | feat(api)!: BREAKING CHANGE: support passing extra_body through to providers (#3777) | 2025-10-10 16:21:44 -07:00 |
| sambanova | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| tgi | feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) | 2025-10-12 19:01:52 -07:00 |
| together | feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) | 2025-10-12 19:01:52 -07:00 |
| vertexai | fix: update hard-coded google model names (#4212) | 2025-11-24 14:17:12 -05:00 |
| vllm | feat(api)!: BREAKING CHANGE: support passing extra_body through to providers (#3777) | 2025-10-10 16:21:44 -07:00 |
| watsonx | fix: Fixed WatsonX remote inference provider (#3801) | 2025-10-14 14:52:32 +02:00 |
| \_\_init\_\_.py | impls -> inline, adapters -> remote (#381) | 2024-11-06 14:54:05 -08:00 |