llama-stack-mirror/llama_stack
Sumanth Kamenani 2838d5a20f
fix: AWS Bedrock inference profile ID conversion for region-specific endpoints (#3386)
Fixes #3370

AWS now requires region-prefixed inference profile IDs, rather than
foundation model IDs, for on-demand throughput. Passing a foundation
model ID was causing ValidationException errors.

Added auto-detection based on boto3 client region to convert model IDs
like meta.llama3-1-70b-instruct-v1:0 to
us.meta.llama3-1-70b-instruct-v1:0 depending on the detected region.

Also handles edge cases such as ARNs, case-insensitive region names,
and None regions.
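
The conversion described above could be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `to_inference_profile_id`, the `REGION_PREFIXES` table, and the exact region-to-prefix mapping are assumptions made for the example.

```python
from typing import Optional

# Hypothetical mapping from the first segment of an AWS region name to an
# inference profile prefix. This table is an assumption for illustration.
REGION_PREFIXES = {"us": "us", "eu": "eu", "ap": "apac"}


def to_inference_profile_id(model_id: str, region: Optional[str]) -> str:
    """Return a region-prefixed inference profile ID for a foundation model ID.

    The input is returned unchanged when no conversion applies.
    """
    # ARNs already name a concrete resource, so pass them through.
    if model_id.startswith("arn:"):
        return model_id
    # A missing or empty region gives nothing to derive a prefix from.
    if not region:
        return model_id
    # Compare regions case-insensitively: "US-EAST-1" behaves like "us-east-1".
    geo = region.lower().split("-", 1)[0]
    prefix = REGION_PREFIXES.get(geo)
    if prefix is None:
        return model_id
    # Do not prefix an ID that already starts with a known profile prefix.
    if model_id.split(".", 1)[0] in REGION_PREFIXES.values():
        return model_id
    return f"{prefix}.{model_id}"
```

With a boto3 client, the region passed to such a helper could be read from `client.meta.region_name`, which may be `None` when no region is configured.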

Tested with this request:
```json
{
  "model_id": "meta.llama3-1-8b-instruct-v1:0",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "tell me a riddle"
    }
  ],
  "sampling_params": {
    "strategy": {
      "type": "top_p",
      "temperature": 0.7,
      "top_p": 0.9
    },
    "max_tokens": 512
  }
}
```
<img width="1488" height="878" alt="image"
src="https://github.com/user-attachments/assets/0d61beec-3869-4a31-8f37-9f554c280b88"
/>
2025-09-11 11:41:53 +02:00
apis feat: Adding OpenAI Prompts API (#3319) 2025-09-08 11:05:13 -04:00
cli feat: include a default inference store during llama stack build (#3373) 2025-09-09 15:54:58 -07:00
core chore: introduce write queue for inference_store (#3383) 2025-09-10 11:57:42 -07:00
distributions fix: Fix locations of distribution runtime directories (#3336) 2025-09-05 14:09:36 +02:00
models refactor(logging): rename llama_stack logger categories (#3065) 2025-08-21 17:31:04 -07:00
providers fix: AWS Bedrock inference profile ID conversion for region-specific endpoints (#3386) 2025-09-11 11:41:53 +02:00
strong_typing chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00
testing fix: environment variable typo in inference recorder error message (#3374) 2025-09-08 17:51:38 +02:00
ui chore(ui-deps): bump tailwindcss from 4.1.6 to 4.1.13 in /llama_stack/ui (#3362) 2025-09-10 13:18:14 -07:00
__init__.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
env.py refactor(test): move tools, evals, datasetio, scoring and post training tests (#1401) 2025-03-04 14:53:47 -08:00
log.py chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage (#3061) 2025-08-20 07:15:35 -04:00
schema_utils.py feat(auth): API access control (#2822) 2025-07-24 15:30:48 -07:00