Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-17 15:22:36 +00:00)
AWS Bedrock now requires region-prefixed inference profile IDs instead of foundation model IDs for on-demand throughput. This change adds auto-detection based on the boto3 client's region, converting model IDs such as meta.llama3-1-70b-instruct-v1:0 to us.meta.llama3-1-70b-instruct-v1:0 depending on the region. It handles edge cases such as ARNs, case sensitivity, and None regions, and fixes the ValidationException raised when using on-demand throughput.
Directory contents:

- agent
- agents
- batches
- files
- inference
- nvidia
- utils
- vector_io
- test_bedrock.py
- test_configs.py