llama-stack-mirror/llama_stack/core/server
Ashwin Bharambe 359d3eeff2 fix(inference): propagate 401 errors
Remote provider authentication errors (401/403) were being converted to 500 Internal Server Error, hiding the real cause from users.

The server now checks whether an exception carries a status_code attribute and preserves it when building the HTTP response. This fixes authentication error handling for all remote inference providers that use the OpenAI SDK (groq, openai, together, fireworks, etc.) and similar provider SDKs.
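A minimal sketch of the idea, not the actual server.py code: the helper name and mapping below are illustrative, and it assumes the provider SDK error exposes an integer status_code (as OpenAI SDK errors do).

    from fastapi import HTTPException

    def translate_exception(exc: Exception) -> HTTPException:
        # If a provider SDK error carries an HTTP status code, preserve it
        # instead of collapsing everything into a 500.
        status_code = getattr(exc, "status_code", None)
        if isinstance(status_code, int) and 400 <= status_code < 600:
            return HTTPException(status_code=status_code, detail=str(exc))

        # Fallback: anything without a usable status code is still a 500.
        return HTTPException(
            status_code=500,
            detail="Internal server error: An unexpected error occurred.",
        )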

Before:
- HTTP 500: "Internal server error: An unexpected error occurred."

After:
- HTTP 401: "Error code: 401 - Invalid API Key"

Fixes #2990

Test Plan:
1. Build stack: llama stack build --image-type venv --providers inference=remote::groq
2. Start stack: llama stack run
3. Send a request with an invalid API key via the x-llamastack-provider-data header (see the example after this list)
4. Verify response is 401 with provider error message (not 500)
5. Repeat for openai, together, fireworks providers
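A quick way to exercise steps 3-4, sketched under assumptions: the port, endpoint path, model id, and the groq_api_key field in the provider-data header are illustrative and may differ in your build.

    import json
    import requests

    # Hypothetical smoke test for the 401 propagation fix.
    resp = requests.post(
        "http://localhost:8321/v1/openai/v1/chat/completions",
        headers={
            # Per-request provider credentials; an invalid key should now
            # surface the provider's 401 instead of a generic 500.
            "x-llamastack-provider-data": json.dumps({"groq_api_key": "invalid-key"}),
        },
        json={
            "model": "groq/llama-3.3-70b-versatile",
            "messages": [{"role": "user", "content": "hello"}],
        },
    )
    print(resp.status_code)  # expect 401, not 500
    print(resp.json())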
2025-10-09 17:47:41 -07:00
__init__.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
auth.py refactor(logging): rename llama_stack logger categories (#3065) 2025-08-21 17:31:04 -07:00
auth_providers.py feat: Add Kubernetes auth provider to use SelfSubjectReview and kubernetes api server (#2559) 2025-09-08 11:25:10 +02:00
quota.py refactor(logging): rename llama_stack logger categories (#3065) 2025-08-21 17:31:04 -07:00
routes.py feat: introduce API leveling, post_training, eval to v1alpha (#3449) 2025-09-26 16:18:07 +02:00
server.py fix(inference): propagate 401 errors 2025-10-09 17:47:41 -07:00
tracing.py feat: introduce API leveling, post_training, eval to v1alpha (#3449) 2025-09-26 16:18:07 +02:00