fix: Propagate the runtime error message to user (#4150)

# What does this PR do?
When a runtime exception occurs, the underlying error message is not propagated to the user, so the response is opaque.
Before fix:
```
ERROR - Error processing message: Error code: 500 - {'detail': 'Internal server error: An unexpected error occurred.'}
```
After fix:
```
[ERROR] Error code: 404 - {'detail': "Model 'claude-sonnet-4-5-20250929' not found. Use 'client.models.list()' to list available Models."}
```

(Ran into this a few times while working with OCI + LLAMAStack and Sabre: agentic framework integrations with LLAMAStack.)

## Test Plan
CI
slekkala1 2025-11-14 13:14:49 -08:00 committed by GitHub
parent eb545034ab
commit f596f850bf


```diff
@@ -16,6 +16,7 @@ from llama_stack_api import (
     ApprovalFilter,
     Inference,
     MCPListToolsTool,
+    ModelNotFoundError,
     OpenAIAssistantMessageParam,
     OpenAIChatCompletion,
     OpenAIChatCompletionChunk,
@@ -323,6 +324,8 @@ class StreamingResponseOrchestrator:
             if last_completion_result and last_completion_result.finish_reason == "length":
                 final_status = "incomplete"
+        except ModelNotFoundError:
+            raise
         except Exception as exc:  # noqa: BLE001
             self.final_messages = messages.copy()
             self.sequence_number += 1
```
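The pattern in the diff can be sketched in isolation: a narrow `except` clause re-raises the specific, user-actionable error before the broad catch-all swallows it. This is a minimal sketch, not the actual orchestrator; `ModelNotFoundError`, `run_turn`, and `orchestrate` here are stand-ins defined for illustration.

```python
class ModelNotFoundError(Exception):
    """Stand-in for llama_stack_api.ModelNotFoundError."""


def run_turn(model: str) -> str:
    # Hypothetical inner call that may raise a descriptive error.
    if model != "known-model":
        raise ModelNotFoundError(
            f"Model '{model}' not found. "
            "Use 'client.models.list()' to list available Models."
        )
    return "ok"


def orchestrate(model: str) -> str:
    try:
        return run_turn(model)
    except ModelNotFoundError:
        # Re-raise so the caller sees the descriptive 404-style message
        # instead of a generic failure.
        raise
    except Exception:  # noqa: BLE001
        # Only truly unexpected errors reach the opaque fallback.
        return "Internal server error: An unexpected error occurred."
```

Ordering matters: because `ModelNotFoundError` is a subclass of `Exception`, the narrow clause must appear before the broad one or it is never reached.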