Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-10-04 12:07:34 +00:00)

Merge 49b729b30a into 8422bd102a
Commit 6d68ece4ef — 4 changed files with 433 additions and 0 deletions
@@ -37,6 +37,9 @@ The following metrics are automatically generated for each inference request:

| Metric | Type | Unit | Description | Labels |
|--------|------|------|-------------|--------|
| `llama_stack_prompt_tokens_total` | Counter | `tokens` | Number of tokens in the input prompt | `model_id`, `provider_id` |
| `llama_stack_completion_tokens_total` | Counter | `tokens` | Number of tokens in the generated response | `model_id`, `provider_id` |
| `llama_stack_tokens_total` | Counter | `tokens` | Total tokens used (prompt + completion) | `model_id`, `provider_id` |
| `llama_stack_requests_total` | Counter | `requests` | Total number of requests | `api`, `status` |
| `llama_stack_request_duration_seconds` | Gauge | `seconds` | Request duration | `api`, `status` |
| `llama_stack_concurrent_requests` | Gauge | `requests` | Number of concurrent requests | `api` |
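The relationship between the three token counters (total = prompt + completion, all sharing the `model_id`/`provider_id` labels) can be sketched with a minimal, self-contained stand-in. This is not the actual Llama Stack telemetry code; the `Counter` class and `record_inference` helper here are hypothetical, kept dependency-free for illustration only.

```python
from collections import defaultdict

class Counter:
    """Minimal stand-in for a monotonically increasing, labeled metric."""
    def __init__(self, name: str, unit: str):
        self.name, self.unit = name, unit
        self._values: defaultdict = defaultdict(float)  # label tuple -> value
    def inc(self, labels: tuple, amount: float = 1.0) -> None:
        assert amount >= 0, "counters only go up"
        self._values[labels] += amount
    def get(self, labels: tuple) -> float:
        return self._values[labels]

# Counters named after the table above; units are `tokens` in each case.
prompt_tokens = Counter("llama_stack_prompt_tokens_total", "tokens")
completion_tokens = Counter("llama_stack_completion_tokens_total", "tokens")
total_tokens = Counter("llama_stack_tokens_total", "tokens")

def record_inference(model_id: str, provider_id: str,
                     n_prompt: int, n_completion: int) -> None:
    """Record token metrics for one inference request (hypothetical helper)."""
    labels = (model_id, provider_id)
    prompt_tokens.inc(labels, n_prompt)
    completion_tokens.inc(labels, n_completion)
    # The total counter is incremented by prompt + completion per request.
    total_tokens.inc(labels, n_prompt + n_completion)

record_inference("llama-3-8b", "ollama", 128, 256)
print(total_tokens.get(("llama-3-8b", "ollama")))  # 384.0
```

In a real deployment these would be emitted through the stack's telemetry provider rather than held in process memory; the sketch only shows how the per-request values roll up into the labeled series.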
#### Metric Generation Flow