Charlie Doern 2025-09-18 17:04:04 +02:00 committed by GitHub
commit 6d68ece4ef
4 changed files with 433 additions and 0 deletions

@@ -37,6 +37,9 @@ The following metrics are automatically generated for each inference request:
| `llama_stack_prompt_tokens_total` | Counter | `tokens` | Number of tokens in the input prompt | `model_id`, `provider_id` |
| `llama_stack_completion_tokens_total` | Counter | `tokens` | Number of tokens in the generated response | `model_id`, `provider_id` |
| `llama_stack_tokens_total` | Counter | `tokens` | Total tokens used (prompt + completion) | `model_id`, `provider_id` |
| `llama_stack_requests_total` | Counter | `requests` | Total number of requests | `api`, `status` |
| `llama_stack_request_duration_seconds` | Gauge | `seconds` | Request duration | `api`, `status` |
| `llama_stack_concurrent_requests` | Gauge | `requests` | Number of concurrent requests | `api` |
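
As an illustrative, non-normative sketch (not the actual llama-stack implementation), the snippet below shows how metrics with these names and label sets could be recorded through the OpenTelemetry Python API. The meter name and the `record_inference` helper are hypothetical and exist only for the example.

```python
# Hypothetical sketch: recording metrics shaped like the table above
# via the OpenTelemetry Python API (opentelemetry-api).
from opentelemetry import metrics

# Meter name is an assumption for illustration only.
meter = metrics.get_meter("llama_stack.telemetry")

tokens_total = meter.create_counter(
    "llama_stack_tokens_total",
    unit="tokens",
    description="Total tokens used (prompt + completion)",
)
requests_total = meter.create_counter(
    "llama_stack_requests_total",
    unit="requests",
    description="Total number of requests",
)
concurrent_requests = meter.create_up_down_counter(
    "llama_stack_concurrent_requests",
    unit="requests",
    description="Number of concurrent requests",
)


def record_inference(model_id: str, provider_id: str,
                     prompt_tokens: int, completion_tokens: int) -> None:
    """Hypothetical helper: attaches the label sets listed in the table."""
    tokens_total.add(
        prompt_tokens + completion_tokens,
        {"model_id": model_id, "provider_id": provider_id},
    )
    requests_total.add(1, {"api": "inference", "status": "success"})
```

With a configured `MeterProvider` and Prometheus exporter, these instruments would surface under the metric names shown in the table; without one, the calls are no-ops, so the sketch runs standalone.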
#### Metric Generation Flow