llama-stack-mirror

2403 commits 161 branches 116 tags 102 MiB

Author

SHA1

Message

Date

Charlie Doern

49b729b30a

feat: api level request metrics via middleware

add RequestMetricsMiddleware which tracks key metrics related to each request the LLS server will receive:

1. llama_stack_requests_total: tracks the total number of requests the server has processed
2. llama_stack_request_duration_seconds: tracks the duration of each request
3. llama_stack_concurrent_requests: tracks concurrently processed requests by the server

Using a middleware lets this be done at the server level, without adding custom handling to each router as the inference router does today for its API-specific metrics.
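
To illustrate the idea, here is a minimal sketch of a pure-ASGI middleware that records the three quantities named above in plain Python containers. It is not the code from this commit: the class name `RequestMetricsSketch`, the label keys, the in-memory storage, and the helpers `demo_app` and `run_once` are all assumptions made for the example. The real RequestMetricsMiddleware presumably reports through the server's telemetry layer instead.

```python
import asyncio
import time
from collections import defaultdict


class RequestMetricsSketch:
    """Hypothetical pure-ASGI middleware; not the implementation from the commit."""

    def __init__(self, app):
        self.app = app
        # Stand-ins for the three metrics (storage and label keys are assumptions).
        self.requests_total = defaultdict(int)     # keyed by (method, path, status)
        self.duration_seconds = defaultdict(list)  # keyed by (method, path)
        self.concurrent_requests = 0

    async def __call__(self, scope, receive, send):
        # Only HTTP requests are measured; other scope types pass straight through.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        method, path = scope["method"], scope["path"]
        status = 500  # assumed status if the app fails before responding

        async def send_wrapper(message):
            # Capture the status code as the response starts.
            nonlocal status
            if message["type"] == "http.response.start":
                status = message["status"]
            await send(message)

        self.concurrent_requests += 1
        start = time.perf_counter()
        try:
            await self.app(scope, receive, send_wrapper)
        finally:
            # Runs on success and on error, so the gauge cannot drift upward.
            self.concurrent_requests -= 1
            self.duration_seconds[(method, path)].append(time.perf_counter() - start)
            self.requests_total[(method, path, status)] += 1


async def demo_app(scope, receive, send):
    """Tiny ASGI app used only to exercise the middleware."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


async def run_once(middleware, path="/v1/health"):
    """Drive one fake GET request through the middleware and return sent messages."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await middleware({"type": "http", "method": "GET", "path": path}, receive, send)
    return sent
```

Because the wrapper sits in front of every route, a single `RequestMetricsSketch(app)` wrapping covers all routers at once, which is the design point the commit message makes about avoiding per-router handling.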

Also, add some unit tests for this functionality

resolves #2597

Signed-off-by: Charlie Doern <cdoern@redhat.com>

2025-08-03 13:14:25 -04:00

Ashwin Bharambe

2665f00102

chore(rename): move llama_stack.distribution to llama_stack.core (#2975)

We would like to rename the term `template` to `distribution`. To
prepare for that, this is a precursor.

cc @leseb

2025-07-30 23:30:53 -07:00

Renamed from llama_stack/distribution/server/server.py

2 commits