llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-09 05:08:37 +00:00

Author	SHA1	Message	Date
Charlie Doern	49b729b30a	feat: api level request metrics via middleware add RequestMetricsMiddleware which tracks key metrics related to each request the LLS server will recieve: 1. llama_stack_requests_total: tracks the total amount of requests the server has processed 2. llama_stack_request_duration_seconds: tracks the duration of each request 3. llama_stack_concurrent_requests: tracks concurrently processed requests by the server The usage of a middleware allows this to be done on the server level without having to add custom handling to each router like the inference router has today for its API specific metrics. Also, add some unit tests for this functionality resolves #2597 Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-08-03 13:14:25 -04:00
IAN MILLER	a749d5f4a4	refactor: remove Conda support from Llama Stack (#2969 ) # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> This PR is responsible for removal of Conda support in Llama Stack <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> Closes #2539 ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. -->	2025-08-02 15:52:59 -07:00
Nathan Weinberg	ffb6306fbd	fix: remove redundant code from unregister_vector_db (#2983 ) get_vector_db() will raise an exception if a vector store won't be returned client handling is redundant Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-07-31 09:22:04 -07:00
Ashwin Bharambe	2665f00102	chore(rename): move llama_stack.distribution to llama_stack.core (#2975 ) We would like to rename the term `template` to `distribution`. To prepare for that, this is a precursor. cc @leseb	2025-07-30 23:30:53 -07:00

4 commits