`llama-stack/docs`

**feat: add MetricResponseMixin to chat completion response types (#1050)** · Dinesh Yeduguru · `ab7f802698`
# What does this PR do?
Defines a `MetricResponseMixin` that any response class can inherit, and adds it to the chat completion response types.
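As a rough sketch of the idea (hypothetical field and class shapes — the real `MetricEvent` lives in the telemetry API, and the real response types carry more fields), the mixin simply contributes an optional `metrics` field to any pydantic response model that inherits it:

```python
from typing import List, Optional

from pydantic import BaseModel


class MetricEvent(BaseModel):
    # Hypothetical shape; the actual MetricEvent is defined by the telemetry API.
    metric: str
    value: float
    unit: Optional[str] = None


class MetricResponseMixin(BaseModel):
    # Any response class that inherits this gains an optional metrics field.
    metrics: Optional[List[MetricEvent]] = None


class ChatCompletionResponse(MetricResponseMixin):
    # Illustrative response type; the real one has more fields.
    completion_message: str


resp = ChatCompletionResponse(
    completion_message="hello",
    metrics=[MetricEvent(metric="prompt_tokens", value=12, unit="tokens")],
)
```

Because pydantic models compose through inheritance, existing response types pick up the field without changing their own definitions, and `metrics` defaults to `None` for servers that do not emit it.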


This is a short-term solution that lets the inference API return metrics. The ideal approach would be for all response types to include metrics, and for all metric events logged to the telemetry API to be attached to the response. Doing that requires augmenting every response type with a `metrics` field, but we have hit a blocker in the Stainless SDK: if we were to augment a response type that contains a `data` field, like so
```python
class ListModelsResponse(BaseModel):
    metrics: Optional[List[MetricEvent]] = None
    data: List[Models]
    ...
```
then the client SDK would have to access the data through a `.data` field, which is not ergonomic. The Stainless SDK does support unwrapping a response type, but it requires that the response type have only a single field.

We will need a way for the client SDK to signal that metrics are needed and, when they are, to return the full response type without unwrapping it.
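To make the unwrapping blocker concrete, here is a hypothetical plain-Python illustration (not the actual Stainless-generated client) of the two shapes a caller sees:

```python
from dataclasses import dataclass, field
from typing import List, Optional


class UnwrappedClient:
    """With a single-field response, Stainless can unwrap it:
    callers get the list directly."""

    def list_models(self) -> List[str]:
        return ["llama-3-8b", "llama-3-70b"]


@dataclass
class ListModelsResponse:
    data: List[str]
    metrics: Optional[List[dict]] = field(default=None)


class WrappedClient:
    """With a second (metrics) field, unwrapping is no longer possible;
    callers must reach through .data explicitly."""

    def list_models(self) -> ListModelsResponse:
        return ListModelsResponse(data=["llama-3-8b", "llama-3-70b"])


models = UnwrappedClient().list_models()       # a list, directly
wrapped = WrappedClient().list_models().data   # an extra .data hop
```

The proposed signal would let clients opt in to the wrapped form only when they actually want the metrics.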

## Test Plan
```sh
sh run_openapi_generator.sh ./
sh stainless_sync.sh dineshyv/dev add-metrics-to-resp-v4

LLAMA_STACK_CONFIG="/Users/dineshyv/.llama/distributions/fireworks/fireworks-run.yaml" \
  pytest -v tests/client-sdk/agents/test_agents.py
```
Committed: 2025-02-11 14:58:12 -08:00
| Path | Last commit | Date |
| --- | --- | --- |
| `_static` | feat: add MetricResponseMixin to chat completion response types (#1050) | 2025-02-11 14:58:12 -08:00 |
| `notebooks` | notebook point to github as source of truth | 2025-02-03 15:08:25 -08:00 |
| `openapi_generator` | Several documentation fixes and fix link to API reference | 2025-02-04 14:00:43 -08:00 |
| `resources` | Several documentation fixes and fix link to API reference | 2025-02-04 14:00:43 -08:00 |
| `source` | fix: a bad newline in ollama docs (#1036) | 2025-02-10 14:27:17 -08:00 |
| `zero_to_hero_guide` | docs: Correct typos in Zero to Hero guide (#997) | 2025-02-06 17:29:52 -05:00 |
| `conftest.py` | No spaces in ipynb tests | 2025-02-07 11:56:22 -08:00 |
| `contbuild.sh` | Fix broken links with docs | 2024-11-22 20:42:17 -08:00 |
| `dog.jpg` | Support for Llama3.2 models and Swift SDK (#98) | 2024-09-25 10:29:58 -07:00 |
| `getting_started.ipynb` | Getting started notebook update (#936) | 2025-02-07 15:36:15 -08:00 |
| `license_header.txt` | Initial commit | 2024-07-23 08:32:33 -07:00 |
| `make.bat` | first version of readthedocs (#278) | 2024-10-22 10:15:58 +05:30 |
| `Makefile` | first version of readthedocs (#278) | 2024-10-22 10:15:58 +05:30 |
| `readme.md` | Fix README.md notebook links (#976) | 2025-02-05 14:33:46 -08:00 |
| `requirements.txt` | [docs] add playground ui docs (#592) | 2024-12-12 10:40:38 -08:00 |

# Llama Stack Documentation

Here's a collection of comprehensive guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our ReadTheDocs page.

## Content

Try out Llama Stack's capabilities through our detailed Jupyter notebooks: