llama-stack-mirror/llama_stack/providers/inline
Dmitry Rogozhkin 7ea14ae62e
feat: enable xpu support for meta-reference stack (#558)
This commit adds support for XPU and CPU devices to the meta-reference
stack for text models. On creation, the stack automatically identifies
which device to use by checking available accelerator capabilities in
the following order: CUDA, then XPU, finally CPU. This behaviour can be
overridden with the `DEVICE` environment variable, in which case the
explicitly specified device is used.
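
The selection order above can be sketched as follows. This is a minimal illustration, not the actual patch: `resolve_device` is a hypothetical helper name, and the availability flags stand in for the real `torch.cuda.is_available()` / `torch.xpu.is_available()` probes so the logic is visible on its own.

```python
import os

def resolve_device(cuda_available: bool, xpu_available: bool,
                   env: dict = os.environ) -> str:
    """Pick a device per the commit's policy (hypothetical helper).

    The DEVICE environment variable, when set, takes precedence;
    otherwise accelerators are probed in order: CUDA, XPU, CPU.
    """
    override = env.get("DEVICE")
    if override:
        return override
    if cuda_available:
        return "cuda"
    if xpu_available:
        return "xpu"
    return "cpu"
```

For example, `resolve_device(False, True, {})` returns `"xpu"`, while setting `DEVICE=cpu` forces the CPU path even when an accelerator is present.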

Tested with:
```
torchrun pytest llama_stack/providers/tests/inference/test_text_inference.py -k meta_reference
```

Results:
* Tested on: a system with a single CUDA device, a system with a single
XPU device, and a pure CPU system
* Results: all tests pass except `test_completion_logprobs`
* `test_completion_logprobs` fails in the same way as on the baseline,
i.e. unrelated to this change: `AssertionError: Unexpected top_k=3`

Requires: https://github.com/meta-llama/llama-models/pull/233

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-01-31 12:11:49 -08:00
agents Fix Agents to support code and rag simultaneously (#908) 2025-01-30 17:09:34 -08:00
datasetio Add persistence for localfs datasets (#557) 2025-01-09 17:34:18 -08:00
eval rebase eval test w/ tool_runtime fixtures (#773) 2025-01-15 12:55:19 -08:00
inference feat: enable xpu support for meta-reference stack (#558) 2025-01-31 12:11:49 -08:00
ios/inference impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00
post_training More idiomatic REST API (#765) 2025-01-15 13:20:09 -08:00
safety [bugfix] fix llama guard parsing ContentDelta (#772) 2025-01-15 11:20:23 -08:00
scoring Add X-LlamaStack-Client-Version, rename ProviderData -> Provider-Data (#735) 2025-01-09 11:51:36 -08:00
telemetry Fix telemetry init (#885) 2025-01-27 11:20:28 -08:00
tool_runtime Move tool_runtime.memory -> tool_runtime.rag 2025-01-22 20:25:02 -08:00
vector_io Bump key for faiss 2025-01-24 12:08:36 -08:00
__init__.py impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00