llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 09:53:45 +00:00

Dmitry Rogozhkin 7ea14ae62e feat: enable xpu support for meta-reference stack (#558)	2025-01-31 12:11:49 -08:00

This commit adds support for XPU and CPU devices to the meta-reference stack for text models. On creation, the stack automatically identifies which device to use by checking the available accelerator capabilities in the following order: CUDA, then XPU, and finally CPU. This behaviour can be overridden with the `DEVICE` environment variable, in which case the explicitly specified device is used.

Tested with:
```
torchrun pytest llama_stack/providers/tests/inference/test_text_inference.py -k meta_reference
```

Results:
* Tested on: a system with a single CUDA device, a system with a single XPU device, and a pure CPU system
* Results: all tests pass except `test_completion_logprobs`
* `test_completion_logprobs` fails in the same way as on the baseline, i.e. unrelated to this change: `AssertionError: Unexpected top_k=3`

Requires: https://github.com/meta-llama/llama-models/pull/233

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
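
The device selection order described in the commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the commit: the names `pick_device` and `_torch_has` are hypothetical, and the availability probes are passed in as parameters so the logic can be exercised without PyTorch installed.

```python
import os
from typing import Callable, Mapping, Optional


def _torch_has(backend: str) -> bool:
    """Hypothetical helper: True if PyTorch is importable and reports
    the given backend ("cuda" or "xpu") as available."""
    try:
        import torch
    except ImportError:
        return False
    module = getattr(torch, backend, None)  # e.g. torch.cuda, torch.xpu
    is_available = getattr(module, "is_available", None)
    return bool(is_available()) if callable(is_available) else False


def pick_device(
    env: Optional[Mapping[str, str]] = None,
    has_cuda: Callable[[], bool] = lambda: _torch_has("cuda"),
    has_xpu: Callable[[], bool] = lambda: _torch_has("xpu"),
) -> str:
    """Return the device name to use (hypothetical function name).

    An explicit DEVICE environment variable takes precedence; otherwise
    the order is CUDA, then XPU, then CPU, as the commit message states.
    """
    env = os.environ if env is None else env
    override = env.get("DEVICE")
    if override:
        return override  # explicitly specified device is used as-is
    if has_cuda():
        return "cuda"
    if has_xpu():
        return "xpu"
    return "cpu"


if __name__ == "__main__":
    print(pick_device())
```

Running with `DEVICE=cpu` set in the environment would force the CPU path in this sketch even on a machine with an accelerator.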
inline	feat: enable xpu support for meta-reference stack (#558)	2025-01-31 12:11:49 -08:00
registry	Move runpod provider to the correct directory	2025-01-23 12:25:12 -08:00
remote	SambaNova supports Llama 3.3 (#905)	2025-01-30 09:24:46 -08:00
tests	[#432] Groq Provider tool call tweaks (#811)	2025-01-29 12:02:12 -08:00
utils	fix ImageContentItem to take base64 string as image.data (#909)	2025-01-30 15:58:23 -08:00
__init__.py	API Updates (#73)	2024-09-17 19:51:35 -07:00
datatypes.py	[memory refactor][1/n] Rename Memory -> VectorIO, MemoryBanks -> VectorDBs (#828)	2025-01-22 09:59:30 -08:00