llama-stack-mirror/llama_stack
Dmitry Rogozhkin 241a42bb26 docs: add example for intel gpu in vllm remote
This PR adds instructions for setting up a vLLM remote endpoint for the vllm-remote
Llama Stack distribution.
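
The added docs walk through standing up such an endpoint. A rough sketch of the
setup, assuming an image built from vLLM's Dockerfile.xpu as described in vLLM's
XPU install docs (image name, mounts, and model here are illustrative and may
need adjusting for your system):

```
# Build a vLLM image with XPU (Intel GPU) support from the vLLM source tree,
# then serve a model over the OpenAI-compatible API on port 8000.
# Intel GPUs are exposed to the container via /dev/dri (no --gpus flag).
docker build -f Dockerfile.xpu -t vllm-xpu-env --shm-size=4g .
docker run -it --rm \
    --network=host \
    --device /dev/dri \
    -v /dev/dri/by-path:/dev/dri/by-path \
    vllm-xpu-env \
    python -m vllm.entrypoints.openai.api_server \
        --model meta-llama/Llama-3.2-3B-Instruct \
        --port 8000
```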

* Verified with manual tests of the configured vllm-remote distribution against
  a vLLM endpoint (as sketched above) running on a system with an Intel GPU
* Also verified with CI pytests (see the command line below). Tests pass to the
  same extent as they do on the NVIDIA A10 setup (some tests do fail, which
  appears to be a known issue with the vllm-remote Llama Stack distribution)

```
pytest -s -v tests/integration/inference/test_text_inference.py \
   --stack-config=http://localhost:5001 \
   --text-model=meta-llama/Llama-3.2-3B-Instruct
```
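
For reference, pointing a stack at that endpoint can look like the following
sketch, assuming the remote-vllm template; the URL, port, and model are
assumptions chosen to match the test command above:

```
# Build and run the remote-vllm distribution, pointing it at the local
# vLLM endpoint; port 5001 matches the --stack-config used in the tests.
llama stack build --template remote-vllm --image-type venv
llama stack run remote-vllm \
    --port 5001 \
    --env VLLM_URL=http://localhost:8000/v1 \
    --env INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct
```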

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-04-15 07:15:37 -07:00
| Path | Latest commit | Date |
| --- | --- | --- |
| apis | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| cli | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| distribution | fix: Playground RAG page errors (#1928) | 2025-04-10 13:38:31 -07:00 |
| models | fix: on-the-fly int4 quantize parameter (#1920) | 2025-04-09 15:00:12 -07:00 |
| providers | fix: use torchao 0.8.0 for inference (#1925) | 2025-04-10 13:39:20 -07:00 |
| strong_typing | chore: more mypy checks (ollama, vllm, ...) (#1777) | 2025-04-01 17:12:39 +02:00 |
| templates | docs: add example for intel gpu in vllm remote | 2025-04-15 07:15:37 -07:00 |
| __init__.py | export LibraryClient | 2024-12-13 12:08:00 -08:00 |
| env.py | refactor(test): move tools, evals, datasetio, scoring and post training tests (#1401) | 2025-03-04 14:53:47 -08:00 |
| log.py | chore: Remove style tags from log formatter (#1808) | 2025-03-27 10:18:21 -04:00 |
| schema_utils.py | chore: make mypy happy with webmethod (#1758) | 2025-03-22 08:17:23 -07:00 |