# What does this PR do?

This PR adds instructions for setting up a remote vLLM endpoint for the vllm-remote Llama Stack distribution.

## Test Plan

* Verified with manual tests of the configured vllm-remote distribution against a vLLM endpoint running on a system with an Intel GPU.
* Also verified with the CI pytests (see the command line below). The tests pass to the same extent as they do on the A10 Nvidia setup; some tests fail, which appears to be a known issue with the vllm-remote Llama Stack distribution.

```
pytest -s -v tests/integration/inference/test_text_inference.py \
  --stack-config=http://localhost:5001 \
  --text-model=meta-llama/Llama-3.2-3B-Instruct
```

CC: @ashwinb

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
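For context, here is a minimal sketch of the kind of setup the PR documents: starting a vLLM OpenAI-compatible endpoint and pointing a remote vLLM Llama Stack distribution at it. The template name `remote-vllm`, the ports, and the `VLLM_URL`/`INFERENCE_MODEL` environment variables are assumptions based on common llama-stack usage, not details taken from this PR.

```
# Start a vLLM OpenAI-compatible server for the model under test
# (port 8000 is an assumed default; adjust for your environment).
vllm serve meta-llama/Llama-3.2-3B-Instruct --port 8000

# Build and run the remote vLLM Llama Stack distribution, pointing it at
# the vLLM endpoint above. Port 5001 matches the --stack-config used in
# the Test Plan; the template and env variable names are assumptions.
llama stack build --template remote-vllm --image-type venv
llama stack run remote-vllm \
  --port 5001 \
  --env INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct \
  --env VLLM_URL=http://localhost:8000/v1
```

With the stack listening on port 5001, the pytest command from the Test Plan can then be run against `http://localhost:5001`.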