llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-04 20:14:13 +00:00

Ilya Kolchinsky deee355952 fix: Added lazy initialization of the remote vLLM client to avoid issues with expired asyncio event loop (#1969)	2025-04-23 15:33:19 +02:00

# What does this PR do?

Closes #1968.

The asynchronous client in `VLLMInferenceAdapter` is now initialized directly before first use and not in `VLLMInferenceAdapter.initialize`. This prevents issues arising due to accessing an expired event loop from a completed `asyncio.run`.

## Test Plan

Ran unit tests, including `test_remote_vllm.py`. Ran the code snippet mentioned in #1968.

---------

Co-authored-by: Sébastien Han <seb@redhat.com>

agents	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
datasetio	refactor: extract pagination logic into shared helper function (#1770)	2025-03-31 13:08:29 -07:00
inference	fix: Added lazy initialization of the remote vLLM client to avoid issues with expired asyncio event loop (#1969)	2025-04-23 15:33:19 +02:00
post_training	fix: Handle case when Customizer Job status is unknown (#1965)	2025-04-17 10:27:07 +02:00
safety	docs: Add NVIDIA platform distro docs (#1971)	2025-04-17 05:54:30 -07:00
tool_runtime	fix(api): don't return list for runtime tools (#1686)	2025-04-01 09:53:11 +02:00
vector_io	chore: Updating Milvus Client calls to be non-blocking (#1830)	2025-03-28 22:14:07 -04:00
__init__.py	`impls` -> `inline`, `adapters` -> `remote` (#381)	2024-11-06 14:54:05 -08:00
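
The lazy-initialization fix described in commit deee355952 can be illustrated with a minimal sketch. This is not the code from `VLLMInferenceAdapter`: the classes `LoopBoundClient`, `EagerAdapter`, and `LazyAdapter` are hypothetical stand-ins, and the loop check in `ping` only imitates how a real async client might fail once the event loop it was created on has been closed by a completed `asyncio.run`.

```python
import asyncio


class LoopBoundClient:
    """Hypothetical async client that remembers the event loop it was created on."""

    def __init__(self) -> None:
        # Must be constructed inside a running loop, like many async HTTP clients.
        self._loop = asyncio.get_running_loop()

    async def ping(self) -> str:
        # Imitate a failure when the creating loop is closed or is not the current one.
        if self._loop.is_closed() or self._loop is not asyncio.get_running_loop():
            raise RuntimeError("client is bound to an expired event loop")
        return "ok"


class EagerAdapter:
    """Creates the client in initialize(), so it is tied to that call's loop."""

    def __init__(self) -> None:
        self.client = None

    async def initialize(self) -> None:
        self.client = LoopBoundClient()

    async def ping(self) -> str:
        return await self.client.ping()


class LazyAdapter:
    """Defers client creation until the first call that actually needs it."""

    def __init__(self) -> None:
        self.client = None

    async def initialize(self) -> None:
        # Intentionally does not create the client here.
        pass

    def _get_client(self) -> LoopBoundClient:
        if self.client is None:
            self.client = LoopBoundClient()
        return self.client

    async def ping(self) -> str:
        return await self._get_client().ping()


if __name__ == "__main__":
    eager = EagerAdapter()
    asyncio.run(eager.initialize())  # this loop is closed when asyncio.run returns
    try:
        asyncio.run(eager.ping())
    except RuntimeError as exc:
        print("eager adapter failed:", exc)

    lazy = LazyAdapter()
    asyncio.run(lazy.initialize())
    print("lazy adapter:", asyncio.run(lazy.ping()))
```

In this sketch the lazy adapter works when `initialize` and the first request run in separate `asyncio.run` calls, because the client is created on the loop that first uses it. The client is still cached after that, so the sketch does not cover the case where later requests arrive on yet another loop.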