forked from phoenix-oss/llama-stack-mirror
		
| What does this PR do? Closes #1968. The asynchronous client in `VLLMInferenceAdapter` is now initialized directly before first use rather than in `VLLMInferenceAdapter.initialize`. This prevents errors caused by accessing the closed event loop of a completed `asyncio.run` call. Test plan: ran the unit tests, including `test_remote_vllm.py`, and ran the code snippet from #1968. Co-authored-by: Sébastien Han <seb@redhat.com> | ||
|---|---|---|
| agents | ||
| datasetio | ||
| inference | ||
| post_training | ||
| safety | ||
| tool_runtime | ||
| vector_io | ||
| __init__.py | ||
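
The commit message above describes deferring creation of the async client until first use, so that it is not tied to the event loop of an earlier, already finished `asyncio.run`. The following is a minimal sketch of that pattern, not the actual `VLLMInferenceAdapter` code: `FakeAsyncClient`, `EagerAdapter`, `LazyAdapter`, and `demo` are hypothetical names, and the fake client only records the loop it was created on instead of opening real connections.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical stand-in for an async client bound to the loop it was created on."""

    def __init__(self) -> None:
        # Must be constructed while an event loop is running.
        self.loop = asyncio.get_running_loop()

    async def ping(self) -> bool:
        # A real client would typically raise if its loop were closed;
        # this sketch just reports whether the loop is still usable.
        return (not self.loop.is_closed()) and self.loop is asyncio.get_running_loop()


class EagerAdapter:
    """Creates the client in initialize(), binding it to that call's event loop."""

    async def initialize(self) -> None:
        self._client = FakeAsyncClient()

    async def ping(self) -> bool:
        return await self._client.ping()


class LazyAdapter:
    """Creates the client directly before first use."""

    def __init__(self) -> None:
        self._client = None

    async def initialize(self) -> None:
        # Intentionally does not create the client.
        pass

    def _get_client(self) -> FakeAsyncClient:
        if self._client is None:
            self._client = FakeAsyncClient()
        return self._client

    async def ping(self) -> bool:
        return await self._get_client().ping()


def demo() -> tuple:
    """Run initialize() and ping() in separate asyncio.run calls for both adapters."""
    lazy, eager = LazyAdapter(), EagerAdapter()
    asyncio.run(lazy.initialize())
    asyncio.run(eager.initialize())
    # Each asyncio.run call uses a fresh loop and closes it on exit.
    return asyncio.run(lazy.ping()), asyncio.run(eager.ping())
```

In this sketch, `demo()` reports a usable client for the lazy adapter and an unusable one for the eager adapter, because the eager client was created on a loop that `asyncio.run` has since closed. Note that the lazy client is still cached after first use, so this simplified version only helps when all calls after `initialize` share one event loop.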