forked from phoenix-oss/llama-stack-mirror
# What does this PR do? Users prefer to rely on the main CLI rather than invoking the server through a Python module. Users interact with a high-level CLI rather than needing to know internal module structures. Now, when running llama stack run <path-to-config>, the server will attempt to use the system package or a virtual environment if one is active. This also eliminates the current process dependency chain when running from a virtual environment: -> llama stack run -> start_env.sh -> python -m server... Signed-off-by: Sébastien Han <seb@redhat.com> [//]: # (If resolving an issue, uncomment and update the line below) [//]: # (Closes #[issue-number]) ## Test Plan Run: ``` ollama run llama3.2:3b-instruct-fp16 --keepalive=2m & llama stack run ./llama_stack/templates/ollama/run.yaml --disable-ipv6 ``` Notice that the server starts and shutdowns normally. [//]: # (## Documentation) --------- Signed-off-by: Sébastien Han <seb@redhat.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com> |
||
|---|---|---|
| .. | ||
| routers | ||
| server | ||
| store | ||
| ui | ||
| utils | ||
| __init__.py | ||
| build.py | ||
| build_conda_env.sh | ||
| build_container.sh | ||
| build_venv.sh | ||
| client.py | ||
| common.sh | ||
| configure.py | ||
| datatypes.py | ||
| distribution.py | ||
| inspect.py | ||
| library_client.py | ||
| request_headers.py | ||
| resolver.py | ||
| stack.py | ||
| start_stack.sh | ||