mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-07 02:47:21 +00:00
# What does this PR do?

Refactor main to split out the app construction so that we can use `uvicorn --workers` to enable a multi-process stack.

## Test Plan

CI

> uv run --with llama-stack python -m llama_stack.core.server.server benchmarking/k8s-benchmark/stack_run_config.yaml

works.

> LLAMA_STACK_CONFIG=benchmarking/k8s-benchmark/stack_run_config.yaml uv run uvicorn llama_stack.core.server.server:create_app --port 8321 --workers 4

works.
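
The split between "build the app" and "run the server" can be pictured as an app factory. The following is only a minimal sketch under stated assumptions, not the actual `llama_stack` code: it uses the standard library and a bare ASGI callable instead of the real server. The `create_app` name and the `LLAMA_STACK_CONFIG` variable come from the test plan; the response payload and the `call_once` helper are hypothetical and exist only for illustration.

```python
import asyncio
import json
import os


def create_app():
    """Build an ASGI app from the config path in LLAMA_STACK_CONFIG.

    Hypothetical stand-in for the real factory: it only records the
    config path instead of loading a full stack configuration.
    """
    config_path = os.environ.get("LLAMA_STACK_CONFIG")
    if not config_path:
        raise ValueError("LLAMA_STACK_CONFIG must be set")

    async def app(scope, receive, send):
        # This sketch answers plain HTTP requests and ignores other scopes.
        if scope["type"] != "http":
            return
        body = json.dumps({"config": config_path, "pid": os.getpid()}).encode()
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"application/json")],
        })
        await send({"type": "http.response.body", "body": body})

    return app


def call_once(app):
    """Drive a single GET request through the app without any server."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/", "headers": []}
    asyncio.run(app(scope, receive, send))
    return sent


if __name__ == "__main__":
    os.environ.setdefault("LLAMA_STACK_CONFIG", "stack_run_config.yaml")
    for message in call_once(create_app()):
        print(message)
```

Because the factory takes no arguments and reads its configuration from the environment, each worker process can import and call it on its own, which is presumably why the second test-plan command passes the config through `LLAMA_STACK_CONFIG` rather than as a positional argument.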

| File |
|---|
| `__init__.py` |
| `auth.py` |
| `auth_providers.py` |
| `quota.py` |
| `routes.py` |
| `server.py` |