refactor: address PR feedback - improve naming, error handling, and documentation

Address all feedback from PR #3962:

**Code Quality Improvements:**
- Rename `_uvicorn_run` → `_run_server` for accurate method naming
- Refactor error handling: move Gunicorn fallback logic from `_run_with_gunicorn` to caller
- Update comments to reflect both Uvicorn and Gunicorn behavior
- Update test mock from `_uvicorn_run` to `_run_server`

**Environment Variable:**
- Change `LLAMA_STACK_DISABLE_GUNICORN` → `LLAMA_STACK_ENABLE_GUNICORN`
- More intuitive positive logic (no double negatives)
- Defaults to `true` on Unix systems
- Clearer log messages distinguishing platform limitations vs explicit disable

**Documentation:**
- Remove unnecessary `uv sync --group unit --group test` from user docs
- Clarify SQLite limitations: "SQLite only allows one writer at a time"
- Accurate explanation: WAL mode enables concurrent reads but writes are serialized
- Strong recommendation for PostgreSQL in production with high traffic

**Architecture:**
- Better separation of concerns: `_run_with_gunicorn` just executes, caller handles fallback
- Exceptions propagate to caller for centralized decision making

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
Roy Belio 2025-11-04 16:22:12 +02:00
parent 9ff881a28a
commit 241e189fee
10 changed files with 75 additions and 63 deletions

View file

@ -295,8 +295,8 @@ def test_providers_flag_generates_config_with_api_keys():
enable_ui=False,
)
# Mock _uvicorn_run to prevent starting a server
with patch.object(stack_run, "_uvicorn_run"):
# Mock _run_server to prevent starting a server
with patch.object(stack_run, "_run_server"):
stack_run._run_stack_run_cmd(args)
# Read the generated config file