feat: add a configurable category-based logger (#1352)

A self-respecting server needs good observability, which starts with
configurable logging. Llama Stack had little of it until now. This PR adds a
`logcat` facility toward that end. Callsites look like:

```python
logcat.debug("inference", f"params to ollama: {params}")
```

- the first parameter is a category; there is a static list of
categories in `llama_stack/logcat.py`
- each category can be associated with a log level, which can be
configured via the `LLAMA_STACK_LOGGING` env var
- a value like `LLAMA_STACK_LOGGING="inference=debug;server=info"` sets the
`inference` category to `debug` and `server` to `info`; there is a special
key called `all` which is an alias for all categories (see the sketch after
this list)
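
For illustration, here is a minimal sketch of how such a facility could be wired up on top of stdlib `logging`. The category list, `parse_logging_spec` helper, and `debug` wrapper below are hypothetical, not the actual contents of `llama_stack/logcat.py`:

```python
import logging
import os

# Illustrative category names; the real static list lives in llama_stack/logcat.py.
CATEGORIES = ["core", "server", "inference", "agents"]


def parse_logging_spec(spec: str) -> dict[str, int]:
    """Parse e.g. 'inference=debug;server=info' into {category: numeric level}."""
    levels: dict[str, int] = {}
    for entry in filter(None, (e.strip() for e in spec.split(";"))):
        category, _, level = entry.partition("=")
        numeric = logging.getLevelName(level.strip().upper())  # "debug" -> 10
        if category.strip().lower() == "all":
            # "all" is an alias that applies the level to every known category
            for cat in CATEGORIES:
                levels[cat] = numeric
        else:
            levels[category.strip().lower()] = numeric
    return levels


# One stdlib logger per category, so each category's level is tuned independently.
for cat, lvl in parse_logging_spec(os.environ.get("LLAMA_STACK_LOGGING", "")).items():
    logging.getLogger(f"llama_stack.{cat}").setLevel(lvl)


def debug(category: str, msg: str) -> None:
    # The first argument selects the category logger, matching the callsite shape above.
    logging.getLogger(f"llama_stack.{category}").debug(msg)
```

Since later entries override earlier ones, a spec like `all=info;inference=debug` sets every category to info and then turns only `inference` up to debug.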

## Test Plan

Ran with `LLAMA_STACK_LOGGING="all=debug" llama stack run fireworks` and
saw the following:


![image](https://github.com/user-attachments/assets/d24b95ab-3941-426c-9ea0-a4c62542e6f0)

Hit it with a client-sdk test case and saw this:


![image](https://github.com/user-attachments/assets/3fee8c6c-986e-4125-a09c-f5dc019682e2)

Representative hunks from the diff:

```diff
@@ -8,6 +8,7 @@ from typing import AsyncGenerator, AsyncIterator, List, Optional, Union
 import litellm
+from llama_stack import logcat
 from llama_stack.apis.common.content_types import (
     InterleavedContent,
     InterleavedContentItem,
@@ -106,6 +107,8 @@ class LiteLLMOpenAIMixin(
         )
         params = await self._get_params(request)
+        logcat.debug("inference", f"params to litellm (openai compat): {params}")
+
         # unfortunately, we need to use synchronous litellm.completion here because litellm
         # caches various httpx.client objects in a non-eventloop aware manner
         response = litellm.completion(**params)
```
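
As an aside on the comment in that hunk: calling a blocking SDK function directly inside an async method stalls the event loop. One common alternative, which is *not* what this diff does, would be to push the blocking call onto a worker thread. A hedged sketch, with `_completion_offloaded` being a hypothetical helper:

```python
import asyncio

import litellm


async def _completion_offloaded(params: dict):
    # Hypothetical alternative: run the blocking litellm.completion on a worker
    # thread so the event loop stays responsive. The diff above calls it
    # synchronously instead, since litellm caches httpx.client objects in a
    # non-eventloop-aware manner and may misbehave when used across threads.
    return await asyncio.to_thread(litellm.completion, **params)
```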