chore(misc): make tests and starter faster (#3042)

A bunch of miscellaneous cleanup focusing on tests, but ended up speeding up starter distro substantially. - Pulled llama stack client init for tests into `pytest_sessionstart` so it does not clobber output - Profiling of that told me where we were doing lots of heavy imports for starter, so lazied them - starter now starts 20seconds+ faster on my Mac - A few other smallish refactors for `compat_client`
2025-12-03 09:53:45 +00:00 · 2025-08-05 14:55:05 -07:00 · 2025-08-05 14:55:05 -07:00 · 7f834339ba
commit 7f834339ba
parent e12524af85
45 changed files with 2897 additions and 1688 deletions
--- a/docs/source/providers/inference/index.md
+++ b/docs/source/providers/inference/index.md
@ -1,4 +1,4 @@
-# Inference 
+# Inference

 ## Overview

--- a/docs/source/providers/inference/remote_hf_endpoint.md
+++ b/docs/source/providers/inference/remote_hf_endpoint.md
@ -8,7 +8,7 @@ HuggingFace Inference Endpoints provider for dedicated model serving.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `endpoint_name` | `<class 'str'>` | No | PydanticUndefined | The name of the Hugging Face Inference Endpoint in the format of '{namespace}/{endpoint_name}' (e.g. 'my-cool-org/meta-llama-3-1-8b-instruct-rce'). Namespace is optional and will default to the user account if not provided. |
+| `endpoint_name` | `<class 'str'>` | No |  | The name of the Hugging Face Inference Endpoint in the format of '{namespace}/{endpoint_name}' (e.g. 'my-cool-org/meta-llama-3-1-8b-instruct-rce'). Namespace is optional and will default to the user account if not provided. |
 | `api_token` | `pydantic.types.SecretStr \| None` | No |  | Your Hugging Face user access token (will default to locally saved token if not provided) |

 ## Sample Configuration
--- a/docs/source/providers/inference/remote_hf_serverless.md
+++ b/docs/source/providers/inference/remote_hf_serverless.md
@ -8,7 +8,7 @@ HuggingFace Inference API serverless provider for on-demand model inference.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `huggingface_repo` | `<class 'str'>` | No | PydanticUndefined | The model ID of the model on the Hugging Face Hub (e.g. 'meta-llama/Meta-Llama-3.1-70B-Instruct') |
+| `huggingface_repo` | `<class 'str'>` | No |  | The model ID of the model on the Hugging Face Hub (e.g. 'meta-llama/Meta-Llama-3.1-70B-Instruct') |
 | `api_token` | `pydantic.types.SecretStr \| None` | No |  | Your Hugging Face user access token (will default to locally saved token if not provided) |

 ## Sample Configuration
--- a/docs/source/providers/inference/remote_tgi.md
+++ b/docs/source/providers/inference/remote_tgi.md
@ -8,7 +8,7 @@ Text Generation Inference (TGI) provider for HuggingFace model serving.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `url` | `<class 'str'>` | No | PydanticUndefined | The URL for the TGI serving endpoint |
+| `url` | `<class 'str'>` | No |  | The URL for the TGI serving endpoint |

 ## Sample Configuration