llama-stack-mirror/docs/source
ehhuang 2c06b24c77
test: benchmark scripts (#3160)
# What does this PR do?
1. Add our own benchmark script to replace locust, which doesn't
measure streaming latency well
2. Simplify k8s deployment
3. Add a simple profile script for a locally running server
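
Item 1 above is about measuring streaming latency. As a rough illustration of what that involves per request, here is a minimal sketch that times an iterable of chunks and records the time to first token (TTFT), total time, and chunk count. The names `StreamTiming`, `measure_stream`, and `fake_stream` are hypothetical and are not taken from the PR's script; the simulated stream stands in for a real streaming HTTP response.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class StreamTiming:
    ttft: Optional[float]  # seconds until the first chunk; None if the stream was empty
    total: float           # seconds until the stream was exhausted
    chunks: int            # number of chunks received


def measure_stream(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> StreamTiming:
    """Consume a chunk stream and record first-chunk and total latency."""
    start = clock()
    ttft: Optional[float] = None
    count = 0
    for _ in chunks:
        if ttft is None:
            # The first chunk marks the time to first token.
            ttft = clock() - start
        count += 1
    return StreamTiming(ttft=ttft, total=clock() - start, chunks=count)


def fake_stream(n: int, first_delay: float, gap: float) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming response: n chunks with delays."""
    time.sleep(first_delay)
    yield b"chunk-0"
    for i in range(1, n):
        time.sleep(gap)
        yield f"chunk-{i}".encode()


if __name__ == "__main__":
    timing = measure_stream(fake_stream(n=5, first_delay=0.02, gap=0.001))
    print(f"TTFT: {timing.ttft:.3f}s, total: {timing.total:.3f}s, chunks: {timing.chunks}")
```

Passing the clock as a parameter keeps the function easy to test with a fake clock; a real benchmark would presumably wrap an HTTP client's streaming iterator instead of `fake_stream`.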

## Test Plan
❮ ./run-benchmark.sh --target stack --duration 180 --concurrent 10

============================================================
BENCHMARK RESULTS
============================================================
Total time: 180.00s
Concurrent users: 10
Total requests: 1636
Successful requests: 1636
Failed requests: 0
Success rate: 100.0%
Requests per second: 9.09

Response Time Statistics:
  Mean: 1.095s
  Median: 1.721s
  Min: 0.136s
  Max: 3.218s
  Std Dev: 0.762s

Percentiles:
  P50: 1.721s
  P90: 1.751s
  P95: 1.756s
  P99: 1.796s

Time to First Token (TTFT) Statistics:
  Mean: 0.037s
  Median: 0.037s
  Min: 0.023s
  Max: 0.211s
  Std Dev: 0.011s

TTFT Percentiles:
  P50: 0.037s
  P90: 0.040s
  P95: 0.044s
  P99: 0.055s

Streaming Statistics:
  Mean chunks per response: 64.0
  Total chunks received: 104775
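
The derived figures in the report can be checked against the raw counts: 1636 requests over 180 s gives about 9.09 requests per second, and 104775 chunks over 1636 responses gives about 64.0 chunks per response. The sketch below shows one way such a summary could be computed from raw latency samples. It is an assumption, not the script's actual code: `percentile` and `summarize` are hypothetical names, and the nearest-rank percentile method used here may differ from whatever the benchmark script uses.

```python
import math
import statistics
from typing import Dict, Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile (no interpolation) for 0 < p <= 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Rank is 1-based: the smallest value with at least p% of samples at or below it.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies: Sequence[float], duration_s: float) -> Dict[str, float]:
    """Summarize per-request latencies (seconds) collected over duration_s seconds."""
    return {
        "requests_per_second": len(latencies) / duration_s,
        "mean": statistics.mean(latencies),
        "median": statistics.median(latencies),
        "min": min(latencies),
        "max": max(latencies),
        # Sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(latencies) if len(latencies) > 1 else 0.0,
        "p50": percentile(latencies, 50),
        "p90": percentile(latencies, 90),
        "p95": percentile(latencies, 95),
        "p99": percentile(latencies, 99),
    }


if __name__ == "__main__":
    # Cross-check of the derived figures, using the counts from the report.
    print(round(1636 / 180, 2))      # requests per second
    print(round(104775 / 1636, 1))   # mean chunks per response
```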
2025-08-15 11:24:29 -07:00
| Name | Last commit | Date |
| --- | --- | --- |
| advanced_apis | chore: rename templates to distributions (#3035) | 2025-08-04 11:34:17 -07:00 |
| apis | chore: bump min python version in docs and tests (#3103) | 2025-08-12 08:52:57 -07:00 |
| building_applications | docs: Update blocks formatting in docs/source files (#3120) | 2025-08-13 08:06:31 -07:00 |
| concepts | Revert "feat: add batches API with OpenAI compatibility" (#3149) | 2025-08-14 10:08:54 -07:00 |
| contributing | test: benchmark scripts (#3160) | 2025-08-15 11:24:29 -07:00 |
| deploying | chore(rename): move llama_stack.distribution to llama_stack.core (#2975) | 2025-07-30 23:30:53 -07:00 |
| distributions | test: benchmark scripts (#3160) | 2025-08-15 11:24:29 -07:00 |
| getting_started | docs: Added comment about a known limitation of AgentEventLogger (#2930) | 2025-08-07 10:09:57 -07:00 |
| providers | Revert "feat: add batches API with OpenAI compatibility" (#3149) | 2025-08-14 10:08:54 -07:00 |
| references | docs: Update blocks formatting in docs/source files (#3120) | 2025-08-13 08:06:31 -07:00 |
| conf.py | docs: Add documentation on how to contribute a Vector DB provider and update testing documentation (#3093) | 2025-08-11 11:11:09 -07:00 |
| index.md | docs: Reorganize documentation on the webpage (#2651) | 2025-07-15 14:19:35 -07:00 |