llama-stack-mirror/docs/source/distributions
ehhuang bcc7f2c7d0
chore: async inference store write (#3318)
# What does this PR do?

Makes writes to the inference store asynchronous.

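For readers unfamiliar with the pattern, here is a minimal sketch of deferring a store write to a background task with `asyncio`. It is a generic illustration only, not the code in this PR: `InMemoryStore`, `AsyncStoreWriter`, and `demo` are hypothetical names, and the real inference store may be structured quite differently.

```python
import asyncio


class InMemoryStore:
    """Hypothetical stand-in for a persistent inference store."""

    def __init__(self):
        self.rows = []

    async def write(self, record):
        await asyncio.sleep(0.01)  # simulate slow storage I/O
        self.rows.append(record)


class AsyncStoreWriter:
    """Hypothetical helper: queues records and writes them in the background."""

    def __init__(self, store):
        self.store = store
        self.queue = asyncio.Queue()
        self.worker = None

    def start(self):
        # Must be called from inside a running event loop.
        self.worker = asyncio.create_task(self._drain())

    async def _drain(self):
        while True:
            record = await self.queue.get()
            try:
                await self.store.write(record)
            finally:
                self.queue.task_done()

    def submit(self, record):
        # Returns immediately; the caller does not wait for the write.
        self.queue.put_nowait(record)

    async def close(self):
        await self.queue.join()  # wait until every queued record is written
        if self.worker is not None:
            self.worker.cancel()


async def demo():
    store = InMemoryStore()
    writer = AsyncStoreWriter(store)
    writer.start()
    for i in range(5):
        writer.submit({"id": i})
    await writer.close()
    return len(store.rows)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The trade-off in a design like this is that a request can complete before its record is persisted, so shutdown needs an explicit flush step such as `close()` above.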
## Test Plan
```
cd /docs/source/distributions/k8s-benchmark
# start mock server
python openai-mock-server.py --port 8000
# start stack server
uv run --with llama-stack python -m llama_stack.core.server.server docs/source/distributions/k8s-benchmark/stack_run_config.yaml
# run benchmark script
uv run python3 benchmark.py --duration 30 --concurrent 50 --base-url=http://localhost:8321/v1/openai/v1 --model=vllm-inference/meta-llama/Llama-3.2-3B-Instruct
```
Before:

```
============================================================
BENCHMARK RESULTS
============================================================
Total time: 30.00s
Concurrent users: 50
Total requests: 1267
Successful requests: 1267
Failed requests: 0
Success rate: 100.0%
Requests per second: 42.23
```


After:

```
============================================================
BENCHMARK RESULTS
============================================================
Total time: 30.00s
Concurrent users: 50
Total requests: 1449
Successful requests: 1449
Failed requests: 0
Success rate: 100.0%
Requests per second: 48.30
```
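
The reported throughput figures can be checked from the request counts and the 30-second duration above. The short sketch below redoes that arithmetic and derives the relative improvement; the helper name `requests_per_second` is only illustrative and is not taken from `benchmark.py`.

```python
def requests_per_second(total_requests, duration_s):
    """Throughput as completed requests divided by wall-clock duration."""
    return total_requests / duration_s


# Figures copied from the Before/After benchmark results.
before = requests_per_second(1267, 30.0)
after = requests_per_second(1449, 30.0)
improvement_pct = (after - before) / before * 100

print(f"before: {before:.2f} req/s")
print(f"after: {after:.2f} req/s")
print(f"improvement: {improvement_pct:.1f}%")
```

On these numbers the change is roughly a 14% throughput gain at 50 concurrent users, from a single 30-second run of each configuration.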
2025-09-04 11:37:46 -07:00
| Name | Last commit | Date |
| --- | --- | --- |
| eks | fix: update k8s templates (#2645) | 2025-07-08 15:57:01 -07:00 |
| k8s | chore: remove absolute paths (#3263) | 2025-08-27 12:04:25 -07:00 |
| k8s-benchmark | chore: async inference store write (#3318) | 2025-09-04 11:37:46 -07:00 |
| ondevice_distro | chore: remove absolute paths (#3263) | 2025-08-27 12:04:25 -07:00 |
| remote_hosted_distro | refactor: remove Conda support from Llama Stack (#2969) | 2025-08-02 15:52:59 -07:00 |
| self_hosted_distro | docs: add VLM NIM example (#3277) | 2025-08-29 16:23:52 -07:00 |
| building_distro.md | fix(docs): update llama stack build CLI doc (#3050) | 2025-08-06 09:32:09 -07:00 |
| configuration.md | feat: Add CORS configuration support for server (#3201) | 2025-08-21 14:23:27 -07:00 |
| customizing_run_yaml.md | docs: clarify run.yaml files are starting points for customization (#2746) | 2025-07-14 09:53:13 -07:00 |
| importing_as_library.md | chore: remove absolute paths (#3263) | 2025-08-27 12:04:25 -07:00 |
| index.md | docs: part 1 - fix warnings in documentation generation (#2861) | 2025-07-30 10:50:10 -07:00 |
| list_of_distributions.md | fix: Restore the nvidia distro (#2639) | 2025-07-07 15:50:05 -07:00 |
| starting_llama_stack_server.md | refactor: remove Conda support from Llama Stack (#2969) | 2025-08-02 15:52:59 -07:00 |