Commit graph

9 commits

Author SHA1 Message Date
Kai Wu
b63982ef00 working now 2025-07-31 10:19:53 -07:00
Kai Wu
1cb9d3bca2 second try 2025-07-30 14:51:43 -07:00
Kai Wu
e614241876 first draft 2025-07-25 10:41:06 -07:00
ehhuang
51b179e1c5
chore: update k8s template (#2786)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Integration Tests / discover-tests (push) Successful in 3s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 4s
Python Package Build Test / build (3.12) (push) Failing after 3s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 8s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 14s
Unit Tests / unit-tests (3.12) (push) Failing after 5s
Update ReadTheDocs / update-readthedocs (push) Failing after 3s
Python Package Build Test / build (3.13) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 11s
Test External Providers / test-external-providers (venv) (push) Failing after 50s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 58s
Unit Tests / unit-tests (3.13) (push) Failing after 54s
Integration Tests / test-matrix (push) Failing after 53s
Pre-commit / pre-commit (push) Successful in 1m40s
# What does this PR do?
- enables auth
- updates to use distribution-starter docker

## Test Plan
bash apply.sh
2025-07-16 15:07:26 -07:00
ehhuang
4cf1952c32
chore: update vllm k8s command to support tool calling (#2717)
# What does this PR do?


## Test Plan
2025-07-10 14:40:17 -07:00
ehhuang
84fa83b788
fix: update k8s templates (#2645)
Some checks failed
Integration Tests / test-matrix (server, 3.12, datasets) (push) Failing after 9s
Integration Tests / test-matrix (server, 3.12, vector_io) (push) Failing after 12s
Integration Tests / test-matrix (server, 3.12, post_training) (push) Failing after 12s
Integration Tests / test-matrix (server, 3.13, inspect) (push) Failing after 15s
Integration Tests / test-matrix (server, 3.12, scoring) (push) Failing after 13s
Integration Tests / test-matrix (server, 3.13, datasets) (push) Failing after 17s
Integration Tests / test-matrix (server, 3.13, providers) (push) Failing after 11s
Integration Tests / test-matrix (server, 3.13, agents) (push) Failing after 12s
Integration Tests / test-matrix (server, 3.13, inference) (push) Failing after 14s
Integration Tests / test-matrix (server, 3.13, post_training) (push) Failing after 10s
Integration Tests / test-matrix (server, 3.13, tool_runtime) (push) Failing after 13s
Integration Tests / test-matrix (server, 3.13, scoring) (push) Failing after 15s
Integration Tests / test-matrix (server, 3.13, vector_io) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 15s
Python Package Build Test / build (3.12) (push) Failing after 33s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 41s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 40s
Python Package Build Test / build (3.13) (push) Failing after 33s
Test External Providers / test-external-providers (venv) (push) Failing after 8s
Update ReadTheDocs / update-readthedocs (push) Failing after 10s
Unit Tests / unit-tests (3.12) (push) Failing after 14s
Unit Tests / unit-tests (3.13) (push) Failing after 12s
Pre-commit / pre-commit (push) Successful in 1m23s
# What does this PR do?
- fix env variables
- use gpu for vllm
- add eks/apply.py for aws
- add template to set hf secret

## Test Plan
bash apply.sh

Co-authored-by: Eric Huang <erichuang@fb.com>
2025-07-08 15:57:01 -07:00
ehhuang
d96f6ec763
chore(ui): use proxy server for backend API calls; simplified k8s deployment (#2350)
# What does this PR do?
- no more CORS middleware needed


## Test Plan
### Local test
llama stack run starter --image-type conda
npm run dev
verify UI works in browser

### Deploy to k8s
temporarily change ui-k8s.yaml.template to load from PR commit
<img width="604" alt="image"
src="https://github.com/user-attachments/assets/87fa2e52-1e93-4e32-9e0f-5b283b7a37b3"
/>

sh ./apply.sh
$ kubectl get services
go to external_ip:8322 and play around with UI
<img width="1690" alt="image"
src="https://github.com/user-attachments/assets/5b7ec827-4302-4435-a9eb-df423676d873"
/>
2025-06-03 14:57:10 -07:00
Ashwin Bharambe
ba25c5e7e1
docs(k8s): add UI template (#2343)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 5s
Integration Tests / test-matrix (http, inference) (push) Failing after 9s
Integration Tests / test-matrix (http, agents) (push) Failing after 11s
Integration Tests / test-matrix (http, post_training) (push) Failing after 9s
Integration Tests / test-matrix (http, inspect) (push) Failing after 10s
Integration Tests / test-matrix (http, tool_runtime) (push) Failing after 8s
Integration Tests / test-matrix (http, providers) (push) Failing after 10s
Integration Tests / test-matrix (library, agents) (push) Failing after 10s
Integration Tests / test-matrix (http, datasets) (push) Failing after 12s
Integration Tests / test-matrix (library, datasets) (push) Failing after 9s
Integration Tests / test-matrix (http, scoring) (push) Failing after 11s
Integration Tests / test-matrix (library, inference) (push) Failing after 8s
Integration Tests / test-matrix (library, inspect) (push) Failing after 9s
Integration Tests / test-matrix (library, scoring) (push) Failing after 8s
Test External Providers / test-external-providers (venv) (push) Failing after 7s
Integration Tests / test-matrix (library, providers) (push) Failing after 10s
Integration Tests / test-matrix (library, tool_runtime) (push) Failing after 9s
Integration Tests / test-matrix (library, post_training) (push) Failing after 10s
Unit Tests / unit-tests (3.11) (push) Failing after 7s
Unit Tests / unit-tests (3.10) (push) Failing after 9s
Unit Tests / unit-tests (3.12) (push) Failing after 8s
Unit Tests / unit-tests (3.13) (push) Failing after 9s
Update ReadTheDocs / update-readthedocs (push) Failing after 6s
Pre-commit / pre-commit (push) Successful in 55s
WIP: add a UI template
2025-06-02 17:55:18 -07:00
Ashwin Bharambe
7fb4bdabea
docs(kubernetes): add more fleshed-out example of a Demo Kubernetes cluster (#2329)
This Kubernetes cluster has:

- vLLM for serving an inference model
- vLLM for serving a safety model
- Postgres DB (for metadata and other state for the Llama Stack distro)
- Chroma DB for Vector IO (memory)

Perhaps most importantly, this was me trying to learn Kubernetes for the
first time.

## Test Plan

Run `sh apply.sh` against an EKS cluster, then after `kubectl
port-forward service/llama-stack-service 8321:8321` and after many
attempts, we have finally:

<img width="1589" alt="image"
src="https://github.com/user-attachments/assets/c69f242d-6aaa-4def-9f7c-172113b8bfc1"
/>

<img width="1978" alt="image"
src="https://github.com/user-attachments/assets/cf678404-f551-4fa5-9077-bebe3e8e8ae8"
/>
2025-06-02 13:07:08 -07:00