fix: docker run with --pull always to fetch the latest image (#1733)

As titled
This commit is contained in:
Hardik Shah 2025-03-20 15:35:48 -07:00 committed by GitHub
parent f95bc29ca9
commit 581e8ae562
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
31 changed files with 57 additions and 3 deletions

View file

@ -53,7 +53,7 @@ docker compose down
#### Start Dell-TGI server locally
```
docker run -it --shm-size 1g -p 80:80 --gpus 4 \
docker run -it --pull always --shm-size 1g -p 80:80 --gpus 4 \
-e NUM_SHARD=4
-e MAX_BATCH_PREFILL_TOKENS=32768 \
-e MAX_INPUT_TOKENS=8000 \
@ -65,7 +65,7 @@ registry.dell.huggingface.co/enterprise-dell-inference-meta-llama-meta-llama-3.1
#### Start Llama Stack server pointing to TGI server
```
docker run --network host -it -p 8321:8321 -v ./run.yaml:/root/my-run.yaml --gpus=all llamastack/distribution-tgi --yaml_config /root/my-run.yaml
docker run --pull always --network host -it -p 8321:8321 -v ./run.yaml:/root/my-run.yaml --gpus=all llamastack/distribution-tgi --yaml_config /root/my-run.yaml
```
Make sure in you `run.yaml` file, you inference provider is pointing to the correct TGI server endpoint. E.g.