Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-07-12 16:16:09 +00:00
Update default port from 5000 -> 8321
Parent f1faa9c924 · Commit 03ac84a829

18 changed files with 27 additions and 27 deletions
@@ -402,11 +402,11 @@ Serving API agents
 POST /agents/step/get
 POST /agents/turn/get

-Listening on ['::', '0.0.0.0']:5000
+Listening on ['::', '0.0.0.0']:8321
 INFO: Started server process [2935911]
 INFO: Waiting for application startup.
 INFO: Application startup complete.
-INFO: Uvicorn running on http://['::', '0.0.0.0']:5000 (Press CTRL+C to quit)
+INFO: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
 INFO: 2401:db00:35c:2d2b:face:0:c9:0:54678 - "GET /models/list HTTP/1.1" 200 OK
 ```

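The hunk above shows the server announcing the new default port and answering `GET /models/list`. As a quick way to check a local deployment after this change, here is a minimal sketch (not part of the commit): it assumes a Llama Stack server is already running on localhost:8321, uses only Foundation, and is meant to run as top-level Swift code (Swift 5.7+, Apple platforms).

```swift
// Smoke test against the endpoint shown in the log above.
// Assumption: a server from this distribution is listening on the new default port 8321.
import Foundation

let url = URL(string: "http://localhost:8321/models/list")!
let (data, response) = try await URLSession.shared.data(from: url)
if let http = response as? HTTPURLResponse {
    // The log line in the diff shows this request returning 200 OK.
    print("GET /models/list -> \(http.statusCode)")
}
print(String(decoding: data, as: UTF8.self))
```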
@@ -27,7 +27,7 @@ If you don't want to run inference on-device, then you can connect to any hosted
 ```swift
 import LlamaStackClient

-let agents = RemoteAgents(url: URL(string: "http://localhost:5000")!)
+let agents = RemoteAgents(url: URL(string: "http://localhost:8321")!)
 let request = Components.Schemas.CreateAgentTurnRequest(
   agent_id: agentId,
   messages: [

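The Swift snippet in the hunk above only swaps the hard-coded port. A small, hypothetical helper (not part of the commit or of `LlamaStackClient`) could make the port explicit while defaulting to the new value; the only API assumed to exist is the `RemoteAgents(url:)` initializer shown in the diff, and `makeRemoteAgents` is a name introduced here for illustration.

```swift
// Sketch only: centralize the base URL instead of hard-coding the port.
import Foundation
import LlamaStackClient

func makeRemoteAgents(host: String = "localhost", port: Int = 8321) -> RemoteAgents {
    // 8321 is the new default Llama Stack server port introduced by this commit.
    RemoteAgents(url: URL(string: "http://\(host):\(port)")!)
}

let agents = makeRemoteAgents()            // new default: http://localhost:8321
let legacy = makeRemoteAgents(port: 5000)  // older deployments still on 5000
```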
@@ -41,7 +41,7 @@ The script will first start up TGI server, then start up Llama Stack distribution
 INFO: Started server process [1]
 INFO: Waiting for application startup.
 INFO: Application startup complete.
-INFO: Uvicorn running on http://[::]:5000 (Press CTRL+C to quit)
+INFO: Uvicorn running on http://[::]:8321 (Press CTRL+C to quit)
 ```

 To kill the server

@@ -65,7 +65,7 @@ registry.dell.huggingface.co/enterprise-dell-inference-meta-llama-meta-llama-3.1
 #### Start Llama Stack server pointing to TGI server

 ```
-docker run --network host -it -p 5000:5000 -v ./run.yaml:/root/my-run.yaml --gpus=all llamastack/distribution-tgi --yaml_config /root/my-run.yaml
+docker run --network host -it -p 8321:8321 -v ./run.yaml:/root/my-run.yaml --gpus=all llamastack/distribution-tgi --yaml_config /root/my-run.yaml
 ```

 Make sure in your `run.yaml` file, your inference provider is pointing to the correct TGI server endpoint. E.g.