forked from phoenix-oss/llama-stack-mirror
docs: miscellaneous small fixes (#961)
- **[docs] Fix misc typos and formatting issues in intro docs** - **[docs]: Export variables (e.g. INFERENCE_MODEL) in getting_started** - **[docs] Show that `llama-stack-client configure` will ask for api key** # What does this PR do? Miscellaneous fixes in the documentation; not worth reporting an issue. ## Test Plan No code changes. Addressed issues spotted when walking through the guide. Confirmed locally. ## Sources Please link relevant resources if necessary. ## Before submitting - [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [x] Ran pre-commit to handle lint / formatting issues. - [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests. --------- Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
This commit is contained in:
parent
b84ab6c6b8
commit
0cbb3e401c
3 changed files with 11 additions and 7 deletions
|
@ -23,7 +23,7 @@ Which templates / distributions to choose depends on the hardware you have for r
|
|||
- {dockerhub}`distribution-together` ([Guide](self_hosted_distro/together))
|
||||
- {dockerhub}`distribution-fireworks` ([Guide](self_hosted_distro/fireworks))
|
||||
|
||||
- **Do you want to run Llama Stack inference on your iOS / Android device** Lastly, we also provide templates for running Llama Stack inference on your iOS / Android device:
|
||||
- **Do you want to run Llama Stack inference on your iOS / Android device?** Lastly, we also provide templates for running Llama Stack inference on your iOS / Android device:
|
||||
- [iOS SDK](ondevice_distro/ios_sdk)
|
||||
- [Android](ondevice_distro/android_sdk)
|
||||
|
||||
|
|
|
@ -25,7 +25,9 @@ The `llamastack/distribution-ollama` distribution consists of the following prov
|
|||
| vector_io | `inline::faiss`, `remote::chromadb`, `remote::pgvector` |
|
||||
|
||||
|
||||
You should use this distribution if you have a regular desktop machine without very powerful GPUs. Of course, if you have powerful GPUs, you can still continue using this distribution since Ollama supports GPU acceleration.### Environment Variables
|
||||
You should use this distribution if you have a regular desktop machine without very powerful GPUs. Of course, if you have powerful GPUs, you can still continue using this distribution since Ollama supports GPU acceleration.
|
||||
|
||||
### Environment Variables
|
||||
|
||||
The following environment variables can be configured:
|
||||
|
||||
|
|
|
@ -42,8 +42,8 @@ To get started quickly, we provide various container images for the server compo
|
|||
|
||||
Lets setup some environment variables that we will use in the rest of the guide.
|
||||
```bash
|
||||
INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct"
|
||||
LLAMA_STACK_PORT=8321
|
||||
export INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct"
|
||||
export LLAMA_STACK_PORT=8321
|
||||
```
|
||||
|
||||
Next you can create a local directory to mount into the container’s file system.
|
||||
|
@ -82,8 +82,10 @@ pip install llama-stack-client
|
|||
Let's use the `llama-stack-client` CLI to check the connectivity to the server.
|
||||
|
||||
```bash
|
||||
llama-stack-client configure --endpoint http://localhost:$LLAMA_STACK_PORT
|
||||
llama-stack-client models list
|
||||
$ llama-stack-client configure --endpoint http://localhost:$LLAMA_STACK_PORT
|
||||
> Enter the API key (leave empty if no key is needed):
|
||||
Done! You can now use the Llama Stack Client CLI with endpoint http://localhost:8321
|
||||
$ llama-stack-client models list
|
||||
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
|
||||
┃ identifier ┃ provider_id ┃ provider_resource_id ┃ metadata ┃
|
||||
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue