datasets.rst was removed from torchtune repo.
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
# What does this PR do?
Replace a missing 404 document with another one that exists. (Removed it
from
the list when memory_optimizations.rst was already pulled.)
## Test Plan
Please describe:
- tests you ran to verify your changes with result summaries.
- provide instructions so it can be reproduced.
## Sources
Please link relevant resources if necessary.
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
Before the patch, the example could not be executed verbatim without
copy-pasting client function from the inference example. I think it's
better to have examples self-contained, especially in a getting started
guide.
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
# What does this PR do?
See above.
## Test Plan
Confirmed example can now be executed verbatim.
## Sources
Please link relevant resources if necessary.
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
- **[docs] Fix misc typos and formatting issues in intro docs**
- **[docs]: Export variables (e.g. INFERENCE_MODEL) in getting_started**
- **[docs] Show that `llama-stack-client configure` will ask for api
key**
# What does this PR do?
Miscellaneous fixes in the documentation; not worth reporting an issue.
## Test Plan
No code changes. Addressed issues spotted when walking through the
guide.
Confirmed locally.
## Sources
Please link relevant resources if necessary.
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
---------
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
Podman is a popular alternative to Docker, so it would be nice to make
it clear that it can also be used to deploy the container for the
server. The instructions are a little different because you have to
create the directory (unlike with Docker which makes the directory for
you).
# What does this PR do?
- [ ] Add Podman instructions to Quick Start
## Test Plan
Documentation only.
## Sources
I tried it out and it worked.
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
the example script can gracefully exit if the boolean returned from
initialize is used properly
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
The current default system prompt for llama3.2 tends to overindex on
tool calling and doesn't work well when the prompt does not require tool
calling.
This PR adds an option to override the default system prompt, and
organizes tool-related configs into a new config object.
- [ ] Addresses issue (#issue)
## Test Plan
python -m unittest
llama_stack.providers.tests.inference.test_prompt_adapter
## Sources
Please link relevant resources if necessary.
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with
[ReviewStack](https://reviewstack.dev/meta-llama/llama-stack/pull/937).
* #938
* __->__ #937
# What does this PR do?
- Update readme for typescript SDK.
## Sources
Please link relevant resources if necessary.
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
the llama-stack repo is already ignoring hidden python `.venv/`
directories but not `venv/`
- [ ] Addresses issue (#issue)
## Test Plan
N/A
## Sources
N/A
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [x] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
This fixes the following timeout issue when installing PyTorch via uv.
Also see reference: https://github.com/astral-sh/uv/pull/1694,
https://github.com/astral-sh/uv/issues/1549
```
Installing pip dependencies
Using Python 3.10.16 environment at: /home/yutang/.conda/envs/distribution-myenv
× Failed to download and build `antlr4-python3-runtime==4.9.3`
├─▶ Failed to extract archive
├─▶ failed to unpack
│ `/home/yutang/.cache/uv/sdists-v7/.tmpDWX4iK/antlr4-python3-runtime-4.9.3/src/antlr4/ListTokenSource.py`
├─▶ failed to unpack
│ `antlr4-python3-runtime-4.9.3/src/antlr4/ListTokenSource.py` into
│ `/home/yutang/.cache/uv/sdists-v7/.tmpDWX4iK/antlr4-python3-runtime-4.9.3/src/antlr4/ListTokenSource.py`
├─▶ error decoding response body
├─▶ request or response body error
╰─▶ operation timed out
help: `antlr4-python3-runtime` (v4.9.3) was included because `torchtune`
(v0.5.0) depends on `omegaconf` (v2.3.0) which depends on
`antlr4-python3-runtime>=4.9.dev0, <4.10.dev0`
Failed to build target distribution-myenv with return code 1
```
---------
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
This is similar to what we are doing for other projects, e.g.
https://github.com/argoproj/argo-workflows/tree/main/.github/ISSUE_TEMPLATE
The benefits is to give people more options before submitting a bug
report or feature request on GitHub.
---------
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
# What does this PR do?
To work with the updated iOSCalendarAssistantWithLocalInf
[here](https://github.com/meta-llama/llama-stack-apps/compare/ios_local).
In short, provide a summary of what this PR does and why. Usually, the
relevant context should be present in a linked issue.
- [ ] Addresses issue (#issue)
## Test Plan
Please describe:
- tests you ran to verify your changes with result summaries.
- provide instructions so it can be reproduced.
## Sources
Please link relevant resources if necessary.
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
Lint check in main branch is failing. This fixes the lint check after we
moved to ruff in https://github.com/meta-llama/llama-stack/pull/921. We
need to move to a `ruff.toml` file as well as fixing and ignoring some
additional checks.
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
# What does this PR do?
add support to the NVIDIA Inference provider for image inputs
## Test Plan
1. Run local [Llama 3.2 11b vision
instruct](https://build.nvidia.com/meta/llama-3.2-11b-vision-instruct?snippet_tab=Docker)
NIM
2. Start a stack, e.g. `llama stack run
llama_stack/templates/nvidia/run.yaml --env
NVIDIA_BASE_URL=http://localhost:8000`
3. Run image tests, e.g. `LLAMA_STACK_BASE_URL=http://localhost:8321
pytest -v tests/client-sdk/inference/test_inference.py
--vision-inference-model meta-llama/Llama-3.2-11B-Vision-Instruct -k
image`
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.
## What does this PR do?
See issue: #747 -- `uv` is just plain better. This PR does the bare
minimum of replacing `pip install` by `uv pip install` and ensuring `uv`
exists in the environment.
## Test Plan
First: create new conda, `uv pip install -e .` on `llama-stack` -- all
is good.
Next: run `llama stack build --template together` followed by `llama
stack run together` -- all good
Next: run `llama stack build --template together --image-name yoyo`
followed by `llama stack run together --image-name yoyo` -- all good
Next: fresh conda and `uv pip install -e .` and `llama stack build
--template together --image-type venv` -- all good.
Docker: `llama stack build --template together --image-type container`
works!