Commit graph

1151 commits

Author SHA1 Message Date
Yuan Tang
a79a083e39
Fix broken pgvector provider and memory leaks (#947)
This PR fixes the broken pgvector provider as well as wraps all cursor
object creations with context manager to ensure that they get properly
closed to avoid potential memory leaks.

```
> pytest llama_stack/providers/tests/vector_io/test_vector_io.py   -m "pgvector" --env EMBEDDING_DIMENSION=384 --env PGVECTOR_PORT=7432 --env PGVECTOR_DB=db --env PGVECTOR_USER=user --env PGVECTOR_PASSWORD=pass   -v -s --tb=short --disable-warnings

llama_stack/providers/tests/vector_io/test_vector_io.py::TestVectorIO::test_banks_list[-pgvector] PASSED
llama_stack/providers/tests/vector_io/test_vector_io.py::TestVectorIO::test_banks_register[-pgvector] PASSED
llama_stack/providers/tests/vector_io/test_vector_io.py::TestVectorIO::test_query_documents[-pgvector] The scores are: [0.8168284974053789, 0.8080469278964486, 0.8050996198466661]
PASSED
```

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-05 09:32:05 -08:00
Ihar Hrachyshka
5c8e35a9e2
docs, tests: replace datasets.rst with memory_optimizations.rst (#968)
datasets.rst was removed from torchtune repo.

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>

# What does this PR do?

Replace a missing 404 document with another one that exists. (Removed it
from
the list when memory_optimizations.rst was already pulled.)


## Test Plan

Please describe:
 - tests you ran to verify your changes with result summaries.
 - provide instructions so it can be reproduced.


## Sources

Please link relevant resources if necessary.


## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-02-05 11:25:56 -05:00
Ihar Hrachyshka
529708215c
[docs] Make RAG example self-contained (#962)
Before the patch, the example could not be executed verbatim without
copy-pasting client function from the inference example. I think it's
better to have examples self-contained, especially in a getting started
guide.

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>

# What does this PR do?

See above.

## Test Plan

Confirmed example can now be executed verbatim.

## Sources

Please link relevant resources if necessary.


## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-02-04 16:22:50 -08:00
Ashwin Bharambe
474c4bdd7a
Make a couple properties optional (#963) 2025-02-04 16:20:24 -08:00
Ihar Hrachyshka
0cbb3e401c
docs: miscellaneous small fixes (#961)
- **[docs] Fix misc typos and formatting issues in intro docs**
- **[docs]: Export variables (e.g. INFERENCE_MODEL) in getting_started**
- **[docs] Show that `llama-stack-client configure` will ask for api
key**

# What does this PR do?

Miscellaneous fixes in the documentation; not worth reporting an issue.

## Test Plan

No code changes. Addressed issues spotted when walking through the
guide.
Confirmed locally.

## Sources

Please link relevant resources if necessary.

## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.

---------

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-02-04 15:31:30 -08:00
Nathan Weinberg
b84ab6c6b8
github: issue templates automatically apply relevant label (#956)
# What does this PR do?
the `bug` and `enhancement` labels will be automatically applied to bugs
and feature requests that are opened

## Test Plan
N/A

## Sources

https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [x] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-02-04 14:44:03 -08:00
Bill Murdock
b0dec797a0
Add Podman instructions to Quick Start (#957)
Podman is a popular alternative to Docker, so it would be nice to make
it clear that it can also be used to deploy the container for the
server. The instructions are a little different because you have to
create the directory (unlike with Docker which makes the directory for
you).

# What does this PR do?

- [ ] Add Podman instructions to Quick Start

## Test Plan

Documentation only.


## Sources

I tried it out and it worked.

## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-02-04 14:37:02 -08:00
Ashwin Bharambe
d67401c644 Several documentation fixes and fix link to API reference 2025-02-04 14:00:43 -08:00
Charlie Doern
26aef50bc5
if client.initialize fails, the example should exit (#954)
# What does this PR do?

the example script can gracefully exit if the boolean returned from
initialize is used properly

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-02-04 13:54:21 -08:00
Ashwin Bharambe
981bb52b59 Quote the token properly 2025-02-04 11:44:29 -08:00
Ashwin Bharambe
5005939494 Use a secret again for the workflow 2025-02-04 11:42:47 -08:00
Ashwin Bharambe
7392daddee Try a new webhook 2025-02-04 11:36:54 -08:00
Ashwin Bharambe
2987fb37c3 fixes? 2025-02-04 11:34:27 -08:00
Ashwin Bharambe
766b11f1f8 Debug workflow 2025-02-04 11:09:16 -08:00
Ashwin Bharambe
5233666143 Debug workflow 2025-02-04 11:07:04 -08:00
Ashwin Bharambe
b35930a7e5 rename 2025-02-04 11:02:45 -08:00
Ashwin Bharambe
ea538e4b32 Add a workflow to trigger readthedocs rebuild 2025-02-04 11:02:06 -08:00
Ashwin Bharambe
b17277b06a Fix the OpenAPI HTML 2025-02-04 10:38:49 -08:00
ehhuang
c9ab72fa82
Support sys_prompt behavior in inference (#937)
# What does this PR do?

The current default system prompt for llama3.2 tends to overindex on
tool calling and doesn't work well when the prompt does not require tool
calling.

This PR adds an option to override the default system prompt, and
organizes tool-related configs into a new config object.

- [ ] Addresses issue (#issue)


## Test Plan

python -m unittest
llama_stack.providers.tests.inference.test_prompt_adapter


## Sources

Please link relevant resources if necessary.


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with
[ReviewStack](https://reviewstack.dev/meta-llama/llama-stack/pull/937).
* #938
* __->__ #937
2025-02-03 23:35:16 -08:00
Xi Yan
62cd3c391e notebook point to github as source of truth 2025-02-03 15:08:25 -08:00
Ashwin Bharambe
753a1aa7bc Update colab link to be pointing back to github source 2025-02-03 15:00:21 -08:00
Ashwin Bharambe
aefd5bb619 Test notebook update 2025-02-03 14:59:06 -08:00
Xi Yan
a251566f92
[docs] typescript sdk readme (#946)
# What does this PR do?

- Update readme for typescript SDK. 

## Sources

Please link relevant resources if necessary.


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-02-03 14:30:42 -08:00
Nathan Weinberg
7a72082cdd
fix: formatting for ollama note in Quick Start doc (#945)
# What does this PR do?
Fixes formatting for Ollama note found here:
https://llama-stack.readthedocs.io/en/latest/getting_started/index.html#start-ollama

- [ ] Addresses issue (#issue)

## Test Plan
Ran local docs build as described
[here](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md#building-the-documentation)

## Sources
N/A

## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-02-03 14:13:57 -08:00
Ashwin Bharambe
f98efe68c9
Misc fixes (#944)
- Make sure torch + torchvision go together as deps, otherwise bad stuff
happens
- Add a pre-commit for requirements.txt
2025-02-03 14:08:47 -08:00
Nathan Weinberg
0f14378135
fix: broken "core concepts" link in docs website (#940)
# What does this PR do?
The `core concepts` link on [this
page](https://llama-stack.readthedocs.io/en/latest/contributing/new_api_provider.html)
is currently broken - this PR fixes that link

## Test Plan
Ran local docs build as described
[here](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md#building-the-documentation)

## Sources
N/A

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [x] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-02-03 13:46:34 -08:00
Nathan Weinberg
1e36721686
fix: broken link in Quick Start doc (#943)
# What does this PR do?
Ollama download link is broken on this page:
https://llama-stack.readthedocs.io/en/latest/getting_started/index.html

## Test Plan
N/A

## Sources
https://ollama.com/docs/installation ==> 404
https://ollama.com/download

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [x] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-02-03 13:45:35 -08:00
Nathan Weinberg
fd367e20c8
github: ignore non-hidden python virtual environments (#939)
# What does this PR do?

the llama-stack repo is already ignoring hidden python `.venv/`
directories but not `venv/`

- [ ] Addresses issue (#issue)

## Test Plan
N/A

## Sources
N/A

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [x] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-02-03 11:53:05 -08:00
Yuan Tang
7558678b8c
Fix uv pip install timeout issue for PyTorch (#929)
This fixes the following timeout issue when installing PyTorch via uv.
Also see reference: https://github.com/astral-sh/uv/pull/1694,
https://github.com/astral-sh/uv/issues/1549

```
Installing pip dependencies
Using Python 3.10.16 environment at: /home/yutang/.conda/envs/distribution-myenv
  × Failed to download and build `antlr4-python3-runtime==4.9.3`
  ├─▶ Failed to extract archive
  ├─▶ failed to unpack
  │   `/home/yutang/.cache/uv/sdists-v7/.tmpDWX4iK/antlr4-python3-runtime-4.9.3/src/antlr4/ListTokenSource.py`
  ├─▶ failed to unpack
  │   `antlr4-python3-runtime-4.9.3/src/antlr4/ListTokenSource.py` into
  │   `/home/yutang/.cache/uv/sdists-v7/.tmpDWX4iK/antlr4-python3-runtime-4.9.3/src/antlr4/ListTokenSource.py`
  ├─▶ error decoding response body
  ├─▶ request or response body error
  ╰─▶ operation timed out
  help: `antlr4-python3-runtime` (v4.9.3) was included because `torchtune`
        (v0.5.0) depends on `omegaconf` (v2.3.0) which depends on
        `antlr4-python3-runtime>=4.9.dev0, <4.10.dev0`
Failed to build target distribution-myenv with return code 1
```

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-03 06:39:35 -08:00
Yuan Tang
e370a77752
Add issue template config with docs and Discord links (#930)
This is similar to what we are doing for other projects, e.g.
https://github.com/argoproj/argo-workflows/tree/main/.github/ISSUE_TEMPLATE

The benefits is to give people more options before submitting a bug
report or feature request on GitHub.

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-03 06:39:00 -08:00
Yuan Tang
83a51c7bfb
Properly close PGVector DB connection during shutdown() (#931)
The connection to the DB was not closed during shutdown.

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-02 21:23:13 -08:00
Ashwin Bharambe
ccf0cbb903 Update release pointer 2025-02-02 12:11:57 -08:00
Ashwin Bharambe
1bb74d95ad Delete CI workflows from here since they have moved to llama-stack-ops 2025-02-02 10:22:48 -08:00
Jeff Tang
587753da2f
LocalInferenceImpl update for LS 0.1 (#911)
# What does this PR do?

To work with the updated iOSCalendarAssistantWithLocalInf
[here](https://github.com/meta-llama/llama-stack-apps/compare/ios_local).

In short, provide a summary of what this PR does and why. Usually, the
relevant context should be present in a linked issue.

- [ ] Addresses issue (#issue)


## Test Plan

Please describe:
 - tests you ran to verify your changes with result summaries.
 - provide instructions so it can be reproduced.


## Sources

Please link relevant resources if necessary.


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-02-02 09:49:40 -08:00
Ashwin Bharambe
7fdbd5b642 Add NBVAL skips to the getting started notebook 2025-02-02 07:53:07 -08:00
Ashwin Bharambe
dfd6461498 kill old readme 2025-02-02 06:49:01 -08:00
Yuan Tang
34ab7a3b6c
Fix precommit check after moving to ruff (#927)
Lint check in main branch is failing. This fixes the lint check after we
moved to ruff in https://github.com/meta-llama/llama-stack/pull/921. We
need to move to a `ruff.toml` file as well as fixing and ignoring some
additional checks.

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-02 06:46:45 -08:00
Yuan Tang
4773092dd1
Fix UBI9 image build when installing Python packages via uv (#926)
This was missed in https://github.com/meta-llama/llama-stack/pull/921. 

cc @ashwinb @hardikjshah

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-01 19:14:29 -08:00
github-actions[bot]
3b8d6578d0 Bump version to 0.1.1 2025-02-02 02:16:26 +00:00
Ashwin Bharambe
75abe48cd0 completions can randomly blurt out something else 2025-02-01 16:01:21 -08:00
Ashwin Bharambe
b03e093e80 Add a COPY option for copying source files into docker 2025-02-01 15:35:38 -08:00
Ashwin Bharambe
942e8b96ac Fix uv pip uninstall 2025-02-01 11:42:28 -08:00
Matthew Farrellee
e21c8b6d80
add image support to NVIDIA inference provider (#907)
# What does this PR do?

add support to the NVIDIA Inference provider for image inputs


## Test Plan

1. Run local [Llama 3.2 11b vision
instruct](https://build.nvidia.com/meta/llama-3.2-11b-vision-instruct?snippet_tab=Docker)
NIM
2. Start a stack, e.g. `llama stack run
llama_stack/templates/nvidia/run.yaml --env
NVIDIA_BASE_URL=http://localhost:8000`
3. Run image tests, e.g. `LLAMA_STACK_BASE_URL=http://localhost:8321
pytest -v tests/client-sdk/inference/test_inference.py
--vision-inference-model meta-llama/Llama-3.2-11B-Vision-Instruct -k
image`


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.
2025-02-01 09:02:27 -08:00
Ashwin Bharambe
439d0da84c More pyproject shenanigans 2025-02-01 08:51:45 -08:00
Ashwin Bharambe
1ac0d8306b Remove test parameterization for safety tests, too much noise 2025-02-01 08:38:44 -08:00
Ashwin Bharambe
8f9ff545a4 Update LICENSE format 2025-02-01 08:34:25 -08:00
Ashwin Bharambe
3af9be744d Make package finding automatic 2025-02-01 08:09:39 -08:00
Ashwin Bharambe
5836ab2454 Add uv.lock 2025-01-31 22:40:53 -08:00
Ashwin Bharambe
6344b2429b Kill requirements.txt 2025-01-31 22:38:58 -08:00
Ashwin Bharambe
5b1e69e58e
Use uv pip install instead of pip install (#921)
## What does this PR do? 

See issue: #747 -- `uv` is just plain better. This PR does the bare
minimum of replacing `pip install` by `uv pip install` and ensuring `uv`
exists in the environment.

## Test Plan 

First: create new conda, `uv pip install -e .` on `llama-stack` -- all
is good.
Next: run `llama stack build --template together` followed by `llama
stack run together` -- all good
Next: run `llama stack build --template together --image-name yoyo`
followed by `llama stack run together --image-name yoyo` -- all good
Next: fresh conda and `uv pip install -e .` and `llama stack build
--template together --image-type venv` -- all good.

Docker: `llama stack build --template together --image-type container`
works!
2025-01-31 22:29:41 -08:00