Commit graph

1079 commits

Author SHA1 Message Date
Matthew Farrellee
1a5c17a92f
align with CompletionResponseStreamChunk.delta as str (instead of TextDelta) (#900)
# What does this PR do?

fix type mismatch in /v1/inference/completion

## Test Plan

`llama stack run ./llama_stack/templates/nvidia/run.yaml`

`LLAMA_STACK_BASE_URL="http://localhost:8321" pytest -v
tests/client-sdk/inference/test_inference.py`

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-29 09:25:50 -08:00
Ashwin Bharambe
9f709387e2 Kill X-LlamaStack-{Client-Version, Provider-Data} from OpenAPI spec
ClientVersion: We don't need each SDK method to support this parameter
because you wouldn't be passing a different client version each time you
make an API call.

ProviderData: although in this case, you _could_ be passing different
API keys depending on which SDK call you make, it makes for a confusing
experience. It is best to initialize the LlamaStackClient with all the
keys which are then passed in each request.
2025-01-28 13:30:23 -08:00
Ashwin Bharambe
f2feb7d15c
Fix Chroma adapter (#893)
Chroma method had the wrong signature.

## Test Plan

Start Chroma: `chroma run --path /tmp/foo/chroma2 --host localhost
--port 6001`

Modify run.yaml to include Chroma server pointing to localhost:6001 and
run `llama stack run`

Then:

```bash
LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -s -v agents/test_agents.py -k rag
```

passes
2025-01-28 13:19:47 -08:00
Ashwin Bharambe
ec3ebb5bcf
Use ruamel.yaml to format the OpenAPI spec (#892)
Stainless ends up reformatting the YAML when we paste it in the Studio.
We cannot have that happen if we are going to ever partially automate
stainless config updates.

Try ruamel.yaml, specifically `block_seq_indent` to avoid that.
2025-01-28 11:27:40 -08:00
Ashwin Bharambe
41749944a5 Fix ResponseFormat import 2025-01-28 09:34:05 -08:00
Ashwin Bharambe
aee6237685 Small refactor for run_with_pty 2025-01-28 09:32:33 -08:00
Vladislav Bronzov
8332ea23ad
Add run win command for stack (#890)
# What does this PR do?

Add win platform run command for stack

- [x] Addresses issue (#issue)


## Test Plan

Please describe:
 - tests you ran to verify your changes with result summaries.
 - provide instructions so it can be reproduced.


## Sources

Please link relevant resources if necessary.
https://github.com/meta-llama/llama-stack/pull/889


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-28 08:04:28 -08:00
Vladislav Bronzov
09299e908e
Add windows support for build execution (#889)
# What does this PR do?

This PR implements windows platform support for build_container.sh
execution from terminal. Additionally, it resolves "no support for
Terminos and PTY for Window PC" issues.

- [x] Addresses issue (#issue)
Releates issues: https://github.com/meta-llama/llama-stack/issues/826,
https://github.com/meta-llama/llama-stack/issues/726

## Test Plan

Changes were tested manually by executing standard scripts from LLama
guide:
- llama stack build --template ollama --image-type container
- llama stack build --list-templates
- llama stack build

## Sources

Please link relevant resources if necessary.


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-28 07:41:41 -08:00
Ashwin Bharambe
d123e9d3d7 Update docs for RAG and improve CONTRIBUTING.md 2025-01-28 06:09:48 -08:00
Zhonglin Han
229f0d5f7c
Agent response format (#660)
# What does this PR do?

Add response format for agents structured output.

- [ ] Using structured output for agents (interior_design app as an
example) (#issue)
https://github.com/meta-llama/llama-stack-apps/issues/122


## Test Plan
E2E test plan with llama-stack-apps interior_design

Please describe:
 Test ran: 

 - provide instructions so it can be reproduced.
 Start your distro:
llama stack run llama_stack/templates/fireworks/run.yaml --env
FIREWORKS_API_KEY=<API_KEY>
 
Run api test:
```PYTHONPATH=. python examples/interior_design_assistant/api.py localhost 5000 examples/interior_design_assistant/resources/documents/ examples/interior_design_assistant/resources/images/fireplaces```


## Sources
Results: 
https://github.com/meta-llama/llama-stack-client-python/pull/72

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-28 05:05:38 -08:00
Justin Lee
e4865c3510
adding readme to docs folder for easier discoverability of notebooks … (#857)
as titled

<img width="454" alt="image"
src="https://github.com/user-attachments/assets/7579d1d2-06cd-48e4-9659-79ab1ec6a4c2"
/>
2025-01-28 04:58:46 -08:00
Sixian Yi
ba453c3487
Report generation minor fixes (#884)
# What does this PR do?

fixed report generation:
1) do not initialize a new client in report.py - instead get it from
pytest fixture
2) Add "provider" for "safety" and "agents" section
3) add logprobs functionality in "inference" section


## Test Plan

See the regenerated report 



## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-28 04:58:12 -08:00
Chris Khanoyan
5b0d778871
Update index.md (#888)
Fixing the bullets

# What does this PR do?

The bullets were not there as intended so I helped fix them. 

- [x] Addresses issue (#issue)

## Test Plan

Please describe:

Ran the test, and the bullets are there now to be consistent with the
page.

## Sources

N/A

## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-28 04:55:41 -08:00
snova-edwardm
aa65610e75
Sambanova - LlamaGuard (#886)
# What does this PR do?

- Fix loading SambaNovaImpl issue
- Add LlamaGuard model support for inference

## Test Plan

Run the following unit test scripts and results

### Embedding
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_embeddings.py --inference-model meta-llama/Llama-3.2-11B-Vision-Instruct --env SAMBANOVA_API_KEY={SAMBANOVA_API_KEY}
```
```
llama_stack/providers/tests/inference/test_embeddings.py::TestEmbeddings::test_embeddings[-sambanova] SKIPPED (This test is only applicable for embedding models)
llama_stack/providers/tests/inference/test_embeddings.py::TestEmbeddings::test_batch_embeddings[-sambanova] SKIPPED (This test is only applicable for embedding models)

=================================================================================================================== 2 skipped, 1 warning in 0.32s ===================================================================================================================
```

### Vision
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_vision_inference.py --inference-model meta-llama/Llama-3.2-11B-Vision-Instruct --env SAMBANOVA_API_KEY={SAMBANOVA_API_KEY}
```

```
llama_stack/providers/tests/inference/test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_non_streaming[-sambanova-image0-expected_strings0] PASSED
llama_stack/providers/tests/inference/test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_non_streaming[-sambanova-image1-expected_strings1] PASSED
llama_stack/providers/tests/inference/test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_streaming[-sambanova] PASSED

=================================================================================================================== 3 passed, 1 warning in 2.68s ====================================================================================================================
```

### Text
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_streaming --env SAMBANOVA_API_KEY={SAMBANOVA_API_KEY}
```

```
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_streaming[-sambanova] PASSED

=================================================================================================================== 1 passed, 1 warning in 0.46s ====================================================================================================================
```

```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_non_streaming --env SAMBANOVA_API_KEY={SAMBANOVA_API_KEY}
```

```
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_non_streaming[-sambanova] PASSED

=================================================================================================================== 1 passed, 1 warning in 0.48s ====================================================================================================================
```




## Before submitting

- [] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [Y] Ran pre-commit to handle lint / formatting issues.
- [Y] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [Y] Updated relevant documentation.
- [Y] Wrote necessary unit or integration tests.
2025-01-27 15:46:30 -08:00
Dinesh Yeduguru
3c1a2c3d66
Fix telemetry init (#885)
# What does this PR do?

When you re-initialize the library client in a notebook, we were seeing
this error:
```
Getting traces for session_id=5c8d1969-0957-49d2-b852-32cbb8ef8caf
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-11-d74bb6cdd3ab>](https://localhost:8080/#) in <cell line: 0>()
      7 agent_logs = []
      8 
----> 9 for span in client.telemetry.query_spans(
     10     attribute_filters=[
     11         {"key": "session_id", "op": "eq", "value": session_id},

10 frames
[/usr/local/lib/python3.11/dist-packages/llama_stack/providers/inline/telemetry/meta_reference/telemetry.py](https://localhost:8080/#) in query_traces(self, attribute_filters, limit, offset, order_by)
    246     ) -> QueryTracesResponse:
    247         return QueryTracesResponse(
--> 248             data=await self.trace_store.query_traces(
    249                 attribute_filters=attribute_filters,
    250                 limit=limit,

AttributeError: 'TelemetryAdapter' object has no attribute 'trace_store'
```

This is happening because the we were skipping some required steps for
the object state as part of the global _TRACE_PROVIDER check. This PR
moves the initialization of the object state out of the TRACE_PROVIDER
init.
2025-01-27 11:20:28 -08:00
Ashwin Bharambe
e5936a8df8
Update discriminator to have the correct mapping (#881)
See
https://swagger.io/docs/specification/v3_0/data-models/inheritance-and-polymorphism/#discriminator

When specifying discriminators, mapping must be specified unless the
value of the discriminator is the subtype itself (which in our case is
not.)

The changes in the YAML are self-explanatory.
2025-01-27 09:18:13 -08:00
Ashwin Bharambe
a6d20e0f53
Update documentation (#865)
Update docs variously
2025-01-27 09:17:51 -08:00
Ashwin Bharambe
891bf704eb
Ensure llama stack build --config <> --image-type <> works (#879)
Fix the issues brought up in
https://github.com/meta-llama/llama-stack/issues/870

Test all combinations of (conda, container) vs. (template, config)
combos.
2025-01-25 11:13:36 -08:00
Bakunga Bronson
7de46e40f9
Fixed multiple typos (#878)
# What does this PR do?

In short, provide a summary of what this PR does and why. Usually, the
relevant context should be present in a linked issue.

- [ ] Addresses issue (#issue)


## Test Plan

Please describe:
 - tests you ran to verify your changes with result summaries.
 - provide instructions so it can be reproduced.


## Sources

Please link relevant resources if necessary.


## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-24 14:45:43 -08:00
Bakunga Bronson
33113139e8
Fixed typo (#877)
# What does this PR do?

In short, provide a summary of what this PR does and why. Usually, the
relevant context should be present in a linked issue.

- [ ] Addresses issue (#issue)


## Test Plan

Please describe:
 - tests you ran to verify your changes with result summaries.
 - provide instructions so it can be reproduced.


## Sources

Please link relevant resources if necessary.


## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-24 13:16:00 -08:00
Hardik Shah
632e60439a
Fix report generation for url endpoints (#876)
Earlier, we would have some unknown magic to identify the path for
remote endpoints when testing client_sdk/tests.
Removed that and now you have to explicitly pass a path
2025-01-24 13:15:44 -08:00
Ashwin Bharambe
087a83f673 Bump key for faiss 2025-01-24 12:08:36 -08:00
Ashwin Bharambe
d111bad2f2
Update GH action so it correctly queries for test.pypi, etc. (#875)
The previous curl command was wrong and did not actually check for
version correctly (status code was always 200 regardless of what you
retrieved.)

Also added tagging latest. cc @wukaixingxp
2025-01-24 11:56:29 -08:00
Hardik Shah
2cebb24d3a
Update doc templates for running safety on self-hosted templates (#874) 2025-01-24 11:28:20 -08:00
Ashwin Bharambe
eaba6a550a Point to 0.1.0 release notes in docs 2025-01-24 10:00:16 -08:00
Ashwin Bharambe
05d73dd4fd Bump version to 0.1.0 2025-01-24 09:50:07 -08:00
Ashwin Bharambe
19521cb22e More doc updates 2025-01-24 09:22:15 -08:00
Ashwin Bharambe
2118f37350 Doc updates 2025-01-23 21:31:18 -08:00
Ashwin Bharambe
9351a4b2d7 Update documentation 2025-01-23 17:10:57 -08:00
ehhuang
2fefe8dacd
Update 'first RAG agent' in gettingstarted doc (#867)
# What does this PR do?

Fix documentation to reflect new API


## Test Plan
Before:

User> What are the top 5 topics that were explained? Only list succinct
bullet points.
inference> I'm ready to help, but we haven't discussed any topics yet!
This is the start of our conversation. What would you like to talk
about? I can summarize our discussion at the end if you'd like.


Run with the change, observe relevant response

<img width="1029" alt="image"
src="https://github.com/user-attachments/assets/a7dece3c-e8b4-4a60-9092-ba544c87dffd"
/>



## Sources

Please link relevant resources if necessary.


## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.

Co-authored-by: Eric Huang (AI Platform) <erichuang@fb.com>
2025-01-23 17:02:04 -08:00
Dinesh Yeduguru
cb11336886
remove logger handler only in notebook (#868)
remove logger handler only in notebook
2025-01-23 16:58:17 -08:00
Dinesh Yeduguru
ebffa15f40
update python sdk reference (#866)
# What does this PR do?

syncs changes from
https://github.com/stainless-sdks/llama-stack-python/blob/main/api.md
2025-01-23 16:04:06 -08:00
Dinesh Yeduguru
c570a708bf
update the client reference (#864)
# What does this PR do?

Syncs changes from
https://github.com/meta-llama/llama-stack-client-python/pull/96
2025-01-23 15:32:16 -08:00
Dinesh Yeduguru
a78f1fc70d
make default tool prompt format none in agent config (#863)
# What does this PR do?

Previously the tests hard coded the tool prompt format to be json which
will cause it to fail when using 3.2/3.3 family of models. This change
make the default to be none for the agent config and just remove the
specification in the tests.


## Test Plan
LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -v
tests/client-sdk/agents/test_agents.py
2025-01-23 14:44:59 -08:00
Hardik Shah
94ffaf468c
More updates to ReadTheDocs (#861)
Improve Contributing section
2025-01-23 12:50:38 -08:00
Dinesh Yeduguru
7df40da5fa
sync readme.md to index.md (#860)
# What does this PR do?

README has some new content that is being synced to index.md
2025-01-23 12:43:09 -08:00
Hardik Shah
a6a4270eef
Updates to ReadTheDocs (#859)
Move evals section to AI Agents section 
drop from top level and other minor fixes
2025-01-23 12:42:15 -08:00
Ashwin Bharambe
d78027f3b5 Move runpod provider to the correct directory
Also cleanup the test code to avoid skipping tests. Let failures be
known and public.
2025-01-23 12:25:12 -08:00
snova-edwardm
22dc684da6
Sambanova inference provider (#555)
# What does this PR do?

This PR adds SambaNova as one of the Provider

- Add SambaNova as a provider

## Test Plan
Test the functional command
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_embeddings.py llama_stack/providers/tests/inference/test_prompt_adapter.py llama_stack/providers/tests/inference/test_text_inference.py llama_stack/providers/tests/inference/test_vision_inference.py --env SAMBANOVA_API_KEY=<sambanova-api-key>
```

Test the distribution template:
```
# Docker
LLAMA_STACK_PORT=5001
docker run -it -p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
  llamastack/distribution-sambanova \
  --port $LLAMA_STACK_PORT \
  --env SAMBANOVA_API_KEY=$SAMBANOVA_API_KEY

# Conda
llama stack build --template sambanova --image-type conda
llama stack run ./run.yaml \
  --port $LLAMA_STACK_PORT \
  --env SAMBANOVA_API_KEY=$SAMBANOVA_API_KEY
```

## Source
[SambaNova API Documentation](https://cloud.sambanova.ai/apis)

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [Y] Ran pre-commit to handle lint / formatting issues.
- [Y] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [Y] Updated relevant documentation.
- [Y ] Wrote necessary unit or integration tests.

---------

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-01-23 12:20:28 -08:00
Marut Pandya
e2b5456e48
Add Runpod Provider + Distribution (#362)
Add Runpod as a inference provider for openAI compatible managed
endpoints.

Testing 
- Configured llama stack from scratch, set `remote::runpod` as a
inference provider.
- Added Runpod Endpoint URL and API key. 
- Started llama-stack server - llama stack run my-local-stack --port
3000
```
curl http://localhost:5000/inference/chat_completion \
-H "Content-Type: application/json" \
-d '{
	"model": "Llama3.1-8B-Instruct",
	"messages": [
		{"role": "system", "content": "You are a helpful assistant."},
		{"role": "user", "content": "Write me a 2 sentence poem about the moon"}
	],
	"sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
}' ```

---------

Signed-off-by: pandyamarut <pandyamarut@gmail.com>
2025-01-23 12:19:02 -08:00
Dinesh Yeduguru
86466b71a9
update docs for adding new API providers (#855)
# What does this PR do?

update docs for adding new API providers
![Screenshot 2025-01-23 at 11 21
42 AM](https://github.com/user-attachments/assets/0d4621d4-ef7e-43cd-9c4a-3e8e0b49242f)
2025-01-23 12:05:57 -08:00
Dinesh Yeduguru
d0be9288a3
Llama_Stack_Building_AI_Applications.ipynb -> getting_started.ipynb (#854)
Llama_Stack_Building_AI_Applications.ipynb -> getting_started.ipynb
2025-01-23 12:04:06 -08:00
Hardik Shah
a10cdc7cdb
Update README.md 2025-01-23 12:00:01 -08:00
Hardik Shah
74e933cbfd
More Updates to Read the Docs (#856) 2025-01-23 11:39:33 -08:00
Dinesh Yeduguru
8a686270e9
remove getting started notebook (#853)
# What does this PR do?

This notebook is no longer updated and we should be using
https://github.com/meta-llama/llama-stack/blob/main/docs/notebooks/Llama_Stack_Building_AI_Applications.ipynb
2025-01-23 10:09:09 -08:00
Hardik Shah
25a70ca4dc
Fixed distro documentation (#852)
More docs
2025-01-23 08:19:51 -08:00
raghotham
e44a1a68f1
Delete docs/to_situate directory (#851)
# What does this PR do?

No need for the cookbook now. Removing the folder

- [ ] Addresses issue (#issue)


## Test Plan

Please describe:
 - tests you ran to verify your changes with result summaries.
 - provide instructions so it can be reproduced.


## Sources

Please link relevant resources if necessary.


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-23 07:15:47 -08:00
Sixian Yi
bfbd773b54 remove test report 2025-01-23 01:06:39 -08:00
Sixian Yi
82a28f3a24
update doc for client-sdk testing (#849)
As title


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-23 00:17:16 -08:00
Ashwin Bharambe
3d14a3d46f Kill colons 2025-01-22 22:59:30 -08:00