llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-06-27 18:50:41 +00:00

Author	SHA1	Message	Date
Ashwin Bharambe	76dcf47320	docs(mcp): add a few lines for how to specify Auth headers in MCP tools (#2336 )	2025-06-02 14:28:38 -07:00
Francisco Arceo	f328436831	feat: Enable ingestion of precomputed embeddings (#2317 )	2025-05-31 04:03:37 -06:00
Francisco Arceo	8e7ab146f8	feat: Adding support for customizing chunk context in RAG insertion and querying (#2134 ) # What does this PR do? This PR allows users to customize the template used for chunks when inserted into the context. Additionally, this enables metadata injection into the context of an LLM for RAG. This makes a naive and crude assumption that each chunk should include the metadata, which is obviously redundant when multiple chunks are returned from the same document. In order to remove any sort of duplication of chunks, we'd have to make much more significant changes so this is a reasonable first step that unblocks users requesting this enhancement in https://github.com/meta-llama/llama-stack/issues/1767. In the future, this can be extended to support citations. List of Changes: - `llama_stack/apis/tools/rag_tool.py` - Added `chunk_template` field in `RAGQueryConfig`. - Added `field_validator` to validate the `chunk_template` field in `RAGQueryConfig`. - Ensured the `chunk_template` field includes placeholders `{index}` and `{chunk.content}`. - Updated the `query` method to use the `chunk_template` for formatting chunk text content. - `llama_stack/providers/inline/tool_runtime/rag/memory.py` - Modified the `insert` method to pass `doc.metadata` for chunk creation. - Enhanced the `query` method to format results using `chunk_template` and exclude unnecessary metadata fields like `token_count`. - `llama_stack/providers/utils/memory/vector_store.py` - Updated `make_overlapped_chunks` to include metadata serialization and token count for both content and metadata. - Added error handling for metadata serialization issues. - `pyproject.toml` - Added `pydantic.field_validator` as a recognized `classmethod` decorator in the linting configuration. - `tests/integration/tool_runtime/test_rag_tool.py` - Refactored test assertions to separate `assert_valid_chunk_response` and `assert_valid_text_response`. 
- Added integration tests to validate `chunk_template` functionality with and without metadata inclusion. - Included a test case to ensure `chunk_template` validation errors are raised appropriately. - `tests/unit/rag/test_vector_store.py` - Added unit tests for `make_overlapped_chunks`, verifying chunk creation with overlapping tokens and metadata integrity. - Added tests to handle metadata serialization errors, ensuring proper exception handling. - `docs/_static/llama-stack-spec.html` - Added a new `chunk_template` field of type `string` with a default template for formatting retrieved chunks in RAGQueryConfig. - Updated the `required` fields to include `chunk_template`. - `docs/_static/llama-stack-spec.yaml` - Introduced `chunk_template` field with a default value for RAGQueryConfig. - Updated the required configuration list to include `chunk_template`. - `docs/source/building_applications/rag.md` - Documented the `chunk_template` configuration, explaining how to customize metadata formatting in RAG queries. - Added examples demonstrating the usage of the `chunk_template` field in RAG tool queries. - Highlighted default values for `RAG` agent configurations. # Resolves https://github.com/meta-llama/llama-stack/issues/1767 ## Test Plan Updated both `test_vector_store.py` and `test_rag_tool.py` and tested end-to-end with a script. I also tested the quickstart to enable this and specified this metadata: ```python document = RAGDocument( document_id="document_1", content=source, mime_type="text/html", metadata={"author": "Paul Graham", "title": "How to do great work"}, ) ``` Which produced the output below: ![Screenshot 2025-05-13 at 10 53 43 PM](https://github.com/user-attachments/assets/bb199d04-501e-4217-9c44-4699d43d5519) This highlights the usefulness of the additional metadata. Notice how the metadata is redundant for different chunks of the same document. I think we can update that in a subsequent PR. 
# Documentation I've added a brief comment about this in the documentation to outline this to users and updated the API documentation. --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-05-14 21:56:20 -04:00
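A minimal sketch of the `chunk_template` check described in the commit above, using only the standard library. The commit states that the template must contain the `{index}` and `{chunk.content}` placeholders; the `Chunk` class, the function names, and the extra `metadata` format argument are assumptions made for illustration and are not taken from the llama-stack code.

```python
from dataclasses import dataclass, field

# Placeholders the commit message says the template must include.
REQUIRED_PLACEHOLDERS = ("{index}", "{chunk.content}")


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieved chunk."""
    content: str
    metadata: dict = field(default_factory=dict)


def validate_chunk_template(template: str) -> str:
    """Reject templates that lack any required placeholder."""
    missing = [p for p in REQUIRED_PLACEHOLDERS if p not in template]
    if missing:
        raise ValueError(f"chunk_template is missing placeholders: {missing}")
    return template


def render_chunks(chunks, template: str) -> str:
    """Format each chunk with a 1-based index using the validated template."""
    validate_chunk_template(template)
    return "".join(
        template.format(index=i, chunk=c, metadata=c.metadata)
        for i, c in enumerate(chunks, start=1)
    )


if __name__ == "__main__":
    docs = [Chunk("How to do great work", {"author": "Paul Graham"})]
    print(render_chunks(docs, "Result {index}\nContent: {chunk.content}\nMetadata: {metadata}\n"))
```

Because `str.format` supports attribute access, `{chunk.content}` resolves against the object passed as `chunk`, so no custom template engine is needed for this sketch.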
Sébastien Han	43e623eea6	chore: remove last instances of code-interpreter provider (#2143 ) Was removed in https://github.com/meta-llama/llama-stack/pull/2087 Signed-off-by: Sébastien Han <seb@redhat.com>	2025-05-12 10:54:43 -07:00
Ashwin Bharambe	272d3359ee	fix: remove code interpeter implementation (#2087 ) # What does this PR do? The builtin implementation of code interpreter is not robust and has a really weak sandboxing shell (the `bubblewrap` container). Given the availability of better MCP code interpreter servers coming up, we should use them instead of baking an implementation into the Stack and expanding the vulnerability surface to the rest of the Stack. This PR only does the removal. We will add examples with how to integrate with MCPs in subsequent ones. ## Test Plan Existing tests.	2025-05-01 14:35:08 -07:00
Kevin Postlethwait	e0fa67c81c	docs: add examples for how to define RAG docs (#1981 ) # What does this PR do? Add examples for how to define RAGDocuments. Not sure if this is the best place for these docs. @raghotham Please advise ## Test Plan None, documentation [//]: # (## Documentation) Signed-off-by: Kevin <kpostlet@redhat.com>	2025-04-23 15:39:18 +02:00
Yuan Tang	4c6b7005fa	fix: Fix docs lint issues (#1993 ) # What does this PR do? This was not caught as part of the CI build: `dd62a2388c`. [This PR](https://github.com/meta-llama/llama-stack/pull/1354) was too old and didn't include the additional CI builds yet. Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-04-18 02:33:13 -04:00
AN YU (安宇)	dd62a2388c	docs: add notes to websearch tool and two extra example scripts (#1354 ) # What does this PR do? - Adds a note about unexpected Brave Search output appearing even when Tavily Search is called. This behavior is expected for now and is a work in progress https://github.com/meta-llama/llama-stack/issues/1229. The note aims to clear any confusion for new users. - Adds two example scripts demonstrating how to build an agent using: 1. WebSearch tool 2. WolframAlpha tool These examples provide new users with an instant understanding of how to integrate these tools. [//]: # (If resolving an issue, uncomment and update the line below) [//]: # (Closes #[issue-number]) ## Test Plan Tested these example scripts using following steps: step 1. `ollama run llama3.2:3b-instruct-fp16 --keepalive 60m` step 2. ``` export INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" export LLAMA_STACK_PORT=8321 ``` step 3: `llama stack run --image-type conda ~/llama-stack/llama_stack/templates/ollama/run.yaml` step 4: run the example script with your api keys. expected output: ![image](https://github.com/user-attachments/assets/308ddb17-a087-4cf2-8622-b085174ea0ab) ![image](https://github.com/user-attachments/assets/639f239f-8966-433d-943c-ee6b304c0d71) [//]: # (## Documentation)	2025-04-17 20:20:52 -04:00
Francisco Arceo	23a99a4b22	docs: Minor updates to docs to make them a little friendlier to new users (#1871 ) # What does this PR do? This PR modifies some of the docs to help them map to (1) the mental model of software engineers building AI models starting with RAG and then moving to Agents and (2) aligning the navbar somewhat closer to the diagram on the home page. ## Test Plan N/A Tested locally. # Documentation Take a look at the screenshots for before and after. ## Before ![Screenshot 2025-04-03 at 10 39 32 PM](https://github.com/user-attachments/assets/c4dc9998-3e46-43b0-8425-892c94ec3a6a) ## After ![Screenshot 2025-04-03 at 10 38 37 PM](https://github.com/user-attachments/assets/05670fcd-e56b-42dd-8af2-07b81f941d40) --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-04-04 08:10:35 -04:00
Francisco Arceo	19f504e9e2	docs: Updating docs to source from CONTRIBUTING.md (#1850 ) # What does this PR do? Another for https://github.com/meta-llama/llama-stack/issues/1815 This links the `CONTRIBUTING.md` file directly so that we don't have to maintain two different files. Also I updated the title for RAG under Building AI Applications. ## Changes A look at the Contributing page, as proof that it is sourced directly from the Markdown file. ![Screenshot 2025-04-01 at 12 43 51 AM](https://github.com/user-attachments/assets/f7021d29-eec3-44ad-a5b3-55c4480ea9ac) --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-04-01 14:50:04 +02:00
Francisco Arceo	d495922949	docs: Updated documentation and Sphinx configuration (#1845 ) # What does this PR do? The goal of this PR is to make the pages easier to navigate by surfacing the child pages on the navbar, updating some of the copy, moving some of the files around. Some changes: 1. Clarifying Titles 2. Restructuring "Distributions" more formally in its own page to be consistent with Providers and adding some clarity to the child pages to surface them and make them easier to navigate 3. Updated sphinx config to not collapse navigation by default 4. Updated copyright year to be calculated dynamically 5. Moved `docs/source/distributions/index.md` -> `docs/source/distributions/starting_llama_stack_server.md` Another for https://github.com/meta-llama/llama-stack/issues/1815 ## Test Plan Tested locally and pages build (screen shots for example). ## Documentation ### Before: ![Screenshot 2025-03-31 at 1 09 21 PM](https://github.com/user-attachments/assets/98e34f76-f0d9-4055-8e2c-441b1e7d8f6a) ### After: ![Screenshot 2025-03-31 at 1 08 52 PM](https://github.com/user-attachments/assets/dfb6b8ad-3a1d-46b6-8f54-0c553664093f) Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-03-31 13:08:05 -07:00
Anamika	d8a8a734b5	fix: update sink name for traces and metrics in LlamaStack 0.1.8 (#1836 ) # What does this PR do? This PR updates the sink name configuration for traces and metrics in LlamaStack to align with the latest changes introduced in version 0.1.8. Previously, when using the `otel` sink along with other sinks (like `console` and `sqlite`), the system threw a ValueError, with the message: ```shell Value error, 'otel' is not a valid TelemetrySink [type=value_error, input_value='console,otel,sqlite', input_type=str] For further information visit https://errors.pydantic.dev/2.10/v/value_error ``` ## Test Plan - Test 1: Ran the LlamaStack server with a configuration containing `console,otel,sqlite` as sinks. - Expected result: No errors related to invalid sink names. - Result: The system ran without throwing a `ValueError`. - Test 2: Verified that the `otel_trace`, `otel_metric` sink now works in combination with other sinks (`console`, `sqlite`). - Expected result: Telemetry data is correctly sent to all specified sinks without errors. - Result: All telemetry data was successfully sent to the specified sinks.	2025-03-29 10:09:08 -07:00
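A rough sketch of how a comma-separated sink list such as the one in the commit above could be validated. The sink values `console`, `sqlite`, `otel_trace`, and `otel_metric` come from the commit message; the enum member names and the `parse_sinks` helper are hypothetical and do not reproduce the actual llama-stack configuration code.

```python
from enum import Enum


class TelemetrySink(str, Enum):
    # Values taken from the commit message; member names are assumptions.
    CONSOLE = "console"
    SQLITE = "sqlite"
    OTEL_TRACE = "otel_trace"
    OTEL_METRIC = "otel_metric"


def parse_sinks(value: str) -> list:
    """Split a comma-separated string and map each entry to a TelemetrySink."""
    sinks = []
    for raw in value.split(","):
        name = raw.strip()
        try:
            sinks.append(TelemetrySink(name))
        except ValueError:
            # Mirror the style of the error quoted in the commit message.
            raise ValueError(f"{name!r} is not a valid TelemetrySink") from None
    return sinks


if __name__ == "__main__":
    print(parse_sinks("console,otel_trace,sqlite"))
```

Under these assumptions the old name `otel` is rejected, while `otel_trace` and `otel_metric` are accepted alongside `console` and `sqlite`.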
Hardik Shah	581e8ae562	fix: docker run with `--pull always` to fetch the latest image (#1733 ) As titled	2025-03-20 15:35:48 -07:00
ehhuang	ea6a4a14ce	feat(api): simplify client imports (#1687 ) # What does this PR do? closes #1554 ## Test Plan test_agents.py	2025-03-20 10:15:49 -07:00
ehhuang	b6b103a20d	docs: update for mcp tools (#1705 ) # What does this PR do? ## Test Plan read	2025-03-19 15:45:53 -07:00
Reid	8814111da1	docs: improve eval doc (#1501 ) Signed-off-by: reidliu <reid201711@gmail.com> Co-authored-by: reidliu <reid201711@gmail.com>	2025-03-10 11:38:07 -07:00
ehhuang	fbd47bb4b6	feat(agent): plain function as client tool (#1479 ) Summary: support added in https://github.com/meta-llama/llama-stack-client-python/pull/187 Test Plan: LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/integration/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B --text-model meta-llama/Llama-3.1-8B-Instruct	2025-03-07 11:10:07 -08:00
ehhuang	ca2910d27a	docs: update test_agents to use new Agent SDK API (#1402 ) # Summary: new Agent SDK API is added in https://github.com/meta-llama/llama-stack-client-python/pull/178 Update docs and test to reflect this. Closes https://github.com/meta-llama/llama-stack/issues/1365 # Test Plan: ```bash py.test -v -s --nbval-lax ./docs/getting_started.ipynb LLAMA_STACK_CONFIG=fireworks \ pytest -s -v tests/integration/agents/test_agents.py \ --safety-shield meta-llama/Llama-Guard-3-8B --text-model meta-llama/Llama-3.1-8B-Instruct ```	2025-03-06 15:21:12 -08:00
Xi Yan	564977c646	docs: update eval doc (#1453 ) # What does this PR do? - Update eval doc to reflect latest changes - Closes https://github.com/meta-llama/llama-stack/issues/1441 ## Test Plan read [//]: # (## Documentation)	2025-03-06 14:14:10 -08:00
Reid	db4ee7a9ff	docs: improve rag doc (#1411 ) Signed-off-by: reidliu <reid201711@gmail.com> Co-authored-by: reidliu <reid201711@gmail.com>	2025-03-06 14:03:52 -08:00
Reid	77d323c2f8	docs: fix typo (#1416 ) Signed-off-by: reidliu <reid201711@gmail.com> Co-authored-by: reidliu <reid201711@gmail.com>	2025-03-05 10:02:32 -08:00
Daniele Martinoli	fb998683e0	fix: Agent uses the first configured vector_db_id when documents are provided (#1276 ) # What does this PR do? The agent API allows to query multiple DBs using the `vector_db_ids` argument of the `rag` tool: ```py toolgroups=[ { "name": "builtin::rag", "args": {"vector_db_ids": [vector_db_id]}, } ], ``` This means that multiple DBs can be used to compose an aggregated context by executing the query on each of them. When documents are passed to the next agent turn, there is no explicit way to configure the vector DB where the embeddings will be ingested. In such cases, we can assume that: - if any `vector_db_ids` is given, we use the first one (it probably makes sense to assume that it's the only one in the list, otherwise we should loop on all the given DBs to have a consistent ingestion) - if no `vector_db_ids` is given, we can use the current logic to generate a default DB using the default provider. If multiple providers are defined, the API will fail as expected: the user has to provide details on where to ingest the documents. (Closes #1270) ## Test Plan The issue description details how to replicate the problem. [//]: # (## Documentation) --------- Signed-off-by: Daniele Martinoli <dmartino@redhat.com>	2025-03-04 21:44:13 -08:00
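The selection rule described in the commit above can be sketched as follows. This is only an illustration of the stated behaviour (use the first configured `vector_db_ids` entry, otherwise fall back to a default DB); the function name and the `create_default` callback are hypothetical and not part of the llama-stack API.

```python
from typing import Callable, Optional, Sequence


def select_ingestion_vector_db(
    vector_db_ids: Optional[Sequence[str]],
    create_default: Callable[[], str],
) -> str:
    """Pick the vector DB that turn documents are ingested into.

    - If any vector_db_ids are configured, use the first one.
    - Otherwise defer to the caller-supplied fallback, which stands in for
      the existing "generate a default DB" logic.
    """
    if vector_db_ids:
        return vector_db_ids[0]
    return create_default()


if __name__ == "__main__":
    print(select_ingestion_vector_db(["docs_db", "wiki_db"], lambda: "default_db"))
    print(select_ingestion_vector_db([], lambda: "default_db"))
```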
Xi Yan	e9a37bad63	chore: rename task_config to benchmark_config (#1397 ) # What does this PR do? - This was missed from previous deprecation: https://github.com/meta-llama/llama-stack/pull/1186 - Part of https://github.com/meta-llama/llama-stack/issues/1396 [//]: # (If resolving an issue, uncomment and update the line below) [//]: # (Closes #[issue-number]) ## Test Plan ``` pytest -v -s --nbval-lax ./llama-stack/docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb ``` [//]: # (## Documentation)	2025-03-04 12:44:04 -08:00
Reid	a9a7b11326	docs: update agent_execution_loop example code (#1350 ) # What does this PR do? - add missing `import` - add client definition - update `attachments` to `documents`, `40da0d0e76` Signed-off-by: reidliu <reid201711@gmail.com> Co-authored-by: reidliu <reid201711@gmail.com>	2025-03-02 18:27:43 -08:00
Reid	e84f1a5549	fix: fix pre-commit check issue (#1349 ) # What does this PR do? [Provide a short summary of what this PR does and why. Link to relevant issues if applicable.] For `3805604220` ``` Fixing docs/source/building_applications/tools.md check for added large files..............................................Passed fix end of files.........................................................Passed Insert license in comments...............................................Passed ruff.....................................................................Passed ruff-format..............................................................Passed blacken-docs.............................................................Passed uv-lock..................................................................Passed uv-export................................................................Passed mypy.....................................................................Passed Distribution Template Codegen............................................Passed pre-commit hook(s) made changes. If you are seeing this message in CI, reproduce locally with: `pre-commit run --all-files`. To run `pre-commit` as part of git workflow, use `pre-commit install`. All changes made by hooks: diff --git a/docs/source/building_applications/tools.md b/docs/source/building_applications/tools.md index afffbc8..5a569ff 100644 --- a/docs/source/building_applications/tools.md +++ b/docs/source/building_applications/tools.md @@ -127,7 +127,7 @@ MCP tools require: ## Adding Custom Tools -When you want to use tools other than the built-in tools, you can implement a python function and decorate it with `@client_tool`. +When you want to use tools other than the built-in tools, you can implement a python function and decorate it with `@client_tool`. To define a custom tool, you need to use the `@client_tool` decorator. ```python Error: Process completed with exit code 1. 
``` [//]: # (If resolving an issue, uncomment and update the line below) [//]: # (Closes #[issue-number]) ## Test Plan [Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed.] [//]: # (## Documentation) Signed-off-by: reidliu <reid201711@gmail.com> Co-authored-by: reidliu <reid201711@gmail.com>	2025-03-02 11:13:17 -05:00
ehhuang	52977e56a8	docs: update Agent documentation (#1333 ) Summary: - [new] Agent concepts (session, turn) - [new] how to write custom tools - [new] non-streaming API and how to get outputs - [update] remaining `memory` -> `rag` rename - [new] note importance of `instructions` Test Plan: read	2025-03-01 22:34:52 -08:00
ehhuang	c8a20b8ed0	feat: allow specifying specific tool within toolgroup (#1239 ) Summary: E.g. `builtin::rag::knowledge_search` Test Plan: ``` LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/agents/ --safety-shield meta-llama/Llama-Guard-3-8B ```	2025-02-26 14:07:05 -08:00
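One way to read a reference such as `builtin::rag::knowledge_search` from the commit above is as a toolgroup id followed by an optional tool name. The sketch below assumes that toolgroup ids always have exactly two `::`-separated segments; that assumption and the helper name are illustrative only and are not confirmed by the commit.

```python
from typing import Optional, Tuple


def split_tool_reference(ref: str) -> Tuple[str, Optional[str]]:
    """Split 'group::name::tool' into ('group::name', 'tool').

    A two-segment reference is treated as a whole toolgroup, so the
    returned tool name is None.
    """
    parts = ref.split("::")
    if len(parts) == 3:
        return "::".join(parts[:2]), parts[2]
    return ref, None


if __name__ == "__main__":
    print(split_tool_reference("builtin::rag::knowledge_search"))
    print(split_tool_reference("builtin::rag"))
```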
Kevin Cogan	561295af76	docs: Fix Links, Add Podman Instructions, Vector DB Unregister, and Example Script (#1129 ) # What does this PR do? This PR improves the documentation in several ways: - Fixed incorrect link in `tools.md` to ensure all references point to the correct resources. - Added instructions for running the `code-interpreter` agent in a Podman container, helping users configure and execute the tool in containerized environments. - Introduced an unregister command for single and multiple vector databases, making it easier to manage vector DBs. - Provided a simple example script for using the `code-interpreter` agent, giving users a practical reference for implementation. These updates enhance the clarity, usability, and completeness of the documentation. ## Test Plan The following steps were performed to verify the accuracy of the changes: 1. Validated all fixed links by checking their destinations to ensure correctness. 2. Ran the `code-interpreter` agent in a Podman container following the new instructions to confirm functionality. 3. Executed the vector database unregister commands and verified that both single and multiple databases were correctly removed. 4. Tested the new example script for `code-interpreter`, ensuring it runs without errors. All changes were reviewed and tested successfully, improving the documentation's accuracy and ease of use.	2025-02-20 13:52:14 -08:00
Xi Yan	8b655e3cd2	fix!: update eval-tasks -> benchmarks (#1032 ) # What does this PR do? - Update `/eval-tasks` to `/benchmarks` - ⚠️ Remove differentiation between `app` vs. `benchmark` eval task config. Now we only have `BenchmarkConfig`. The overloaded `benchmark` is confusing and does not add any value. Backward compatibility is being kept as the "type" is not being used anywhere. ## Test Plan - This change is backward compatible - Run notebook test with ``` pytest -v -s --nbval-lax ./docs/getting_started.ipynb pytest -v -s --nbval-lax ./docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb ``` <img width="846" alt="image" src="https://github.com/user-attachments/assets/d2fc06a7-593a-444f-bc1f-10ab9b0c843d" /> --------- Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com> Signed-off-by: Ben Browning <bbrownin@redhat.com> Signed-off-by: Sébastien Han <seb@redhat.com> Signed-off-by: reidliu <reid201711@gmail.com> Co-authored-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com> Co-authored-by: Ben Browning <ben324@gmail.com> Co-authored-by: Sébastien Han <seb@redhat.com> Co-authored-by: Reid <61492567+reidliu41@users.noreply.github.com> Co-authored-by: reidliu <reid201711@gmail.com> Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>	2025-02-13 16:40:58 -08:00
Michael Clifford	076213165c	docs: update rag.md example code to prevent errors (#1009 )	2025-02-10 09:25:30 -05:00
ehhuang	af15426ad7	doc: getting started notebook (#996 ) # What does this PR do? Fix link	2025-02-06 17:30:21 -08:00
Yuan Tang	34ab7a3b6c	Fix precommit check after moving to ruff (#927 ) Lint check in main branch is failing. This fixes the lint check after we moved to ruff in https://github.com/meta-llama/llama-stack/pull/921. We need to move to a `ruff.toml` file as well as fixing and ignoring some additional checks. Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-02-02 06:46:45 -08:00
Ashwin Bharambe	d123e9d3d7	Update docs for RAG and improve CONTRIBUTING.md	2025-01-28 06:09:48 -08:00
Bakunga Bronson	7de46e40f9	Fixed multiple typos (#878 )	2025-01-24 14:45:43 -08:00
Ashwin Bharambe	19521cb22e	More doc updates	2025-01-24 09:22:15 -08:00
Ashwin Bharambe	2118f37350	Doc updates	2025-01-23 21:31:18 -08:00
Hardik Shah	a6a4270eef	Updates to ReadTheDocs (#859 ) Move evals section to AI Agents section drop from top level and other minor fixes	2025-01-23 12:42:15 -08:00
Dinesh Yeduguru	d0be9288a3	Llama_Stack_Building_AI_Applications.ipynb -> getting_started.ipynb (#854 ) Llama_Stack_Building_AI_Applications.ipynb -> getting_started.ipynb	2025-01-23 12:04:06 -08:00
Hardik Shah	74e933cbfd	More Updates to Read the Docs (#856 )	2025-01-23 11:39:33 -08:00
Dinesh Yeduguru	28012c51bb	update docs for tools and telemetry (#846 ) # What does this PR do? Added a new Tools doc describing how to use tools and updated the main building agents doc to point to the tools doc. Also updated telemetry doc. https://llama-stack.readthedocs.io/en/tools-doc/building_applications/tools.html	2025-01-22 22:50:29 -08:00
Ashwin Bharambe	03ac84a829	Update default port from 5000 -> 8321	2025-01-16 15:26:48 -08:00
Hardik Shah	a51c8b4efc	Convert `SamplingParams.strategy` to a union (#767 ) # What does this PR do? Cleans up how we provide sampling params. Earlier, strategy was an enum and all params (top_p, temperature, top_k) across all strategies were grouped. We now have a strategy union object with each strategy (greedy, top_p, top_k) having its corresponding params. Earlier, ``` class SamplingParams: strategy: enum () top_p, temperature, top_k and other params ``` However, the `strategy` field was not being used in any providers, making it confusing to know the exact sampling behavior purely based on the params, since you could pass temperature, top_p, top_k and how the provider would interpret those would not be clear. Hence we introduced a union where the strategy and relevant params are all clubbed together to avoid this confusion. Have updated all providers, tests, notebooks, readme and other places where sampling params was being used to use the new format. ## Test Plan `pytest llama_stack/providers/tests/inference/groq/test_groq_utils.py` // inference on ollama, fireworks and together `with-proxy pytest -v -s -k "ollama" --inference-model="meta-llama/Llama-3.1-8B-Instruct" llama_stack/providers/tests/inference/test_text_inference.py ` // agents on fireworks `pytest -v -s -k 'fireworks and create_agent' --inference-model="meta-llama/Llama-3.1-8B-Instruct" llama_stack/providers/tests/agents/test_agents.py --safety-shield="meta-llama/Llama-Guard-3-8B"` ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [X] Ran pre-commit to handle lint / formatting issues. - [X] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [X] Updated relevant documentation. - [X] Wrote necessary unit or integration tests. --------- Co-authored-by: Hardik Shah <hjshah@fb.com>	2025-01-15 05:38:51 -08:00
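The "strategy union" idea in the commit above can be illustrated with plain dataclasses. Only the three strategy kinds (greedy, top_p, top_k) and the grouping of each strategy with its own params come from the commit message; the class names, field defaults, and the assumption that `temperature` belongs to the top_p strategy are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass(frozen=True)
class GreedyStrategy:
    type: str = "greedy"  # no tunable params


@dataclass(frozen=True)
class TopPStrategy:
    temperature: float = 1.0  # assumed to live with top_p sampling
    top_p: float = 0.95
    type: str = "top_p"


@dataclass(frozen=True)
class TopKStrategy:
    top_k: int = 40
    type: str = "top_k"


# Each strategy carries only the params that are meaningful for it.
SamplingStrategy = Union[GreedyStrategy, TopPStrategy, TopKStrategy]


@dataclass
class SamplingParams:
    strategy: SamplingStrategy = field(default_factory=GreedyStrategy)


def describe(params: SamplingParams) -> str:
    """Show how a provider could branch on the strategy type unambiguously."""
    s = params.strategy
    if isinstance(s, GreedyStrategy):
        return "greedy"
    if isinstance(s, TopPStrategy):
        return f"top_p(temperature={s.temperature}, top_p={s.top_p})"
    return f"top_k(top_k={s.top_k})"


if __name__ == "__main__":
    print(describe(SamplingParams()))
    print(describe(SamplingParams(TopPStrategy(temperature=0.7, top_p=0.9))))
```

The point of the design is that a combination such as greedy decoding with a `top_k` value cannot be expressed at all, which removes the ambiguity the commit message describes.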
Ashwin Bharambe	2e5bfcd42a	Update Telemetry API so OpenAPI generation can work (#640 ) We cannot use recursive types because not only does our OpenAPI generator not like them, even if it did, it is not easy for all client languages to automatically construct proper APIs (especially considering garbage collection) around them. For now, we can return a `Dict[str, SpanWithStatus]` instead of `SpanWithChildren` and rely on the client to reconstruct the tree. Also fixed a super subtle issue with the OpenAPI generation process (monkey-patching of json_schema_type wasn't working because of import reordering.)	2024-12-16 13:00:14 -08:00
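A hedged sketch of the client-side step mentioned in the commit above: rebuilding a span tree from a flat `Dict[str, SpanWithStatus]`. The field names (`span_id`, `parent_span_id`, `name`, `status`) and the `SpanNode` wrapper are assumptions for illustration; the commit does not spell out the actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SpanWithStatus:
    """Hypothetical flat span record; the real schema may differ."""
    span_id: str
    name: str
    parent_span_id: Optional[str] = None
    status: Optional[str] = None


@dataclass
class SpanNode:
    span: SpanWithStatus
    children: List["SpanNode"] = field(default_factory=list)


def build_span_tree(spans: Dict[str, SpanWithStatus]) -> List[SpanNode]:
    """Link each span to its parent and return the root nodes."""
    nodes = {span_id: SpanNode(span) for span_id, span in spans.items()}
    roots: List[SpanNode] = []
    for node in nodes.values():
        parent = nodes.get(node.span.parent_span_id)
        if parent is None:
            # No parent id, or the parent is not in the response.
            roots.append(node)
        else:
            parent.children.append(node)
    return roots


if __name__ == "__main__":
    flat = {
        "a": SpanWithStatus("a", "turn"),
        "b": SpanWithStatus("b", "inference", parent_span_id="a"),
        "c": SpanWithStatus("c", "tool_call", parent_span_id="a"),
    }
    for root in build_span_tree(flat):
        print(root.span.name, [child.span.name for child in root.children])
```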
Xi Yan	815f4af6cf	add colab notebook & update docs (#619 ) # What does this PR do? - add notebooks - restructure docs ## Test Plan <img width="1201" alt="image" src="https://github.com/user-attachments/assets/3f9a09d9-b5ec-406c-b44b-e896e340d209" /> <img width="1202" alt="image" src="https://github.com/user-attachments/assets/fdc1173f-2417-4ad6-845e-4f265fc40a31" /> <img width="1201" alt="image" src="https://github.com/user-attachments/assets/b1e4e2a8-acf6-4ef2-a2fc-00d26cf32359" /> ## Sources Please link relevant resources if necessary. ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [ ] Ran pre-commit to handle lint / formatting issues. - [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests.	2024-12-13 19:15:15 -08:00
Yuri Shkuro	397ee71c14	Fix Jaeger instructions (#580 ) # What does this PR do? - A follow-up for #572 - The command in the original PR did not run - Remove `--set` command unnecessary since Jaeger 2.1.0 ## Test Plan ``` $ docker run --rm --name jaeger \ -p 16686:16686 -p 4318:4318 \ jaegertracing/jaeger:2.1.0 2024/12/07 19:07:13 application version: git-commit=65cff3c30823ea20d3dc48bae39d5685ae307da5, git-version=v2.1.0, build-date=2024-12-06T21:17:15Z ... ``` ## Before submitting - [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [ ] Ran pre-commit to handle lint / formatting issues. - [x] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [x] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests. Signed-off-by: Yuri Shkuro <github@ysh.us>	2024-12-08 15:29:53 -08:00
Ashwin Bharambe	1274fa4c0d	Add documentations for building applications and with some content for agentic loop	2024-12-08 14:56:37 -08:00
Dinesh Yeduguru	cb9e9048e7	add telemetry docs (#572 ) Add an experimental section and telemetry doc ![Screenshot 2024-12-05 at 10 22 51 AM](https://github.com/user-attachments/assets/b8b7a982-b800-4069-a4d0-481fc300b336) --------- Co-authored-by: Adrian Cole <64215+codefromthecrypt@users.noreply.github.com>	2024-12-06 10:17:11 -08:00
Ashwin Bharambe	fc8ace50af	Add stub for Building Applications	2024-11-22 23:05:17 -08:00

48 commits