llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-07-12 16:16:09 +00:00

Author	SHA1	Message	Date
Jash Gulabrai	40e2c97915	feat: Add Nvidia e2e beginner notebook and tool calling notebook (#1964) # What does this PR do? This PR contains two sets of notebooks that serve as reference material for developers getting started with Llama Stack using the NVIDIA Provider. Developers should be able to execute these notebooks end-to-end, pointing to their NeMo Microservices deployment. 1. `beginner_e2e/`: Notebook that walks through a beginner end-to-end workflow that covers creating datasets, running inference, customizing and evaluating models, and running safety checks. 2. `tool_calling/`: Notebook that is ported over from the [Data Flywheel & Tool Calling notebook](https://github.com/NVIDIA/GenerativeAIExamples/tree/main/nemo/data-flywheel) that is referenced in the NeMo Microservices docs. I updated the notebook to use the Llama Stack client wherever possible, and added relevant instructions. ## Test Plan - Both notebook folders contain READMEs with prerequisites. To manually test these notebooks, you'll need to have a deployment of the NeMo Microservices Platform and update the `config.py` file with your deployment's information. - I've run through these notebooks manually end-to-end to verify each step works. --------- Co-authored-by: Jash Gulabrai <jgulabrai@nvidia.com>	2025-06-16 11:29:01 -04:00
Ihar Hrachyshka	9e6561a1ec	chore: enable pyupgrade fixes (#1806) # What does this PR do? The goal of this PR is code base modernization. Schema reflection code needed a minor adjustment to handle UnionTypes and collections.abc.AsyncIterator. (Both are preferred in recent Python releases.) Note to reviewers: almost all changes here are automatically generated by pyupgrade. Some additional unused imports were cleaned up. The only change worthy of note can be found under `docs/openapi_generator` and `llama_stack/strong_typing/schema.py`, where reflection code was updated to deal with "newer" types. Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>	2025-05-01 14:23:50 -07:00
Jash Gulabrai	eab550f7d2	fix: Fix messages format in NVIDIA safety check request body (#2063) # What does this PR do? When running a Llama Stack server and invoking the `/v1/safety/run-shield` endpoint, the NVIDIA Guardrails endpoint in some cases errors with a `422: Unprocessable Entity` due to malformed input. For example, given a request body like: ``` { "model": "test", "messages": [ { "role": "user", "content": "You are stupid." } ] } ``` `convert_pydantic_to_json_value` converts the message to: ``` { "role": "user", "content": "You are stupid.", "context": null } ``` This causes NVIDIA Guardrails to return the error `HTTPError: 422 Client Error: Unprocessable Entity for url: http://nemo.test/v1/guardrail/checks`, because `context` shouldn't be included in the body. ## Test Plan I ran the Llama Stack server locally and manually verified that the endpoint now succeeds. ``` message = {"role": "user", "content": "You are stupid."} response = client.safety.run_shield(messages=[message], shield_id=shield_id, params={}) ``` Server logs: ``` 14:29:09.656 [START] /v1/safety/run-shield INFO: 127.0.0.1:54616 - "POST /v1/safety/run-shield HTTP/1.1" 200 OK 14:29:09.918 [END] /v1/safety/run-shield [StatusCode.OK] (262.26ms ``` Co-authored-by: Jash Gulabrai <jgulabrai@nvidia.com>	2025-04-30 18:01:28 +02:00
Jash Gulabrai	2ae1d7f4e6	docs: Add NVIDIA platform distro docs (#1971) # What does this PR do? Add NVIDIA platform docs that serve as a starting point for Llama Stack users and explain all supported microservices. ## Test Plan [Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed.] --------- Co-authored-by: Jash Gulabrai <jgulabrai@nvidia.com>	2025-04-17 05:54:30 -07:00
Jash Gulabrai	c1cb6aad11	feat: Add unit tests for NVIDIA safety (#1897) # What does this PR do? This PR adds unit tests for the NVIDIA Safety provider implementation. ## Test Plan [Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed.] 1. Ran `./scripts/unit-tests.sh tests/unit/providers/nvidia/test_safety.py` from the root of the project. Verified tests pass. ``` tests/unit/providers/nvidia/test_safety.py::TestNVIDIASafetyAdapter::test_init_nemo_guardrails Initializing NVIDIASafetyAdapter(http://nemo.test)... PASSED tests/unit/providers/nvidia/test_safety.py::TestNVIDIASafetyAdapter::test_init_nemo_guardrails_invalid_temperature Initializing NVIDIASafetyAdapter(http://nemo.test)... PASSED tests/unit/providers/nvidia/test_safety.py::TestNVIDIASafetyAdapter::test_register_shield_with_valid_id Initializing NVIDIASafetyAdapter(http://nemo.test)... PASSED tests/unit/providers/nvidia/test_safety.py::TestNVIDIASafetyAdapter::test_register_shield_without_id Initializing NVIDIASafetyAdapter(http://nemo.test)... PASSED tests/unit/providers/nvidia/test_safety.py::TestNVIDIASafetyAdapter::test_run_shield_allowed Initializing NVIDIASafetyAdapter(http://nemo.test)... PASSED tests/unit/providers/nvidia/test_safety.py::TestNVIDIASafetyAdapter::test_run_shield_blocked Initializing NVIDIASafetyAdapter(http://nemo.test)... PASSED tests/unit/providers/nvidia/test_safety.py::TestNVIDIASafetyAdapter::test_run_shield_http_error Initializing NVIDIASafetyAdapter(http://nemo.test)... PASSED tests/unit/providers/nvidia/test_safety.py::TestNVIDIASafetyAdapter::test_run_shield_not_found Initializing NVIDIASafetyAdapter(http://nemo.test)... PASSED ``` --------- Co-authored-by: Jash Gulabrai <jgulabrai@nvidia.com>	2025-04-11 11:49:55 -07:00
cdgamarose-nv	252a487085	feat: added nvidia as safety provider (#1248) # What does this PR do? Adds NVIDIA as a safety provider by interfacing with the NeMo Guardrails microservice. This enables checking the user’s input or the LLM’s output against input and output guardrails by using the `/v1/guardrails/checks` endpoint of the [guardrails API](https://developer.nvidia.com/docs/nemo-microservices/guardrails/source/guides/checks-guide.html). ## Test Plan Deploy the NeMo Guardrails service following the documentation: https://developer.nvidia.com/docs/nemo-microservices/guardrails/source/getting-started/deploy-docker.html ### Standalone: ```bash (venv) local-cdgamarose@a1u1g-rome-0153:~/llama-stack$ pytest -v -s llama_stack/providers/tests/safety/test_safety.py --providers inference=nvidia,safety=nvidia --safety-shield meta/llama-3.1-8b-instruct =================================================================================== test session starts =================================================================================== platform linux -- Python 3.10.12, pytest-8.3.4, pluggy-1.5.0 -- /localhome/local-cdgamarose/llama-stack/venv/bin/python3 cachedir: .pytest_cache metadata: {'Python': '3.10.12', 'Platform': 'Linux-5.15.0-122-generic-x86_64-with-glibc2.35', 'Packages': {'pytest': '8.3.4', 'pluggy': '1.5.0'}, 'Plugins': {'metadata': '3.1.1', 'asyncio': '0.25.3', 'anyio': '4.8.0', 'html': '4.1.1'}} rootdir: /localhome/local-cdgamarose/llama-stack configfile: pyproject.toml plugins: metadata-3.1.1, asyncio-0.25.3, anyio-4.8.0, html-4.1.1 asyncio: mode=strict, asyncio_default_fixture_loop_scope=None collected 2 items llama_stack/providers/tests/safety/test_safety.py::TestSafety::test_shield_list[--inference=nvidia:safety=nvidia] Initializing NVIDIASafetyAdapter(http://0.0.0.0:7331)... PASSED llama_stack/providers/tests/safety/test_safety.py::TestSafety::test_run_shield[--inference=nvidia:safety=nvidia] PASSED ============================================================================== 2 passed, 2 warnings in 4.78s ============================================================================== ``` ### Distribution: ``` llama stack run llama_stack/templates/nvidia/run-with-safety.yaml curl -v -X 'POST' "http://localhost:8321/v1/safety/run-shield" -H 'accept: application/json' -H 'Content-Type: application/json' -d '{"shield_id": "meta/llama-3.1-8b-instruct", "messages":[{"role": "user", "content": "you are stupid"}]}' {"violation":{"violation_level":"error","user_message":"Sorry I cannot do this.","metadata":{"self check input":{"status":"blocked"}}}} ``` --------- Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-03-17 14:39:23 -07:00
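
Commit eab550f7d2 in the log above describes a message gaining a `"context": null` field during conversion, which the Guardrails endpoint then rejects with a 422. The following is a minimal sketch of one way to strip such null fields before building a request body. It is not necessarily how the PR implements the fix, and `drop_none_fields` is a hypothetical helper name introduced only for illustration.

```python
from typing import Any


def drop_none_fields(value: Any) -> Any:
    """Recursively remove dict entries whose value is None.

    Hypothetical helper for illustration; not part of llama-stack.
    """
    if isinstance(value, dict):
        return {k: drop_none_fields(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_none_fields(item) for item in value]
    return value


# Shape of the converted message quoted in the commit message
# (JSON null corresponds to Python None).
converted = {"role": "user", "content": "You are stupid.", "context": None}

cleaned = drop_none_fields(converted)
print(cleaned)
```

Filtering recursively also covers the case where messages are nested inside a larger request body, for example under a `messages` list.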

6 commits