Commit graph

12 commits

Author SHA1 Message Date
m-misiura
87d209d6ef Squashed commit of the following:
commit a95d2b15b83057e194cf69e57a03deeeeeadd7c2
Author: m-misiura <mmisiura@redhat.com>
Date:   Mon Mar 24 14:33:50 2025 +0000

    🚧 working on the config file so that it inherits from pydantic base models

commit 0546379f817e37bca030247b48c72ce84899a766
Author: m-misiura <mmisiura@redhat.com>
Date:   Mon Mar 24 09:14:31 2025 +0000

    🚧 dealing with ruff checks

commit 8abe39ee4cb4b8fb77c7252342c4809fa6ddc432
Author: m-misiura <mmisiura@redhat.com>
Date:   Mon Mar 24 09:03:18 2025 +0000

    🚧 dealing with mypy errors in `base.py`

commit 045f833e79c9a25af3d46af6c8896da91a0e6e62
Author: m-misiura <mmisiura@redhat.com>
Date:   Fri Mar 21 17:31:25 2025 +0000

    🚧 fixing mypy errors in content.py

commit a9c1ee4e92ad1b5db89039317555cd983edbde65
Author: m-misiura <mmisiura@redhat.com>
Date:   Fri Mar 21 17:09:02 2025 +0000

    🚧 fixing mypy errors in chat.py

commit 69e8ddc2f8a4e13cecbab30272fd7d685d7864ec
Author: m-misiura <mmisiura@redhat.com>
Date:   Fri Mar 21 16:57:28 2025 +0000

    🚧 fixing mypy errors

commit 56739d69a145c55335ac2859ecbe5b43d556e3b1
Author: m-misiura <mmisiura@redhat.com>
Date:   Fri Mar 21 14:01:03 2025 +0000

    🚧 fixing mypy errors in `__init__.py`

commit 4d2e3b55c4102ed75d997c8189847bbc5524cb2c
Author: m-misiura <mmisiura@redhat.com>
Date:   Fri Mar 21 12:58:06 2025 +0000

    🚧 ensuring routing_tables.py does not fail the CI

commit c0cc7b4b09ef50d5ec95fdb0a916c7ed228bf366
Author: m-misiura <mmisiura@redhat.com>
Date:   Fri Mar 21 12:09:24 2025 +0000

    🐛 fixing linter problems

commit 115a50211b604feb4106275204fe7f863da865f6
Author: m-misiura <mmisiura@redhat.com>
Date:   Fri Mar 21 11:47:04 2025 +0000

    🐛 fixing ruff errors

commit 29b5bfaabc77a35ea036b57f75fded711228dbbf
Author: m-misiura <mmisiura@redhat.com>
Date:   Fri Mar 21 11:33:31 2025 +0000

    🎨 automatic ruff fixes

commit 7c5a334c7d4649c2fc297993f89791c1e5643e5b
Author: m-misiura <mmisiura@redhat.com>
Date:   Fri Mar 21 11:15:02 2025 +0000

    Squashed commit of the following:

    commit e671aae5bcd4ea57d601ee73c9e3adf5e223e830
    Merge: b0dd9a4f 9114bef4
    Author: Mac Misiura <82826099+m-misiura@users.noreply.github.com>
    Date:   Fri Mar 21 09:45:08 2025 +0000

        Merge branch 'meta-llama:main' into feat_fms_remote_safety_provider

    commit b0dd9a4f746b0c8c54d1189d381a7ff8e51c812c
    Author: m-misiura <mmisiura@redhat.com>
    Date:   Fri Mar 21 09:27:21 2025 +0000

        📝 updated `provider_id`

    commit 4c8906c1a4e960968b93251d09d5e5735db15026
    Author: m-misiura <mmisiura@redhat.com>
    Date:   Thu Mar 20 16:54:46 2025 +0000

        📝 renaming from `fms` to `trustyai_fms`

    commit 4c0b62abc51b02143b5c818f2d30e1a1fee9e4f3
    Merge: bb842d69 54035825
    Author: m-misiura <mmisiura@redhat.com>
    Date:   Thu Mar 20 16:35:52 2025 +0000

        Merge branch 'main' into feat_fms_remote_safety_provider

    commit bb842d69548df256927465792e0cd107a267d2a0
    Author: m-misiura <mmisiura@redhat.com>
    Date:   Wed Mar 19 15:03:17 2025 +0000

        added a better way of handling params from the configs

    commit 58b6beabf0994849ac50317ed00b748596e8961d
    Merge: a22cf36c 7c044845
    Author: m-misiura <mmisiura@redhat.com>
    Date:   Wed Mar 19 09:19:57 2025 +0000

        Merge main into feat_fms_remote_safety_provider, resolve conflicts by keeping main version

    commit a22cf36c8757f74ed656c1310a4be6b288bf923a
    Author: m-misiura <mmisiura@redhat.com>
    Date:   Wed Mar 5 16:17:46 2025 +0000

        🎉 added a new remote safety provider compatible with FMS Orchestrator API and Detectors API

        Signed-off-by: m-misiura <mmisiura@redhat.com>
2025-03-24 14:46:03 +00:00
cdgamarose-nv
252a487085
feat: added nvidia as safety provider (#1248)
# What does this PR do?
Adds nvidia as a safety provider by interfacing with the nemo guardrails
microservice. This enables checking the user's input or the LLM's output
against input and output guardrails by using the `/v1/guardrails/checks`
endpoint of the [guardrails API](https://developer.nvidia.com/docs/nemo-microservices/guardrails/source/guides/checks-guide.html).

## Test Plan
Deploy nemo guardrails service following the documentation:
https://developer.nvidia.com/docs/nemo-microservices/guardrails/source/getting-started/deploy-docker.html

### Standalone:
```bash
(venv) local-cdgamarose@a1u1g-rome-0153:~/llama-stack$ pytest -v -s llama_stack/providers/tests/safety/test_safety.py --providers inference=nvidia,safety=nvidia --safety-shield meta/llama-3.1-8b-instruct

=================================================================================== test session starts ===================================================================================
platform linux -- Python 3.10.12, pytest-8.3.4, pluggy-1.5.0 -- /localhome/local-cdgamarose/llama-stack/venv/bin/python3
cachedir: .pytest_cache
metadata: {'Python': '3.10.12', 'Platform': 'Linux-5.15.0-122-generic-x86_64-with-glibc2.35', 'Packages': {'pytest': '8.3.4', 'pluggy': '1.5.0'}, 'Plugins': {'metadata': '3.1.1', 'asyncio': '0.25.3', 'anyio': '4.8.0', 'html': '4.1.1'}}
rootdir: /localhome/local-cdgamarose/llama-stack
configfile: pyproject.toml
plugins: metadata-3.1.1, asyncio-0.25.3, anyio-4.8.0, html-4.1.1
asyncio: mode=strict, asyncio_default_fixture_loop_scope=None
collected 2 items

llama_stack/providers/tests/safety/test_safety.py::TestSafety::test_shield_list[--inference=nvidia:safety=nvidia] Initializing NVIDIASafetyAdapter(http://0.0.0.0:7331)...
PASSED
llama_stack/providers/tests/safety/test_safety.py::TestSafety::test_run_shield[--inference=nvidia:safety=nvidia] PASSED

============================================================================== 2 passed, 2 warnings in 4.78s ==============================================================================

```
### Distribution:
```
llama stack run llama_stack/templates/nvidia/run-with-safety.yaml
curl -v -X 'POST' "http://localhost:8321/v1/safety/run-shield" -H 'accept: application/json' -H 'Content-Type: application/json' -d '{"shield_id": "meta/llama-3.1-8b-instruct", "messages":[{"role": "user", "content": "you are stupid"}]}'
{"violation":{"violation_level":"error","user_message":"Sorry I cannot do this.","metadata":{"self check input":{"status":"blocked"}}}}
```
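For reference, the same request expressed in Python — a minimal sketch mirroring the curl call above (the base URL, shield id, and payload are taken from that example):

```python
import requests

# Mirrors the curl call above: run the registered shield against a user message.
response = requests.post(
    "http://localhost:8321/v1/safety/run-shield",
    headers={"accept": "application/json", "Content-Type": "application/json"},
    json={
        "shield_id": "meta/llama-3.1-8b-instruct",
        "messages": [{"role": "user", "content": "you are stupid"}],
    },
)
# Expected: a violation payload with "self check input": {"status": "blocked"}
print(response.json())
```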

---------

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-03-17 14:39:23 -07:00
Ashwin Bharambe
d072b5fa0c
test: add unit test to ensure all config types are instantiable (#1601) 2025-03-12 22:29:58 -07:00
Ashwin Bharambe
46b0a404e8
chore: remove straggler references to llama-models (#1345)
Straggler references cleanup
2025-03-01 14:26:03 -08:00
Ashwin Bharambe
314ee09ae3
chore: move all Llama Stack types from llama-models to llama-stack (#1098)
llama-models should have extremely minimal cruft. Its sole purpose
should be didactic -- show the simplest implementation of the llama
models and document the prompt formats, etc.

This PR is the complement to
https://github.com/meta-llama/llama-models/pull/279

## Test Plan

Ensure all `llama` CLI `model` sub-commands work:

```bash
llama model list
llama model download --model-id ...
llama model prompt-format -m ...
```

Ran tests:
```bash
cd tests/client-sdk
LLAMA_STACK_CONFIG=fireworks pytest -s -v inference/
LLAMA_STACK_CONFIG=fireworks pytest -s -v vector_io/
LLAMA_STACK_CONFIG=fireworks pytest -s -v agents/
```

Create a fresh venv `uv venv && source .venv/bin/activate` and run
`llama stack build --template fireworks --image-type venv` followed by
`llama stack run together --image-type venv` <-- the server runs

Also checked that the OpenAPI generator can run and there is no change
in the generated files as a result.

```bash
cd docs/openapi_generator
sh run_openapi_generator.sh
```
2025-02-14 09:10:59 -08:00
Sébastien Han
e4a1579e63
build: format codebase imports using ruff linter (#1028)
# What does this PR do?

- Configured the ruff linter to automatically fix import sorting issues.
- Set --exit-non-zero-on-fix to ensure a non-zero exit code when fixes are applied.
- Enabled the 'I' (isort) rule selection to focus on import-related linting rules.
- Ran the linter and formatted all codebase imports accordingly (see the sketch below).
- Removed the black dependency from the "dev" group since we use ruff.
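
A rough sketch of the resulting check, reproduced from a helper script (the exact `pyproject.toml` rule selection may differ; the flags are the ones described above):

```python
import subprocess

# Illustrative reproduction of the described setup: apply import-sorting
# fixes (ruff's 'I' rules) and exit non-zero when any fix was applied,
# so CI fails until the sorted imports are committed.
subprocess.run(
    ["ruff", "check", "--select", "I", "--fix", "--exit-non-zero-on-fix", "."],
    check=True,
)
```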

Signed-off-by: Sébastien Han <seb@redhat.com>

2025-02-13 10:06:21 -08:00
Yuan Tang
34ab7a3b6c
Fix precommit check after moving to ruff (#927)
The lint check in the main branch is failing. This fixes it after we
moved to ruff in https://github.com/meta-llama/llama-stack/pull/921. We
need to move to a `ruff.toml` file, as well as fix and ignore some
additional checks.

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-02-02 06:46:45 -08:00
Xi Yan
3c72c034e6
[remove import *] clean up import *'s (#689)
# What does this PR do?

- as titled, cleaning up `import *`'s (see the sketch after this list)
- upgrade tests to make them more robust to bad model outputs
- remove import *'s in llama_stack/apis/* (skip __init__ modules)

- run `sh run_openapi_generator.sh`; no types are affected
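
To illustrate the cleanup, a before/after sketch (module path and imported names are illustrative, not the exact diff):

```python
# Before: a wildcard import hides where names come from and drags in
# everything the module exports.
# from llama_stack.apis.safety import *

# After: explicit imports keep the namespace auditable (names illustrative).
from llama_stack.apis.safety import RunShieldResponse, SafetyViolation
```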

## Test Plan

### Providers Tests

**agents**
```
pytest -v -s llama_stack/providers/tests/agents/test_agents.py -m "together" --safety-shield meta-llama/Llama-Guard-3-8B --inference-model meta-llama/Llama-3.1-405B-Instruct-FP8
```

**inference**
```bash
# meta-reference
torchrun $CONDA_PREFIX/bin/pytest -v -s -k "meta_reference" --inference-model="meta-llama/Llama-3.1-8B-Instruct" ./llama_stack/providers/tests/inference/test_text_inference.py
torchrun $CONDA_PREFIX/bin/pytest -v -s -k "meta_reference" --inference-model="meta-llama/Llama-3.2-11B-Vision-Instruct" ./llama_stack/providers/tests/inference/test_vision_inference.py

# together
pytest -v -s -k "together" --inference-model="meta-llama/Llama-3.1-8B-Instruct" ./llama_stack/providers/tests/inference/test_text_inference.py
pytest -v -s -k "together" --inference-model="meta-llama/Llama-3.2-11B-Vision-Instruct" ./llama_stack/providers/tests/inference/test_vision_inference.py

pytest ./llama_stack/providers/tests/inference/test_prompt_adapter.py 
```

**safety**
```
pytest -v -s llama_stack/providers/tests/safety/test_safety.py -m together --safety-shield meta-llama/Llama-Guard-3-8B
```

**memory**
```
pytest -v -s llama_stack/providers/tests/memory/test_memory.py -m "sentence_transformers" --env EMBEDDING_DIMENSION=384
```

**scoring**
```
pytest -v -s -m llm_as_judge_scoring_together_inference llama_stack/providers/tests/scoring/test_scoring.py --judge-model meta-llama/Llama-3.2-3B-Instruct
pytest -v -s -m basic_scoring_together_inference llama_stack/providers/tests/scoring/test_scoring.py
pytest -v -s -m braintrust_scoring_together_inference llama_stack/providers/tests/scoring/test_scoring.py
```


**datasetio**
```
pytest -v -s -m localfs llama_stack/providers/tests/datasetio/test_datasetio.py
pytest -v -s -m huggingface llama_stack/providers/tests/datasetio/test_datasetio.py
```


**eval**
```
pytest -v -s -m meta_reference_eval_together_inference llama_stack/providers/tests/eval/test_eval.py
pytest -v -s -m meta_reference_eval_together_inference_huggingface_datasetio llama_stack/providers/tests/eval/test_eval.py
```

### Client-SDK Tests
```
LLAMA_STACK_BASE_URL=http://localhost:5000 pytest -v ./tests/client-sdk
```

### llama-stack-apps
```
PORT=5000
LOCALHOST=localhost

python -m examples.agents.hello $LOCALHOST $PORT
python -m examples.agents.inflation $LOCALHOST $PORT
python -m examples.agents.podcast_transcript $LOCALHOST $PORT
python -m examples.agents.rag_as_attachments $LOCALHOST $PORT
python -m examples.agents.rag_with_memory_bank $LOCALHOST $PORT
python -m examples.safety.llama_guard_demo_mm $LOCALHOST $PORT
python -m examples.agents.e2e_loop_with_custom_tools $LOCALHOST $PORT

# Vision model
python -m examples.interior_design_assistant.app
python -m examples.agent_store.app $LOCALHOST $PORT
```

### CLI
```
which llama
llama model prompt-format -m Llama3.2-11B-Vision-Instruct
llama model list
llama stack list-apis
llama stack list-providers inference

llama stack build --template ollama --image-type conda
```

### Distributions Tests
**ollama**
```
llama stack build --template ollama --image-type conda
ollama run llama3.2:1b-instruct-fp16
llama stack run ./llama_stack/templates/ollama/run.yaml --env INFERENCE_MODEL=meta-llama/Llama-3.2-1B-Instruct
```

**fireworks**
```
llama stack build --template fireworks --image-type conda
llama stack run ./llama_stack/templates/fireworks/run.yaml
```

**together**
```
llama stack build --template together --image-type conda
llama stack run ./llama_stack/templates/together/run.yaml
```

**tgi**
```
llama stack run ./llama_stack/templates/tgi/run.yaml --env TGI_URL=http://0.0.0.0:5009 --env INFERENCE_MODEL=meta-llama/Llama-3.1-8B-Instruct
```

2024-12-27 15:45:44 -08:00
Ashwin Bharambe
983d6ce2df
Remove the "ShieldType" concept (#430)
# What does this PR do?

This PR kills the notion of "ShieldType". The impetus for this is the
realization:

> Why is the keyword llama-guard appearing so many times everywhere,
sometimes with hyphens, sometimes with underscores?

Now that we have a notion of "provider-specific resource identifiers"
and "user-specific aliases" for them, and given that this already works
for models ("Llama3.1-8B-Instruct" <> "fireworks/llama-v3p1-..."), we can
follow the same rules for Shields.

So each Safety provider can define its own notion of registered
identifiers. This already happens correctly with Bedrock; we just
generalize it for Llama Guard, Prompt Guard, etc.

For Llama Guard, we further simplify by just adopting the underlying
model name itself as the identifier! No confusion necessary.

While doing this, I noticed a bug in our DistributionRegistry where we
weren't scoping identifiers by type. Fixed.
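
A hypothetical sketch of what registration looks like once the identifier is just the model name (the client method and parameter names are assumptions, not the exact API):

```python
from llama_stack_client import LlamaStackClient  # assumed client package

client = LlamaStackClient(base_url="http://localhost:8321")

# With ShieldType gone, the shield's identifier is simply the underlying
# model name -- no separate llama-guard/llama_guard keyword to keep in sync.
client.shields.register(
    shield_id="meta-llama/Llama-Guard-3-8B",  # identifier == model name
    provider_id="llama-guard",                # illustrative provider id
)
```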

## Feature/Issue validation/testing/test plan

Ran (inference, safety, memory, agents) tests with ollama and fireworks
providers.
2024-11-12 12:37:24 -08:00
Dinesh Yeduguru
d800a16acd
Resource oriented design for shields (#399)
* init

* working bedrock tests

* bedrock test for inference fixes

* use env vars for bedrock guardrail vars

* add register in meta reference

* use correct shield impl in meta ref

* dont add together fixture

* right naming

* minor updates

* improved registration flow

* address feedback

---------

Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-08 12:16:11 -08:00
Ashwin Bharambe
064d2a5287
Remove the safety adapter for Together; we can just use "meta-reference" (#387) 2024-11-06 17:36:57 -08:00
Ashwin Bharambe
994732e2e0
impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00