Commit graph

188 commits

Author SHA1 Message Date
Russell Bryant
204eb6d810
docker: Check for selinux before using --security-opt (#167)
Before using `--security-opt label=disable`, check that SELinux is
enabled. Otherwise, the option is not relevant.

This fixes errors on Mac.

Closes #166

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-10-02 10:37:41 -07:00
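The check described above boils down to probing for SELinux before appending the flag. A minimal Python sketch of the idea; the helper name and the `selinuxenabled`-based probe are assumptions, not the repository's actual code:

```
import shutil
import subprocess

def is_selinux_enabled() -> bool:
    # `selinuxenabled` exits 0 when SELinux is enabled; the binary is
    # simply absent on systems without SELinux (e.g. macOS).
    exe = shutil.which("selinuxenabled")
    return exe is not None and subprocess.run([exe], check=False).returncode == 0

cmd = ["docker", "run"]
if is_selinux_enabled():
    # Only add the option where SELinux would otherwise block mounts.
    cmd += ["--security-opt", "label=disable"]
```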
Ashwin Bharambe
227b69e6e6 Fix sample memory impl 2024-10-02 10:13:09 -07:00
Ashwin Bharambe
335dea849a fix sample impls 2024-10-02 10:10:31 -07:00
Ashwin Bharambe
bf0d111c53 Fix build script 2024-10-02 10:04:23 -07:00
Ashwin Bharambe
4a75d922a9 Make Llama Guard 1B the default 2024-10-02 09:48:26 -07:00
Ashwin Bharambe
cc5029a716 Add special case for prompt guard 2024-10-02 08:43:12 -07:00
Ashwin Bharambe
eb2d8a31a5
Add a RoutableProvider protocol, support for multiple routing keys (#163)
* Update configure.py to use multiple routing keys for safety
* Refactor distribution/datatypes into providers/datatypes
* Cleanup
2024-09-30 17:30:21 -07:00
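Roughly, a routable provider declares which routing keys it can serve and a router dispatches on those keys. This sketch uses an assumed method name and table shape, not the repository's actual protocol:

```
from typing import List, Protocol

class RoutableProvider(Protocol):
    # Assumed method: called at startup with every routing key mapped
    # to this provider, so it can reject keys it cannot serve.
    async def validate_routing_keys(self, routing_keys: List[str]) -> None:
        ...

# A router then dispatches on the key, e.g. a model or shield name:
# {"Llama-Guard-3-8B": inference_provider, "llama_guard": safety_provider}
```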
Xi Yan
73decb3781 re-build from name 2024-09-30 16:22:52 -07:00
Xi Yan
4897bf2f85 allow --name to re-build from config 2024-09-30 16:18:12 -07:00
Xi Yan
d28c3dfe0f
[CLI] simplify docker run (#159)
* bake run.yaml inside docker, simplify run

* add docker template examples

* delete generated Dockerfile

* unique deps

* clean up debug

* default entrypoint

* address comments, update output msg

* update msg

* build output msg

* configure msg

* unique special_deps

* remove quotes in configure
2024-09-30 15:04:04 -07:00
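"Baking run.yaml inside docker" with a default entrypoint roughly means emitting Dockerfile lines like these at build time; the paths and server module below are assumptions, not taken from the repo:

```
# Hypothetical build-time snippet: copy the generated run config into
# the image and make the stack server the default entrypoint.
dockerfile_lines = [
    "COPY run.yaml /app/run.yaml",
    'ENTRYPOINT ["python", "-m", "llama_stack.distribution.server",'
    ' "--yaml-config", "/app/run.yaml"]',
]
with open("Dockerfile", "a") as f:
    f.write("\n".join(dockerfile_lines) + "\n")
```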
Russell Bryant
8db49de961
docker: Install in editable mode for dev purposes (#160)
While rebuilding a stack using the `docker` image type and having
`LLAMA_STACK_DIR` set so it installs `llama_stack` from my local
source, I noticed that once built, it just used the image build cache
and didn't pull in changes to my source.

1. Install in editable mode (`pip install -e`) for dev purposes.

2. Mount the source into the container for `configure` and `run` so
   that the editable install works.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-09-30 11:56:31 -07:00
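The two pieces of the change, sketched in Python; `LLAMA_STACK_DIR` is named in the commit, but the install command and mount target here are assumptions:

```
import os

src = os.environ.get("LLAMA_STACK_DIR")

# 1. Editable install so source edits are not frozen into the image.
install_line = (
    "RUN pip install -e /app/llama-stack-source" if src
    else "RUN pip install llama-stack"
)

# 2. Mount the live source for `configure` and `run`, so the editable
#    install inside the container sees local changes.
mount_args = ["-v", f"{src}:/app/llama-stack-source"] if src else []
```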
Russell Bryant
cb36be320f
Fix podman+selinux compatibility (#132)
When I ran `llama stack configure` for my `docker`-based stack on my
system using podman + SELinux (CentOS Stream 9), the `podman run`
command failed because SELinux blocked access to the volume mount.

As a simple fix, disable SELinux label checking.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-09-29 20:19:44 -07:00
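For context, podman offers two common ways around SELinux volume-mount denials; the commit takes the first. A sketch, with the image name and paths as placeholders:

```
# What the fix does: disable SELinux label checking for the container.
cmd = ["podman", "run", "--security-opt", "label=disable",
       "-v", "/host/stack:/app/stack", "llamastack:latest"]

# Alternative not taken here: have podman relabel the volume content
# instead (":z" for shared, ":Z" for private).
alt = ["podman", "run", "-v", "/host/stack:/app/stack:Z", "llamastack:latest"]
```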
moritalous
2bd785354d
fix broken bedrock inference provider (#151) 2024-09-29 20:17:58 -07:00
Byung Chun Kim
2f096ca509
accept the model ID, not the model itself. (#153) 2024-09-29 20:16:50 -07:00
Ashwin Bharambe
5bf679cab6
Pull (extract) provider data from the provider instead of pushing from the top (#148) 2024-09-29 20:00:51 -07:00
Xi Yan
f6a6598d1a
[bugfix] fix #146 (#147)
* more robust image type

* lint
2024-09-28 17:47:00 -07:00
Xi Yan
6a8c2ae1df
[CLI] remove dependency on CONDA_PREFIX in CLI (#144)
* remove dependency on CONDA_PREFIX in CLI

* lint

* typo

* more robust
2024-09-28 16:46:47 -07:00
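One way to locate a conda environment without relying on `CONDA_PREFIX` being exported is to ask conda directly; a sketch under that assumption (the commit's actual mechanism may differ):

```
import json
import subprocess

def find_conda_env_path(env_name: str):
    # `conda env list --json` returns {"envs": ["/path/to/base", ...]}.
    out = subprocess.run(["conda", "env", "list", "--json"],
                         capture_output=True, text=True, check=True)
    for env_path in json.loads(out.stdout)["envs"]:
        if env_path.rstrip("/").endswith(env_name):
            return env_path
    return None
```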
Ashwin Bharambe
fe460ba103 Avoid importing a lot of stuff 2024-09-28 16:06:10 -07:00
Xi Yan
4ae8c63a2b pre-commit lint 2024-09-28 16:04:41 -07:00
Ashwin Bharambe
ced5fb6388 Small cleanup for together safety implementation 2024-09-28 15:47:35 -07:00
Yogish Baliga
940968ee3f
fixing safety inference and safety adapter for new API spec. Pinned t… (#105)
* fixing safety inference and safety adapter for the new API spec. Pinned the llama_models version to 0.0.24, since the latest version 0.0.35 changed the model descriptor name. I was also hitting a missing-package error at runtime, so I added the dependency to requirements.txt

* support Llama 3.2 models in Together inference adapter and cleanup Together safety adapter

* fixing model names

* adding vision guard to Together safety
2024-09-28 15:45:38 -07:00
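The pin itself is a one-line requirements constraint; pairing it with a runtime guard looks roughly like this (the guard is an illustration, not part of the commit):

```
from importlib.metadata import version

# requirements.txt carries: llama_models==0.0.24
# 0.0.35 renamed the model descriptor, so fail fast on a bad install.
assert version("llama_models") == "0.0.24", "pin llama_models==0.0.24"
```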
Ashwin Bharambe
0a3999a9a4
Use inference APIs for executing Llama Guard (#121)
We should use the Inference APIs to execute Llama Guard instead of depending directly on HuggingFace modeling code. The actual inference is handled by the Inference implementation.
2024-09-28 15:40:06 -07:00
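In outline, the safety shield now delegates generation to the Inference API rather than loading HF weights itself; the method and response shapes below are assumptions for illustration:

```
# Hypothetical sketch: Llama Guard runs through the Inference API.
async def run_llama_guard(inference_api, user_message: str) -> str:
    response = await inference_api.chat_completion(
        model="Llama-Guard-3-8B",
        messages=[{"role": "user", "content": user_message}],
    )
    # The shield then parses the verdict ("safe"/"unsafe ...") from text.
    return response.completion_message.content
```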
Xi Yan
6236634d84
[bugfix] fix duplicate api endpoints (#139)
* fix server api to serve

* remove print
2024-09-27 15:32:50 -07:00
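Deduplicating endpoint registration, generically, is a matter of keying routes on (method, path); purely illustrative, not the repository's code:

```
routes = [
    ("GET", "/models/list", "list_models"),
    ("GET", "/models/list", "list_models"),   # accidental duplicate
    ("POST", "/safety/run_shield", "run_shield"),
]

seen = set()
unique_routes = []
for method, path, handler in routes:
    if (method, path) in seen:
        continue  # drop duplicate registrations
    seen.add((method, path))
    unique_routes.append((method, path, handler))
```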
Xi Yan
208b861289
add env for LLAMA_STACK_CONFIG_DIR (#137) 2024-09-27 14:16:46 -07:00
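Honoring such an override typically reads as below; the `~/.llama` fallback is an assumption:

```
import os
from pathlib import Path

# Use LLAMA_STACK_CONFIG_DIR when set; otherwise fall back to a default.
CONFIG_DIR = Path(os.environ.get("LLAMA_STACK_CONFIG_DIR",
                                 str(Path.home() / ".llama")))
```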
Russell Bryant
f70c88ab7a
configure: Fix an error msg typo (#131)
I got this error message and noticed the typo in the message. It
directed the user to run `llama stack build first`, which is not a
valid command.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-09-27 14:00:25 -07:00
Russell Bryant
5828ffd53b
inference: Fix download command in error msg (#133)
I got this error message and tried to run the command presented,
and it didn't work. The model needs to be given with `--model-id`
instead of as a positional argument.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-09-27 13:31:11 -07:00
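The corrected message points at an invocation of this shape (the model name here is only a placeholder):

```
llama download --source meta --model-id Llama-Guard-3-8B
```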
Russell Bryant
fb9e6371ec
Validate name in llama stack build (#128)
The first time I ran `llama stack build`, I quickly hit enter at the
first prompt asking for a name, assuming it would use the default
given in the help text. This caused a failure later on that wasn't
very obvious. I was using the `docker` format and a blank name caused
an invalid tag format that failed the image build.

This change adds validation for the `name` parameter to ensure it's
not empty before proceeding.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-09-27 13:30:55 -07:00
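The validation amounts to rejecting an empty (and, for the docker format, tag-invalid) name up front; a sketch, with the tag rule simplified:

```
import re

def validate_build_name(name: str) -> str:
    name = name.strip()
    if not name:
        raise ValueError("Name is required and cannot be empty")
    # Docker image names allow lowercase alphanumerics plus . _ - only.
    if not re.fullmatch(r"[a-z0-9][a-z0-9._-]*", name):
        raise ValueError(f"{name!r} is not a valid image name")
    return name
```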
Mark Sze
3c99f08267
minor typo and HuggingFace -> Hugging Face (#113) 2024-09-26 09:48:23 -07:00
Kate Plawiak
3ae1597b9b
load models using hf model id (#108) 2024-09-25 18:40:09 -07:00
Xi Yan
ca7602a642 fix #100 2024-09-25 15:11:56 -07:00
Lucain
615ed4bfbc
Make TGI adapter compatible with HF Inference API (#97) 2024-09-25 14:08:31 -07:00
Xi Yan
82f420c4f0
fix safety using inference (#99) 2024-09-25 11:30:27 -07:00
Dalton Flanagan
5c4f73d52f
Drop header from LocalInference.h 2024-09-25 11:27:37 -07:00
Ashwin Bharambe
d442af0818 Add safety impl for llama guard vision 2024-09-25 11:07:19 -07:00
Dalton Flanagan
b3b0349931 Update LocalInference to use public repos 2024-09-25 11:05:51 -07:00
Ashwin Bharambe
4fcda00872 Re-apply revert 2024-09-25 11:00:43 -07:00
Ashwin Bharambe
d82a9d94e3 Small fix to the prompt-format error message 2024-09-25 10:56:13 -07:00
Ashwin Bharambe
56aed59eb4
Support for Llama3.2 models and Swift SDK (#98) 2024-09-25 10:29:58 -07:00
poegej
95abbf576b
Bump version to 0.0.24 (#94)
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2024-09-25 09:31:12 -07:00
Ashwin Bharambe
ed8d10775a Remove key 2024-09-25 05:53:49 -07:00
Xi Yan
45be9f3b85 fix agent's embedding model config 2024-09-24 22:49:49 -07:00
Ashwin Bharambe
f45705cd10 Some lightweight cleanup and renaming for bedrock safety adapter 2024-09-24 19:29:56 -07:00
Ashwin Bharambe
a2465f3f9c Revert parts of 0d2eb3bd25 2024-09-24 19:20:51 -07:00
rsgrewal-aws
059e50b389
[aws-bedrock] Support for Bedrock Safety adapter (#96) 2024-09-24 19:16:55 -07:00
Yogish Baliga
b85d675c6f Adding safety adapter for Together 2024-09-24 18:35:48 -07:00
Ashwin Bharambe
0d2eb3bd25 Use inference APIs for running llama guard
Test Plan:

First, start a TGI container with `meta-llama/Llama-Guard-3-8B` model
serving on port 5099. See https://github.com/meta-llama/llama-stack/pull/53 and its
description for how.

Then run llama-stack with the following run config:

```
image_name: safety
docker_image: null
conda_env: safety
apis_to_serve:
- models
- inference
- shields
- safety
api_providers:
  inference:
    providers:
    - remote::tgi
  safety:
    providers:
    - meta-reference
  telemetry:
    provider_id: meta-reference
    config: {}
routing_table:
  inference:
  - provider_id: remote::tgi
    config:
      url: http://localhost:5099
      api_token: null
      hf_endpoint_name: null
    routing_key: Llama-Guard-3-8B
  safety:
  - provider_id: meta-reference
    config:
      llama_guard_shield:
        model: Llama-Guard-3-8B
        excluded_categories: []
        disable_input_check: false
        disable_output_check: false
      prompt_guard_shield: null
    routing_key: llama_guard
```

Now simply run `python -m llama_stack.apis.safety.client localhost
<port>` and check that the llama_guard shield calls run correctly. (The
injection_shield calls fail as expected since we have not set up a
router for them.)
2024-09-24 17:02:57 -07:00
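For reference, the client call in the test plan maps onto a request of roughly this shape; the endpoint path, port, and payload fields here are assumptions, not the module's actual interface:

```
import httpx

# Hypothetical direct HTTP equivalent of the safety client invocation.
resp = httpx.post(
    "http://localhost:5000/safety/run_shield",   # port assumed
    json={
        "shield_type": "llama_guard",
        "messages": [{"role": "user", "content": "Hello, is this safe?"}],
    },
)
print(resp.json())
```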
Xi Yan
c4534217c8 fix cli describe 2024-09-24 14:41:19 -07:00
Ashwin Bharambe
00352bd251 Respect passed in embedding model 2024-09-24 14:40:28 -07:00
Ashwin Bharambe
bda974e660 Make the "all-remote" distribution lightweight in dependencies and size 2024-09-24 14:18:57 -07:00
Ashwin Bharambe
445536de64 Add httpx to core server deps 2024-09-24 10:42:04 -07:00