Yogish Baliga
d7c55f0ad0
fixing model names
2024-09-27 11:41:37 -07:00
Yogish Baliga
2b568a462a
support Llama 3.2 models in Together inference adapter and clean up Together safety adapter
2024-09-27 11:41:37 -07:00
Yogish Baliga
9bb0c8f4fc
fixing safety inference and safety adapter for new API spec. Pinned the llama_models version to 0.0.24 because the latest version, 0.0.35, changed the model descriptor name. Also added the dependency to requirements.txt, since the package was missing at runtime.
2024-09-27 11:41:37 -07:00
Bhimraj Yadav
53070e34a3
Update RFC-0001-llama-stack.md (#134)
2024-09-27 09:14:36 -07:00
Xi Yan
eb526b4d9b
Update RFC-0001-llama-stack.md
2024-09-26 17:17:08 -07:00
Moritz Althaus
6b0805ebb4
fix: 404 link to agentic system repository (#118)
2024-09-26 14:43:41 -07:00
Deep Doshi
557ae38289
Update getting_started.ipynb (#117)
...
Update the `llama-stack-apps` hyperlink to point to the correct GitHub repo
2024-09-26 14:43:04 -07:00
Xi Yan
2802ac8e9d
add llama-stack.png
2024-09-26 11:17:46 -07:00
Karthi Keyan
995a1a1d00
Reordered pip install and llama model download (#112)
...
The `llama` CLI command can only be used after the pip install step (as the notebook itself notes), so it makes sense to put the install step first
2024-09-26 10:37:15 -07:00
Mark Sze
3c99f08267
minor typo and HuggingFace -> Hugging Face (#113)
2024-09-26 09:48:23 -07:00
Kate Plawiak
3ae1597b9b
load models using hf model id (#108)
2024-09-25 18:40:09 -07:00
JC (Jonathan Chen)
e73e9110b7
docs: fix typo (#107)
2024-09-25 18:36:31 -07:00
Xi Yan
d0280138ef
Update README.md
2024-09-25 17:29:17 -07:00
Xi Yan
ca7602a642
fix #100
2024-09-25 15:11:56 -07:00
machina-source
37be3fb184
Fix links & format (#104)
...
Fix broken examples link to llama-stack-apps repo
Remove extra space in README.md
2024-09-25 14:18:46 -07:00
Lucain
615ed4bfbc
Make TGI adapter compatible with HF Inference API (#97)
2024-09-25 14:08:31 -07:00
Abhishek
851c30597a
chore (doc): fix typo for setup instruction: `llama-stack` to `llama-stack-apps` (#103)
2024-09-25 13:27:55 -07:00
Ashwin Bharambe
c8fa26482d
Bump version to 0.0.36
2024-09-25 11:58:15 -07:00
raghotham
baf7bb47b9
Update README.md
2024-09-25 11:45:47 -07:00
Xi Yan
82f420c4f0
fix safety using inference (#99)
2024-09-25 11:30:27 -07:00
Dalton Flanagan
5c4f73d52f
Drop header from LocalInference.h
2024-09-25 11:27:37 -07:00
Ashwin Bharambe
d442af0818
Add safety impl for llama guard vision
2024-09-25 11:07:19 -07:00
Dalton Flanagan
b3b0349931
Update LocalInference to use public repos
2024-09-25 11:05:51 -07:00
Ashwin Bharambe
4fcda00872
Re-apply revert
2024-09-25 11:00:43 -07:00
Ashwin Bharambe
d82a9d94e3
Small fix to the prompt-format error message
2024-09-25 10:56:13 -07:00
Ashwin Bharambe
a227edb480
Bump version to 0.0.35
2024-09-25 10:34:59 -07:00
Ashwin Bharambe
56aed59eb4
Support for Llama 3.2 models and Swift SDK (#98)
2024-09-25 10:29:58 -07:00
poegej
95abbf576b
Bump version to 0.0.24 (#94)
...
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2024-09-25 09:31:12 -07:00
Ashwin Bharambe
ed8d10775a
Remove key
2024-09-25 05:53:49 -07:00
Xi Yan
45be9f3b85
fix agent's embedding model config
2024-09-24 22:49:49 -07:00
Ashwin Bharambe
f45705cd10
Some lightweight cleanup and renaming for bedrock safety adapter
2024-09-24 19:29:56 -07:00
Ashwin Bharambe
a2465f3f9c
Revert parts of 0d2eb3bd25
2024-09-24 19:20:51 -07:00
rsgrewal-aws
059e50b389
[aws-bedrock] Support for Bedrock Safety adapter (#96)
2024-09-24 19:16:55 -07:00
Yogish Baliga
b85d675c6f
Adding safety adapter for Together
2024-09-24 18:35:48 -07:00
Ashwin Bharambe
0d2eb3bd25
Use inference APIs for running llama guard
...
Test Plan:
First, start a TGI container with `meta-llama/Llama-Guard-3-8B` model
serving on port 5099. See https://github.com/meta-llama/llama-stack/pull/53 and its
description for how.
Then run llama-stack with the following run config:
```
image_name: safety
docker_image: null
conda_env: safety
apis_to_serve:
- models
- inference
- shields
- safety
api_providers:
  inference:
    providers:
    - remote::tgi
  safety:
    providers:
    - meta-reference
  telemetry:
    provider_id: meta-reference
    config: {}
routing_table:
  inference:
  - provider_id: remote::tgi
    config:
      url: http://localhost:5099
      api_token: null
      hf_endpoint_name: null
    routing_key: Llama-Guard-3-8B
  safety:
  - provider_id: meta-reference
    config:
      llama_guard_shield:
        model: Llama-Guard-3-8B
        excluded_categories: []
        disable_input_check: false
        disable_output_check: false
      prompt_guard_shield: null
    routing_key: llama_guard
```
Now simply run `python -m llama_stack.apis.safety.client localhost
<port>` and check that the llama_guard shield calls run correctly. (The
injection_shield calls fail as expected since we have not set up a
router for them.)
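For context, a minimal sketch of what such a shield check could look like as a raw HTTP call, assuming the server exposes a `/safety/run_shield` route taking a `shield_type` and a list of `messages`. Both the route and the payload shape here are assumptions for illustration; the client module above is the supported path.
```
# Hedged sketch only: the /safety/run_shield route, the payload fields, and the
# response shape are assumptions, not taken from the llama_stack client module.
import httpx

SERVER = "http://localhost:5000"  # hypothetical llama-stack server address

payload = {
    "shield_type": "llama_guard",  # matches the routing_key in the run config above
    "messages": [
        {"role": "user", "content": "How do I hotwire a car?"},
    ],
}

resp = httpx.post(f"{SERVER}/safety/run_shield", json=payload, timeout=30.0)
resp.raise_for_status()
print(resp.json())  # expect a violation verdict (or none) from the Llama Guard shield
```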
2024-09-24 17:02:57 -07:00
Xi Yan
c4534217c8
fix cli describe
2024-09-24 14:41:19 -07:00
Ashwin Bharambe
00352bd251
Respect passed in embedding model
2024-09-24 14:40:28 -07:00
Ashwin Bharambe
bda974e660
Make the "all-remote" distribution lightweight in dependencies and size
2024-09-24 14:18:57 -07:00
Ashwin Bharambe
445536de64
Add httpx to core server deps
2024-09-24 10:42:04 -07:00
Ashwin Bharambe
7b35a4c827
Bump version to 0.0.24
2024-09-24 10:15:20 -07:00
Ashwin Bharambe
8d511cdf91
Make build_conda_env a bit more robust
2024-09-24 10:12:07 -07:00
Ashwin Bharambe
cd850c16de
Bump version to 0.0.23
2024-09-24 09:08:40 -07:00
Xi Yan
d04cd97aba
remove providers/impls/sqlite/*
2024-09-24 01:03:40 -07:00
Ashwin Bharambe
e617273d8c
attribute changed (model_args -> arch_args)
2024-09-23 21:44:26 -07:00
Ashwin Bharambe
f136f802b1
Somewhat better error handling
2024-09-23 21:40:14 -07:00
Xi Yan
f92ff86b96
fix shields in agents safety
2024-09-23 21:22:22 -07:00
Ashwin Bharambe
c9005e95ed
Another attempt at a proper bugfix for safety violations
2024-09-23 19:06:30 -07:00
Xi Yan
e5bdd6615a
bug fix for safety violation
2024-09-23 18:17:15 -07:00
Xi Yan
70fb70a71c
fix URL issue with agents
2024-09-23 16:44:25 -07:00
Ashwin Bharambe
9eb5ec3e4b
Bump version to 0.0.21
2024-09-23 14:23:21 -07:00