Yogish Baliga
d7c55f0ad0
fixing model names
2024-09-27 11:41:37 -07:00
Yogish Baliga
2b568a462a
support Llama 3.2 models in Together inference adapter and clean up Together safety adapter
2024-09-27 11:41:37 -07:00
Yogish Baliga
9bb0c8f4fc
fixing safety inference and safety adapter for new API spec. Pinned the llama_models version to 0.0.24 because the latest version, 0.0.35, changed the model descriptor name. Also added the dependency to requirements.txt, since the package was missing at runtime.
2024-09-27 11:41:37 -07:00
Bhimraj Yadav
53070e34a3
Update RFC-0001-llama-stack.md (#134)
2024-09-27 09:14:36 -07:00
Xi Yan
eb526b4d9b
Update RFC-0001-llama-stack.md
2024-09-26 17:17:08 -07:00
Moritz Althaus
6b0805ebb4
fix: 404 link to agentic system repository (#118)
2024-09-26 14:43:41 -07:00
Deep Doshi
557ae38289
Update getting_started.ipynb (#117)
...
Update the `llama-stack-apps` hyperlink to point to the correct GitHub repo
2024-09-26 14:43:04 -07:00
Xi Yan
2802ac8e9d
add llama-stack.png
2024-09-26 11:17:46 -07:00
Karthi Keyan
995a1a1d00
Reordered pip install and llama model download (#112)
...
The `llama` CLI command can only be used after the pip install step (as the notebook itself notes), so it makes sense to put the install step first
2024-09-26 10:37:15 -07:00
Mark Sze
3c99f08267
minor typo and HuggingFace -> Hugging Face (#113)
2024-09-26 09:48:23 -07:00
Kate Plawiak
3ae1597b9b
load models using hf model id (#108)
2024-09-25 18:40:09 -07:00
JC (Jonathan Chen)
e73e9110b7
docs: fix typo (#107)
2024-09-25 18:36:31 -07:00
Xi Yan
d0280138ef
Update README.md
2024-09-25 17:29:17 -07:00
Xi Yan
ca7602a642
fix #100
2024-09-25 15:11:56 -07:00
machina-source
37be3fb184
Fix links & format (#104)
...
Fix broken examples link to llama-stack-apps repo
Remove extra space in README.md
2024-09-25 14:18:46 -07:00
Lucain
615ed4bfbc
Make TGI adapter compatible with HF Inference API (#97)
2024-09-25 14:08:31 -07:00
Abhishek
851c30597a
chore (doc): fix typo for setup instruction: `llama-stack` to `llama-stack-apps` (#103)
2024-09-25 13:27:55 -07:00
Ashwin Bharambe
c8fa26482d
Bump version to 0.0.36
2024-09-25 11:58:15 -07:00
raghotham
baf7bb47b9
Update README.md
2024-09-25 11:45:47 -07:00
Xi Yan
82f420c4f0
fix safety using inference (#99)
2024-09-25 11:30:27 -07:00
Dalton Flanagan
5c4f73d52f
Drop header from LocalInference.h
2024-09-25 11:27:37 -07:00
Ashwin Bharambe
d442af0818
Add safety impl for llama guard vision
2024-09-25 11:07:19 -07:00
Dalton Flanagan
b3b0349931
Update LocalInference to use public repos
2024-09-25 11:05:51 -07:00
Ashwin Bharambe
4fcda00872
Re-apply revert
2024-09-25 11:00:43 -07:00
Ashwin Bharambe
d82a9d94e3
Small fix to the prompt-format error message
2024-09-25 10:56:13 -07:00
Ashwin Bharambe
a227edb480
Bump version to 0.0.35
2024-09-25 10:34:59 -07:00
Ashwin Bharambe
56aed59eb4
Support for Llama 3.2 models and Swift SDK (#98)
2024-09-25 10:29:58 -07:00
poegej
95abbf576b
Bump version to 0.0.24 (#94)
...
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2024-09-25 09:31:12 -07:00
Ashwin Bharambe
ed8d10775a
Remove key
2024-09-25 05:53:49 -07:00
Xi Yan
45be9f3b85
fix agent's embedding model config
2024-09-24 22:49:49 -07:00
Ashwin Bharambe
f45705cd10
Some lightweight cleanup and renaming for bedrock safety adapter
2024-09-24 19:29:56 -07:00
Ashwin Bharambe
a2465f3f9c
Revert parts of 0d2eb3bd25
2024-09-24 19:20:51 -07:00
rsgrewal-aws
059e50b389
[aws-bedrock] Support for Bedrock Safety adapter (#96)
2024-09-24 19:16:55 -07:00
Yogish Baliga
b85d675c6f
Adding safety adapter for Together
2024-09-24 18:35:48 -07:00
Ashwin Bharambe
0d2eb3bd25
Use inference APIs for running llama guard
...
Test Plan:
First, start a TGI container with `meta-llama/Llama-Guard-3-8B` model
serving on port 5099. See https://github.com/meta-llama/llama-stack/pull/53 and its
description for how.
Then run llama-stack with the following run config:
```
image_name: safety
docker_image: null
conda_env: safety
apis_to_serve:
- models
- inference
- shields
- safety
api_providers:
  inference:
    providers:
    - remote::tgi
  safety:
    providers:
    - meta-reference
  telemetry:
    provider_id: meta-reference
    config: {}
routing_table:
  inference:
  - provider_id: remote::tgi
    config:
      url: http://localhost:5099
      api_token: null
      hf_endpoint_name: null
    routing_key: Llama-Guard-3-8B
  safety:
  - provider_id: meta-reference
    config:
      llama_guard_shield:
        model: Llama-Guard-3-8B
        excluded_categories: []
        disable_input_check: false
        disable_output_check: false
      prompt_guard_shield: null
    routing_key: llama_guard
```
Now simply run `python -m llama_stack.apis.safety.client localhost
<port>` and check that the llama_guard shield calls run correctly. (The
injection_shield calls fail as expected since we have not set up a
router for them.)
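For context, a minimal sketch of what such a shield check could look like as a raw HTTP call, assuming the server exposes a `/safety/run_shield` route taking a `shield_type` and a list of `messages`. Both the route and the payload shape here are assumptions for illustration; the client module above is the supported path.
```
# Hedged sketch only: the /safety/run_shield route, the payload fields, and the
# response shape are assumptions, not taken from the llama_stack client module.
import httpx

SERVER = "http://localhost:5000"  # hypothetical llama-stack server address

payload = {
    "shield_type": "llama_guard",  # matches the routing_key in the run config above
    "messages": [
        {"role": "user", "content": "How do I hotwire a car?"},
    ],
}

resp = httpx.post(f"{SERVER}/safety/run_shield", json=payload, timeout=30.0)
resp.raise_for_status()
print(resp.json())  # expect a violation verdict (or none) from the Llama Guard shield
```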
2024-09-24 17:02:57 -07:00
Xi Yan
c4534217c8
fix cli describe
2024-09-24 14:41:19 -07:00
Ashwin Bharambe
00352bd251
Respect passed in embedding model
2024-09-24 14:40:28 -07:00
Ashwin Bharambe
bda974e660
Make the "all-remote" distribution lightweight in dependencies and size
2024-09-24 14:18:57 -07:00
Ashwin Bharambe
445536de64
Add httpx to core server deps
2024-09-24 10:42:04 -07:00
Ashwin Bharambe
7b35a4c827
Bump version to 0.0.24
2024-09-24 10:15:20 -07:00
Ashwin Bharambe
8d511cdf91
Make build_conda_env a bit more robust
2024-09-24 10:12:07 -07:00
Ashwin Bharambe
cd850c16de
Bump version to 0.0.23
2024-09-24 09:08:40 -07:00
Xi Yan
d04cd97aba
remove providers/impls/sqlite/*
2024-09-24 01:03:40 -07:00
Ashwin Bharambe
e617273d8c
attribute changed (model_args -> arch_args)
2024-09-23 21:44:26 -07:00
Ashwin Bharambe
f136f802b1
Somewhat better error handling
2024-09-23 21:40:14 -07:00
Xi Yan
f92ff86b96
fix shields in agents safety
2024-09-23 21:22:22 -07:00
Ashwin Bharambe
c9005e95ed
Another attempt at a proper bugfix for safety violations
2024-09-23 19:06:30 -07:00
Xi Yan
e5bdd6615a
bug fix for safety violation
2024-09-23 18:17:15 -07:00
Xi Yan
70fb70a71c
fix URL issue with agents
2024-09-23 16:44:25 -07:00
Ashwin Bharambe
9eb5ec3e4b
Bump version to 0.0.21
2024-09-23 14:23:21 -07:00