Commit graph

188 commits

Author SHA1 Message Date
Russell Bryant
204eb6d810
docker: Check for selinux before using --security-opt (#167)
Before using `--security-opt label=disable`, check that SELinux is
enabled. Otherwise, the option is not relevant.

This fixes errors on Mac.

Closes #166

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-10-02 10:37:41 -07:00
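The check described above boils down to probing for SELinux before appending the flag. A minimal Python sketch of the idea; the helper name and the `selinuxenabled`-based probe are assumptions, not the repository's actual code:

```
import shutil
import subprocess

def is_selinux_enabled() -> bool:
    # `selinuxenabled` exits 0 when SELinux is enabled; the binary is
    # simply absent on systems without SELinux (e.g. macOS).
    exe = shutil.which("selinuxenabled")
    return exe is not None and subprocess.run([exe], check=False).returncode == 0

cmd = ["docker", "run"]
if is_selinux_enabled():
    # Only add the option where SELinux would otherwise block mounts.
    cmd += ["--security-opt", "label=disable"]
```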
Ashwin Bharambe
227b69e6e6 Fix sample memory impl 2024-10-02 10:13:09 -07:00
Ashwin Bharambe
335dea849a fix sample impls 2024-10-02 10:10:31 -07:00
Ashwin Bharambe
bf0d111c53 Fix build script 2024-10-02 10:04:23 -07:00
Ashwin Bharambe
4a75d922a9 Make Llama Guard 1B the default 2024-10-02 09:48:26 -07:00
Ashwin Bharambe
cc5029a716 Add special case for prompt guard 2024-10-02 08:43:12 -07:00
Ashwin Bharambe
eb2d8a31a5
Add a RoutableProvider protocol, support for multiple routing keys (#163)
* Update configure.py to use multiple routing keys for safety
* Refactor distribution/datatypes into providers/datatypes
* Cleanup
2024-09-30 17:30:21 -07:00
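Roughly, a routable provider declares which routing keys it can serve and a router dispatches on those keys. This sketch uses an assumed method name and table shape, not the repository's actual protocol:

```
from typing import List, Protocol

class RoutableProvider(Protocol):
    # Assumed method: called at startup with every routing key mapped
    # to this provider, so it can reject keys it cannot serve.
    async def validate_routing_keys(self, routing_keys: List[str]) -> None:
        ...

# A router then dispatches on the key, e.g. a model or shield name:
# {"Llama-Guard-3-8B": inference_provider, "llama_guard": safety_provider}
```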
Xi Yan
73decb3781 re-build from name 2024-09-30 16:22:52 -07:00
Xi Yan
4897bf2f85 allow --name to re-build from config 2024-09-30 16:18:12 -07:00
Xi Yan
d28c3dfe0f
[CLI] simplify docker run (#159)
* bake run.yaml inside docker, simplify run

* add docker template examples

* delete generated Dockerfile

* unique deps

* clean up debug

* default entrypoint

* address comments, update output msg

* update msg

* build output msg

* configure msg

* unique special_deps

* remove quotes in configure
2024-09-30 15:04:04 -07:00
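"Baking run.yaml inside docker" with a default entrypoint roughly means emitting Dockerfile lines like these at build time; the paths and server module below are assumptions, not taken from the repo:

```
# Hypothetical build-time snippet: copy the generated run config into
# the image and make the stack server the default entrypoint.
dockerfile_lines = [
    "COPY run.yaml /app/run.yaml",
    'ENTRYPOINT ["python", "-m", "llama_stack.distribution.server",'
    ' "--yaml-config", "/app/run.yaml"]',
]
with open("Dockerfile", "a") as f:
    f.write("\n".join(dockerfile_lines) + "\n")
```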
Russell Bryant
8db49de961
docker: Install in editable mode for dev purposes (#160)
While rebuilding a stack using the `docker` image type and having
`LLAMA_STACK_DIR` set so it installs `llama_stack` from my local
source, I noticed that once built, it just used the image build cache
and didn't pull in changes to my source.

1. Install in editable mode (`pip install -e`) for dev purposes.

2. Mount the source into the container for `configure` and `run` so
   that the editable install works.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-09-30 11:56:31 -07:00
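The two pieces of the change, sketched in Python; `LLAMA_STACK_DIR` is named in the commit, but the install command and mount target here are assumptions:

```
import os

src = os.environ.get("LLAMA_STACK_DIR")

# 1. Editable install so source edits are not frozen into the image.
install_line = (
    "RUN pip install -e /app/llama-stack-source" if src
    else "RUN pip install llama-stack"
)

# 2. Mount the live source for `configure` and `run`, so the editable
#    install inside the container sees local changes.
mount_args = ["-v", f"{src}:/app/llama-stack-source"] if src else []
```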
Russell Bryant
cb36be320f
Fix podman+selinux compatibility (#132)
When I ran `llama stack configure` for my `docker`-based stack on my
system using podman + SELinux (CentOS Stream 9), the `podman run`
command failed because SELinux blocked access to the volume mount.

As a simple fix, disable SELinux label checking.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-09-29 20:19:44 -07:00
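For context, podman offers two common ways around SELinux volume-mount denials; the commit takes the first. A sketch, with the image name and paths as placeholders:

```
# What the fix does: disable SELinux label checking for the container.
cmd = ["podman", "run", "--security-opt", "label=disable",
       "-v", "/host/stack:/app/stack", "llamastack:latest"]

# Alternative not taken here: have podman relabel the volume content
# instead (":z" for shared, ":Z" for private).
alt = ["podman", "run", "-v", "/host/stack:/app/stack:Z", "llamastack:latest"]
```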
moritalous
2bd785354d
fix broken bedrock inference provider (#151) 2024-09-29 20:17:58 -07:00
Byung Chun Kim
2f096ca509
accept the model ID, not the model itself. (#153) 2024-09-29 20:16:50 -07:00
Ashwin Bharambe
5bf679cab6
Pull (extract) provider data from the provider instead of pushing from the top (#148) 2024-09-29 20:00:51 -07:00
Xi Yan
f6a6598d1a
[bugfix] fix #146 (#147)
* more robust image type

* lint
2024-09-28 17:47:00 -07:00
Xi Yan
6a8c2ae1df
[CLI] remove dependency on CONDA_PREFIX in CLI (#144)
* remove dependency on CONDA_PREFIX in CLI

* lint

* typo

* more robust
2024-09-28 16:46:47 -07:00
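One way to locate a conda environment without relying on `CONDA_PREFIX` being exported is to ask conda directly; a sketch under that assumption (the commit's actual mechanism may differ):

```
import json
import subprocess

def find_conda_env_path(env_name: str):
    # `conda env list --json` returns {"envs": ["/path/to/base", ...]}.
    out = subprocess.run(["conda", "env", "list", "--json"],
                         capture_output=True, text=True, check=True)
    for env_path in json.loads(out.stdout)["envs"]:
        if env_path.rstrip("/").endswith(env_name):
            return env_path
    return None
```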
Ashwin Bharambe
fe460ba103 Avoid importing a lot of stuff 2024-09-28 16:06:10 -07:00
Xi Yan
4ae8c63a2b pre-commit lint 2024-09-28 16:04:41 -07:00
Ashwin Bharambe
ced5fb6388 Small cleanup for together safety implementation 2024-09-28 15:47:35 -07:00
Yogish Baliga
940968ee3f
fixing safety inference and safety adapter for new API spec. Pinned t… (#105)
* fixing safety inference and safety adapter for the new API spec. Pinned the llama_models version to 0.0.24, since the latest version 0.0.35 changed the model descriptor name. I was also hitting a missing-package error at runtime, so I added the dependency to requirements.txt

* support Llama 3.2 models in Together inference adapter and cleanup Together safety adapter

* fixing model names

* adding vision guard to Together safety
2024-09-28 15:45:38 -07:00
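The pin itself is a one-line requirements constraint; pairing it with a runtime guard looks roughly like this (the guard is an illustration, not part of the commit):

```
from importlib.metadata import version

# requirements.txt carries: llama_models==0.0.24
# 0.0.35 renamed the model descriptor, so fail fast on a bad install.
assert version("llama_models") == "0.0.24", "pin llama_models==0.0.24"
```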
Ashwin Bharambe
0a3999a9a4
Use inference APIs for executing Llama Guard (#121)
We should use the Inference APIs to execute Llama Guard instead of depending directly on HuggingFace modeling code. The actual inference is handled by the Inference implementation.
2024-09-28 15:40:06 -07:00
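In outline, the safety shield now delegates generation to the Inference API rather than loading HF weights itself; the method and response shapes below are assumptions for illustration:

```
# Hypothetical sketch: Llama Guard runs through the Inference API.
async def run_llama_guard(inference_api, user_message: str) -> str:
    response = await inference_api.chat_completion(
        model="Llama-Guard-3-8B",
        messages=[{"role": "user", "content": user_message}],
    )
    # The shield then parses the verdict ("safe"/"unsafe ...") from text.
    return response.completion_message.content
```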
Xi Yan
6236634d84
[bugfix] fix duplicate api endpoints (#139)
* fix server api to serve

* remove print
2024-09-27 15:32:50 -07:00
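Deduplicating endpoint registration, generically, is a matter of keying routes on (method, path); purely illustrative, not the repository's code:

```
routes = [
    ("GET", "/models/list", "list_models"),
    ("GET", "/models/list", "list_models"),   # accidental duplicate
    ("POST", "/safety/run_shield", "run_shield"),
]

seen = set()
unique_routes = []
for method, path, handler in routes:
    if (method, path) in seen:
        continue  # drop duplicate registrations
    seen.add((method, path))
    unique_routes.append((method, path, handler))
```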
Xi Yan
208b861289
add env for LLAMA_STACK_CONFIG_DIR (#137) 2024-09-27 14:16:46 -07:00
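Honoring such an override typically reads as below; the `~/.llama` fallback is an assumption:

```
import os
from pathlib import Path

# Use LLAMA_STACK_CONFIG_DIR when set; otherwise fall back to a default.
CONFIG_DIR = Path(os.environ.get("LLAMA_STACK_CONFIG_DIR",
                                 str(Path.home() / ".llama")))
```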
Russell Bryant
f70c88ab7a
configure: Fix an error msg typo (#131)
I got this error message and noticed the typo in the message. It
directed the user to run `llama stack build first`, which is not a
valid command.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-09-27 14:00:25 -07:00
Russell Bryant
5828ffd53b
inference: Fix download command in error msg (#133)
I got this error message and tried to run the command presented,
and it didn't work. The model needs to be given with `--model-id`
instead of as a positional argument.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-09-27 13:31:11 -07:00
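The corrected message points at an invocation of this shape (the model name here is only a placeholder):

```
llama download --source meta --model-id Llama-Guard-3-8B
```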
Russell Bryant
fb9e6371ec
Validate name in llama stack build (#128)
The first time I ran `llama stack build`, I quickly hit enter at the
first prompt asking for a name, assuming it would use the default
given in the help text. This caused a failure later on that wasn't
very obvious. I was using the `docker` format and a blank name caused
an invalid tag format that failed the image build.

This change adds validation for the `name` parameter to ensure it's
not empty before proceeding.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-09-27 13:30:55 -07:00
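The validation amounts to rejecting an empty (and, for the docker format, tag-invalid) name up front; a sketch, with the tag rule simplified:

```
import re

def validate_build_name(name: str) -> str:
    name = name.strip()
    if not name:
        raise ValueError("Name is required and cannot be empty")
    # Docker image names allow lowercase alphanumerics plus . _ - only.
    if not re.fullmatch(r"[a-z0-9][a-z0-9._-]*", name):
        raise ValueError(f"{name!r} is not a valid image name")
    return name
```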
Mark Sze
3c99f08267
minor typo and HuggingFace -> Hugging Face (#113) 2024-09-26 09:48:23 -07:00
Kate Plawiak
3ae1597b9b
load models using hf model id (#108) 2024-09-25 18:40:09 -07:00
Xi Yan
ca7602a642 fix #100 2024-09-25 15:11:56 -07:00
Lucain
615ed4bfbc
Make TGI adapter compatible with HF Inference API (#97) 2024-09-25 14:08:31 -07:00
Xi Yan
82f420c4f0
fix safety using inference (#99) 2024-09-25 11:30:27 -07:00
Dalton Flanagan
5c4f73d52f
Drop header from LocalInference.h 2024-09-25 11:27:37 -07:00
Ashwin Bharambe
d442af0818 Add safety impl for llama guard vision 2024-09-25 11:07:19 -07:00
Dalton Flanagan
b3b0349931 Update LocalInference to use public repos 2024-09-25 11:05:51 -07:00
Ashwin Bharambe
4fcda00872 Re-apply revert 2024-09-25 11:00:43 -07:00
Ashwin Bharambe
d82a9d94e3 Small fix to the prompt-format error message 2024-09-25 10:56:13 -07:00
Ashwin Bharambe
56aed59eb4
Support for Llama3.2 models and Swift SDK (#98) 2024-09-25 10:29:58 -07:00
poegej
95abbf576b
Bump version to 0.0.24 (#94)
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2024-09-25 09:31:12 -07:00
Ashwin Bharambe
ed8d10775a Remove key 2024-09-25 05:53:49 -07:00
Xi Yan
45be9f3b85 fix agent's embedding model config 2024-09-24 22:49:49 -07:00
Ashwin Bharambe
f45705cd10 Some lightweight cleanup and renaming for bedrock safety adapter 2024-09-24 19:29:56 -07:00
Ashwin Bharambe
a2465f3f9c Revert parts of 0d2eb3bd25 2024-09-24 19:20:51 -07:00
rsgrewal-aws
059e50b389
[aws-bedrock] Support for Bedrock Safety adapter (#96) 2024-09-24 19:16:55 -07:00
Yogish Baliga
b85d675c6f Adding safety adapter for Together 2024-09-24 18:35:48 -07:00
Ashwin Bharambe
0d2eb3bd25 Use inference APIs for running llama guard
Test Plan:

First, start a TGI container with `meta-llama/Llama-Guard-3-8B` model
serving on port 5099. See https://github.com/meta-llama/llama-stack/pull/53 and its
description for how.

Then run llama-stack with the following run config:

```
image_name: safety
docker_image: null
conda_env: safety
apis_to_serve:
- models
- inference
- shields
- safety
api_providers:
  inference:
    providers:
    - remote::tgi
  safety:
    providers:
    - meta-reference
  telemetry:
    provider_id: meta-reference
    config: {}
routing_table:
  inference:
  - provider_id: remote::tgi
    config:
      url: http://localhost:5099
      api_token: null
      hf_endpoint_name: null
    routing_key: Llama-Guard-3-8B
  safety:
  - provider_id: meta-reference
    config:
      llama_guard_shield:
        model: Llama-Guard-3-8B
        excluded_categories: []
        disable_input_check: false
        disable_output_check: false
      prompt_guard_shield: null
    routing_key: llama_guard
```

Now simply run `python -m llama_stack.apis.safety.client localhost
<port>` and check that the llama_guard shield calls run correctly. (The
injection_shield calls fail as expected since we have not set up a
router for them.)
2024-09-24 17:02:57 -07:00
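For reference, the client call in the test plan maps onto a request of roughly this shape; the endpoint path, port, and payload fields here are assumptions, not the module's actual interface:

```
import httpx

# Hypothetical direct HTTP equivalent of the safety client invocation.
resp = httpx.post(
    "http://localhost:5000/safety/run_shield",   # port assumed
    json={
        "shield_type": "llama_guard",
        "messages": [{"role": "user", "content": "Hello, is this safe?"}],
    },
)
print(resp.json())
```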
Xi Yan
c4534217c8 fix cli describe 2024-09-24 14:41:19 -07:00
Ashwin Bharambe
00352bd251 Respect passed in embedding model 2024-09-24 14:40:28 -07:00
Ashwin Bharambe
bda974e660 Make the "all-remote" distribution lightweight in dependencies and size 2024-09-24 14:18:57 -07:00
Ashwin Bharambe
445536de64 Add httpx to core server deps 2024-09-24 10:42:04 -07:00