llama-stack

forked from phoenix-oss/llama-stack-mirror


Ashwin Bharambe 0d2eb3bd25 Use inference APIs for running llama guard	2024-09-24 17:02:57 -07:00

Test Plan:

First, start a TGI container with `meta-llama/Llama-Guard-3-8B` model serving on port 5099. See https://github.com/meta-llama/llama-stack/pull/53 and its description for how.

Then run llama-stack with the following run config:

```
image_name: safety
docker_image: null
conda_env: safety
apis_to_serve:
- models
- inference
- shields
- safety
api_providers:
  inference:
    providers:
    - remote::tgi
  safety:
    providers:
    - meta-reference
  telemetry:
    provider_id: meta-reference
    config: {}
routing_table:
  inference:
  - provider_id: remote::tgi
    config:
      url: http://localhost:5099
      api_token: null
      hf_endpoint_name: null
    routing_key: Llama-Guard-3-8B
  safety:
  - provider_id: meta-reference
    config:
      llama_guard_shield:
        model: Llama-Guard-3-8B
        excluded_categories: []
        disable_input_check: false
        disable_output_check: false
      prompt_guard_shield: null
    routing_key: llama_guard
```

Now simply run `python -m llama_stack.apis.safety.client localhost <port>` and check that the llama_guard shield calls run correctly. (The injection_shield calls fail as expected since we have not set up a router for them.)
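The note about injection_shield calls failing follows from how a routing table of this shape would be consulted: a call is dispatched only if some entry under the relevant API carries a matching `routing_key`. The sketch below is a hypothetical illustration of that lookup, not llama-stack's actual router code; the `resolve_provider` helper, the trimmed-down `ROUTING_TABLE` dict, and the use of `injection_shield` as a routing key are all assumptions made for the example.

```python
# Hypothetical illustration only: this is NOT llama-stack's router implementation.
# The dict mirrors the provider_id / routing_key pairs from the run config above;
# the per-provider `config` blocks are omitted for brevity.
ROUTING_TABLE = {
    "inference": [
        {"provider_id": "remote::tgi", "routing_key": "Llama-Guard-3-8B"},
    ],
    "safety": [
        {"provider_id": "meta-reference", "routing_key": "llama_guard"},
    ],
}


def resolve_provider(api, routing_key, table=ROUTING_TABLE):
    """Return the provider_id registered for (api, routing_key).

    Raises KeyError when no entry matches, which is the situation the
    commit message describes for shields without a configured route.
    """
    for entry in table.get(api, []):
        if entry["routing_key"] == routing_key:
            return entry["provider_id"]
    raise KeyError(f"no route for api {api!r} with routing key {routing_key!r}")


if __name__ == "__main__":
    # The llama_guard shield has a route, so the lookup succeeds.
    print(resolve_provider("safety", "llama_guard"))
    # No entry exists for injection_shield in this config, so the lookup fails.
    try:
        resolve_provider("safety", "injection_shield")
    except KeyError as exc:
        print("lookup failed:", exc)
```

Under this reading, the safety provider's own call to the `Llama-Guard-3-8B` model is resolved through the `inference` section in the same way, which is why both entries are needed in the config.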
agents	[API Updates] Model / shield / memory-bank routing + agent persistence + support for private headers (#92)	2024-09-23 14:22:22 -07:00
batch_inference	API Updates (#73)	2024-09-17 19:51:35 -07:00
common	API Updates (#73)	2024-09-17 19:51:35 -07:00
dataset	API Updates (#73)	2024-09-17 19:51:35 -07:00
evals	API Updates (#73)	2024-09-17 19:51:35 -07:00
inference	Use inference APIs for running llama guard	2024-09-24 17:02:57 -07:00
memory	Respect passed in embedding model	2024-09-24 14:40:28 -07:00
memory_banks	[API Updates] Model / shield / memory-bank routing + agent persistence + support for private headers (#92)	2024-09-23 14:22:22 -07:00
models	[API Updates] Model / shield / memory-bank routing + agent persistence + support for private headers (#92)	2024-09-23 14:22:22 -07:00
post_training	API Updates (#73)	2024-09-17 19:51:35 -07:00
reward_scoring	API Updates (#73)	2024-09-17 19:51:35 -07:00
safety	[API Updates] Model / shield / memory-bank routing + agent persistence + support for private headers (#92)	2024-09-23 14:22:22 -07:00
shields	[API Updates] Model / shield / memory-bank routing + agent persistence + support for private headers (#92)	2024-09-23 14:22:22 -07:00
synthetic_data_generation	API Updates (#73)	2024-09-17 19:51:35 -07:00
telemetry	API Updates (#73)	2024-09-17 19:51:35 -07:00
__init__.py	API Updates (#73)	2024-09-17 19:51:35 -07:00