llama-stack

forked from phoenix-oss/llama-stack-mirror

Author	SHA1	Message	Date
Dalton Flanagan	2d4f7d8acf	Create SECURITY.md	2024-10-08 13:30:40 -04:00
Yuan Tang	48d0d2001e	Add classifiers in setup.py (#217) * Add classifiers in setup.py * Update setup.py * Update setup.py	2024-10-08 06:55:16 -07:00
Xi Yan	4d5f7459aa	[bugfix] Fix logprobs on meta-reference impl (#213) * fix log probs * add back LogProbsConfig * error handling * bugfix	2024-10-07 19:42:39 -07:00
Yuan Tang	e4ae09d090	Add .idea to .gitignore (#216) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2024-10-07 19:38:43 -07:00
Xi Yan	16ba0fa06f	Update README.md	2024-10-07 11:24:27 -07:00
Russell Bryant	996efa9b42	README.md: Add vLLM to providers table (#207) Signed-off-by: Russell Bryant <russell.bryant@gmail.com>	2024-10-07 10:26:52 -07:00
Xi Yan	2366e18873	refactor docs (#209)	2024-10-07 10:21:26 -07:00
Mindaugas	53d440e952	Fix ValueError in case chunks are empty (#206)	2024-10-07 08:55:06 -07:00
Russell Bryant	a4e775c465	download: improve help text (#204)	2024-10-07 08:40:04 -07:00
Ashwin Bharambe	4263764493	Fix adapter_id -> adapter_type for Weaviate	2024-10-07 06:46:32 -07:00
Zain Hasan	f4f7618120	add Weaviate memory adapter (#95)	2024-10-06 22:21:50 -07:00
Xi Yan	27587f32bc	fix db path	2024-10-06 11:46:08 -07:00
Xi Yan	cfe3ad33b3	fix db path	2024-10-06 11:45:35 -07:00
Prithu Dasgupta	7abab7604b	add databricks provider (#83) * add databricks provider * update provider and test	2024-10-05 23:35:54 -07:00
Russell Bryant	f73e247ba1	Inline vLLM inference provider (#181) This is just like `local` using `meta-reference` for everything except it uses `vllm` for inference. Docker works, but so far, `conda` is a bit easier to use with the vllm provider. The default container base image does not include all the necessary libraries for all vllm features. More cuda dependencies are necessary. I started changing this base image used in this template, but it also required changes to the Dockerfile, so it was getting too involved to include in the first PR. Working so far: * `python -m llama_stack.apis.inference.client localhost 5000 --model Llama3.2-1B-Instruct --stream True` * `python -m llama_stack.apis.inference.client localhost 5000 --model Llama3.2-1B-Instruct --stream False` Example: ``` $ python -m llama_stack.apis.inference.client localhost 5000 --model Llama3.2-1B-Instruct --stream False User>hello world, write me a 2 sentence poem about the moon Assistant> The moon glows bright in the midnight sky A beacon of light, ``` I have only tested these models: * `Llama3.1-8B-Instruct` - across 4 GPUs (tensor_parallel_size = 4) * `Llama3.2-1B-Instruct` - on a single GPU (tensor_parallel_size = 1)	2024-10-05 23:34:16 -07:00
Xi Yan	29138a5167	Update getting_started.md	2024-10-05 12:28:02 -07:00
Xi Yan	6d4013ac99	Update getting_started.md	2024-10-05 12:14:59 -07:00
Mindaugas	9d16129603	Add 'url' property to Redis KV config (#192)	2024-10-05 11:26:26 -07:00
Ashwin Bharambe	bfb0e92034	Bump version to 0.0.40	2024-10-04 09:33:43 -07:00
Ashwin Bharambe	dc75aab547	Add setuptools dependency	2024-10-04 09:30:54 -07:00
Dalton Flanagan	441052b0fd	avoid jq since non-standard on macOS	2024-10-04 10:11:43 -04:00
Dalton Flanagan	9bf2e354ae	CLI now requires jq	2024-10-04 10:05:59 -04:00
raghotham	00ed9a410b	Update getting_started.md update discord invite link	2024-10-03 23:28:43 -07:00
AshleyT3	734f59d3b8	Check that the model is found before use. (#182)	2024-10-03 23:24:47 -07:00
Ashwin Bharambe	f913b57397	fix fp8 imports	2024-10-03 14:40:21 -07:00
Ashwin Bharambe	8d41e6caa9	Bump version to 0.0.39	2024-10-03 11:31:03 -07:00
Ashwin Bharambe	7f49315822	Kill a derpy import	2024-10-03 11:25:58 -07:00
Xi Yan	62d266f018	[CLI] avoid configure twice (#171) * avoid configure twice * cleanup tmp config * update output msg * address comment * update msg * script update	2024-10-03 11:20:54 -07:00
Russell Bryant	06db9213b1	inference: Add model option to client (#170) I was running this client for testing purposes and being able to specify which model to use is a convenient addition. This change makes that possible.	2024-10-03 11:18:57 -07:00
Ashwin Bharambe	210b71b0ba	fix prompt guard (#177) Several other fixes to configure. Add support for 1b/3b models in ollama.	2024-10-03 11:07:53 -07:00
Xi Yan	b9b1e8b08b	[bugfix] conda path lookup (#179) * fix conda lookup * comments	2024-10-03 10:45:16 -07:00
raghotham	d74501f75c	Update README.md Added pypi package version	2024-10-03 10:21:16 -07:00
Ashwin Bharambe	c02a90e4c8	Bump version to 0.0.38	2024-10-03 05:42:47 -07:00
Ashwin Bharambe	e9f6150588	A bit cleanup to avoid breakages	2024-10-02 21:31:09 -07:00
Ashwin Bharambe	988a9cada3	Don't ask for Api.inspect in stack build	2024-10-02 21:10:56 -07:00
Ashwin Bharambe	19ce6bf009	Don't validate prompt-guard anymore	2024-10-02 20:43:57 -07:00
Xi Yan	703ab9385f	fix routing table key list	2024-10-02 18:23:31 -07:00
Ashwin Bharambe	8d049000e3	Add an introspection "Api.inspect" API	2024-10-02 15:41:14 -07:00
Adrian Cole	01d93be948	Adds markdown-link-check and fixes a broken link (#165) Signed-off-by: Adrian Cole <adrian.cole@elastic.co> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2024-10-02 14:26:20 -07:00
Ashwin Bharambe	fe4aabd690	provider_id => provider_type, adapter_id => adapter_type	2024-10-02 14:05:59 -07:00
Ashwin Bharambe	df68db644b	Refactoring distribution/distribution.py This file was becoming too large and unclear what it housed. Split it into pieces.	2024-10-02 14:03:02 -07:00
Ashwin Bharambe	546f05bd3f	No automatic pager	2024-10-02 12:26:09 -07:00
Russell Bryant	204eb6d810	docker: Check for selinux before using `--security-opt` (#167) Before using `--security-opt label=disable`, check that SELinux is enabled. Otherwise, the option is not relevant. This fixes errors on Mac. Closes #166 Signed-off-by: Russell Bryant <rbryant@redhat.com>	2024-10-02 10:37:41 -07:00
Ashwin Bharambe	9b93ee2c2b	Bump version to 0.0.37	2024-10-02 10:15:08 -07:00
Ashwin Bharambe	227b69e6e6	Fix sample memory impl	2024-10-02 10:13:09 -07:00
Ashwin Bharambe	335dea849a	fix sample impls	2024-10-02 10:10:31 -07:00
Ashwin Bharambe	bf0d111c53	Fix build script	2024-10-02 10:04:23 -07:00
Ashwin Bharambe	4a75d922a9	Make Llama Guard 1B the default	2024-10-02 09:48:26 -07:00
Ashwin Bharambe	cc5029a716	Add special case for prompt guard	2024-10-02 08:43:12 -07:00
Ashwin Bharambe	a80b707ff8	Ensure we always ask for pydantic>=2	2024-10-02 06:29:06 -07:00


278 commits