Xi Yan
5b7d24b1c3
wip
2024-10-07 17:27:06 -07:00
Xi Yan
4764762dd4
tasks registry
2024-10-07 15:57:39 -07:00
Xi Yan
16ba0fa06f
Update README.md
2024-10-07 11:24:27 -07:00
Russell Bryant
996efa9b42
README.md: Add vLLM to providers table ( #207 )
...
Signed-off-by: Russell Bryant <russell.bryant@gmail.com>
2024-10-07 10:26:52 -07:00
Xi Yan
2366e18873
refactor docs ( #209 )
2024-10-07 10:21:26 -07:00
Mindaugas
53d440e952
Fix ValueError in case chunks are empty ( #206 )
2024-10-07 08:55:06 -07:00
Russell Bryant
a4e775c465
download: improve help text ( #204 )
2024-10-07 08:40:04 -07:00
Ashwin Bharambe
4263764493
Fix adapter_id -> adapter_type for Weaviate
2024-10-07 06:46:32 -07:00
Zain Hasan
f4f7618120
add Weaviate memory adapter ( #95 )
2024-10-06 22:21:50 -07:00
Xi Yan
27587f32bc
fix db path
2024-10-06 11:46:08 -07:00
Xi Yan
cfe3ad33b3
fix db path
2024-10-06 11:45:35 -07:00
Prithu Dasgupta
7abab7604b
add databricks provider ( #83 )
...
* add databricks provider
* update provider and test
2024-10-05 23:35:54 -07:00
Russell Bryant
f73e247ba1
Inline vLLM inference provider ( #181 )
...
This is just like `local` using `meta-reference` for everything except
it uses `vllm` for inference.
Docker works, but So far, `conda` is a bit easier to use with the vllm
provider. The default container base image does not include all the
necessary libraries for all vllm features. More cuda dependencies are
necessary.
I started changing this base image used in this template, but it also
required changes to the Dockerfile, so it was getting too involved to
include in the first PR.
Working so far:
* `python -m llama_stack.apis.inference.client localhost 5000 --model Llama3.2-1B-Instruct --stream True`
* `python -m llama_stack.apis.inference.client localhost 5000 --model Llama3.2-1B-Instruct --stream False`
Example:
```
$ python -m llama_stack.apis.inference.client localhost 5000 --model Llama3.2-1B-Instruct --stream False
User>hello world, write me a 2 sentence poem about the moon
Assistant>
The moon glows bright in the midnight sky
A beacon of light,
```
I have only tested these models:
* `Llama3.1-8B-Instruct` - across 4 GPUs (tensor_parallel_size = 4)
* `Llama3.2-1B-Instruct` - on a single GPU (tensor_parallel_size = 1)
2024-10-05 23:34:16 -07:00
Xi Yan
29138a5167
Update getting_started.md
2024-10-05 12:28:02 -07:00
Xi Yan
6d4013ac99
Update getting_started.md
2024-10-05 12:14:59 -07:00
Xi Yan
041634192a
move folder
2024-10-05 11:57:21 -07:00
Mindaugas
9d16129603
Add 'url' property to Redis KV config ( #192 )
2024-10-05 11:26:26 -07:00
Xi Yan
6234dd97d5
eleuther eval provider
2024-10-04 13:45:52 -07:00
Ashwin Bharambe
bfb0e92034
Bump version to 0.0.40
2024-10-04 09:33:43 -07:00
Ashwin Bharambe
dc75aab547
Add setuptools dependency
2024-10-04 09:30:54 -07:00
Dalton Flanagan
441052b0fd
avoid jq since non-standard on macOS
2024-10-04 10:11:43 -04:00
Dalton Flanagan
9bf2e354ae
CLI now requires jq
2024-10-04 10:05:59 -04:00
Xi Yan
2441e66d14
evals api mvp
2024-10-04 00:50:03 -07:00
Xi Yan
3cbe3a72e8
mvp
2024-10-04 00:25:57 -07:00
raghotham
00ed9a410b
Update getting_started.md
...
update discord invite link
2024-10-03 23:28:43 -07:00
AshleyT3
734f59d3b8
Check that the model is found before use. ( #182 )
2024-10-03 23:24:47 -07:00
Xi Yan
4f07aca309
get task
2024-10-03 17:31:46 -07:00
Ashwin Bharambe
f913b57397
fix fp8 imports
2024-10-03 14:40:21 -07:00
Xi Yan
8339b2cef3
wip api
2024-10-03 13:47:15 -07:00
Xi Yan
7143ecfc0d
wip
2024-10-03 11:36:18 -07:00
Ashwin Bharambe
8d41e6caa9
Bump version to 0.0.39
2024-10-03 11:31:03 -07:00
Ashwin Bharambe
7f49315822
Kill a derpy import
2024-10-03 11:25:58 -07:00
Xi Yan
62d266f018
[CLI] avoid configure twice ( #171 )
...
* avoid configure twice
* cleanup tmp config
* update output msg
* address comment
* update msg
* script update
2024-10-03 11:20:54 -07:00
Russell Bryant
06db9213b1
inference: Add model option to client ( #170 )
...
I was running this client for testing purposes and being able to
specify which model to use is a convenient addition. This change makes
that possible.
2024-10-03 11:18:57 -07:00
Xi Yan
5e9301de90
wip
2024-10-03 11:18:23 -07:00
Ashwin Bharambe
210b71b0ba
fix prompt guard ( #177 )
...
Several other fixes to configure. Add support for 1b/3b models in ollama.
2024-10-03 11:07:53 -07:00
Xi Yan
b9b1e8b08b
[bugfix] conda path lookup ( #179 )
...
* fix conda lookup
* comments
2024-10-03 10:45:16 -07:00
raghotham
d74501f75c
Update README.md
...
Added pypi package version
2024-10-03 10:21:16 -07:00
Ashwin Bharambe
c02a90e4c8
Bump version to 0.0.38
2024-10-03 05:42:47 -07:00
Ashwin Bharambe
e9f6150588
A bit cleanup to avoid breakages
2024-10-02 21:31:09 -07:00
Ashwin Bharambe
988a9cada3
Don't ask for Api.inspect in stack build
2024-10-02 21:10:56 -07:00
Ashwin Bharambe
19ce6bf009
Don't validate prompt-guard anymore
2024-10-02 20:43:57 -07:00
Xi Yan
703ab9385f
fix routing table key list
2024-10-02 18:23:31 -07:00
Ashwin Bharambe
8d049000e3
Add an introspection "Api.inspect" API
2024-10-02 15:41:14 -07:00
Adrian Cole
01d93be948
Adds markdown-link-check and fixes a broken link ( #165 )
...
Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2024-10-02 14:26:20 -07:00
Ashwin Bharambe
fe4aabd690
provider_id => provider_type, adapter_id => adapter_type
2024-10-02 14:05:59 -07:00
Ashwin Bharambe
df68db644b
Refactoring distribution/distribution.py
...
This file was becoming too large and unclear what it housed. Split it
into pieces.
2024-10-02 14:03:02 -07:00
Ashwin Bharambe
546f05bd3f
No automatic pager
2024-10-02 12:26:09 -07:00
Russell Bryant
204eb6d810
docker: Check for selinux before using --security-opt
( #167 )
...
Before using `--security-opt label=disable`, check that SELinux is
enabled. Otherwise, the option is not relevant.
This fixes errors on Mac.
Closes #166
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-10-02 10:37:41 -07:00
Ashwin Bharambe
9b93ee2c2b
Bump version to 0.0.37
2024-10-02 10:15:08 -07:00