Xi Yan
cb77426fb5
fix fireworks ( #427 )
2024-11-12 12:15:55 -05:00
Xi Yan
ec4fcad5ca
fix eval task registration ( #426 )
...
* fix eval tasks
* fix eval tasks
* fix eval tests
2024-11-12 11:51:34 -05:00
Xi Yan
84c6fbbd93
fix tests after registration migration & rename meta-reference -> basic / llm_as_judge provider ( #424 )
...
* rename meta-reference -> basic
* config rename
* impl rename
* rename llm_as_judge, fix test
* util
* rebase
* naming fix
2024-11-12 10:35:44 -05:00
Ashwin Bharambe
3d7561e55c
Rename all inline providers with an inline:: prefix ( #423 )
2024-11-11 22:19:16 -08:00
Ashwin Bharambe
f4426f6a43
Fix bug in llama stack build; SERVER_DEPENDENCIES were dropped
2024-11-11 20:12:13 -08:00
Ashwin Bharambe
506b99242a
Allow specifying TEST / PYPI VERSION for docker name
2024-11-11 19:56:42 -08:00
Ashwin Bharambe
36da9a600e
add explicit platform
2024-11-11 19:30:15 -08:00
Ashwin Bharambe
218803b7c8
add pypi version to docker tag
2024-11-11 19:20:31 -08:00
Ashwin Bharambe
343458479d
Make sure TEST_PYPI_VERSION is used in docker builds
2024-11-11 18:40:13 -08:00
Ashwin Bharambe
285cd26fb2
Replace colon in path so it doesn't cause issue on Windows
2024-11-11 17:33:53 -08:00
Dinesh Yeduguru
0a3b3d5fb6
migrate scoring fns to resource ( #422 )
...
* fix after rebase
* remove print
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-11 17:28:48 -08:00
Dinesh Yeduguru
3802edfc50
migrate evals to resource ( #421 )
...
* migrate evals to resource
* remove listing of providers' evals
* change the order of params in register
* fix after rebase
* linter fix
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-11 17:24:03 -08:00
Dinesh Yeduguru
b95cb5308f
migrate dataset to resource ( #420 )
...
* migrate dataset to resource
* remove auto discovery
* remove listing of providers' datasets
* fix after rebase
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-11 17:14:41 -08:00
Dinesh Yeduguru
38cce97597
migrate memory banks to Resource and new registration ( #411 )
...
* migrate memory banks to Resource and new registration
* address feedback
* address feedback
* fix tests
* pgvector fix
* pgvector fix v2
* remove auto discovery
* change register signature to make params required
* update client
* client fix
* use annotated union to parse
* remove base MemoryBank inheritance
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-11 17:10:44 -08:00
Xi Yan
b4416b72fd
Folder restructure for evals/datasets/scoring ( #419 )
...
* rename evals related stuff
* fix datasetio
* fix scoring test
* localfs -> LocalFS
* refactor scoring
* refactor scoring
* remove 8b_correctness scoring_fn from tests
* tests w/ eval params
* scoring fn braintrust fixture
* import
2024-11-11 17:35:40 -05:00
Xi Yan
2b7d70ba86
[Evals API][11/n] huggingface dataset provider + mmlu scoring fn ( #392 )
...
* wip
* scoring fn api
* eval api
* eval task
* evaluate api update
* pre commit
* unwrap context -> config
* config field doc
* typo
* naming fix
* separate benchmark / app eval
* api name
* rename
* wip tests
* wip
* datasetio test
* delete unused
* fixture
* scoring resolve
* fix scoring register
* scoring test pass
* score batch
* scoring fix
* fix eval
* test eval works
* huggingface provider
* datasetdef files
* mmlu scoring fn
* test wip
* remove type ignore
* api refactor
* add default task_eval_id for routing
* add eval_id for jobs
* remove type ignore
* huggingface provider
* wip huggingface register
* only keep 1 run_eval
* fix optional
* register task required
* register task required
* delete old tests
* fix
* mmlu loose
* refactor
* msg
* fix tests
* move benchmark task def to file
* msg
* gen openapi
* openapi gen
* move dataset to hf llamastack repo
* remove todo
* refactor
* add register model to unit test
* rename
* register to client
* delete preregistered dataset/eval task
* comments
* huggingface -> remote adapter
* openapi gen
2024-11-11 14:49:50 -05:00
Ashwin Bharambe
c1f7ba3aed
Split safety into (llama-guard, prompt-guard, code-scanner) ( #400 )
...
Splits the meta-reference safety implementation into three distinct providers:
- inline::llama-guard
- inline::prompt-guard
- inline::code-scanner
Note that this PR is a backward incompatible change to the llama stack server. I have added deprecation_error field to ProviderSpec -- the server reads it and immediately barfs. This is used to direct the user with a specific message on what action to perform. An automagical "config upgrade" is a bit too much work to implement right now :/
(Note that we will be gradually prefixing all inline providers with inline:: -- I am only doing this for this set of new providers because otherwise existing configuration files will break even more badly.)
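The deprecation behavior described above can be sketched roughly as follows. This is only an illustration, not the actual server code: the `ProviderSpec` here is a simplified stand-in with just two fields, and `check_not_deprecated` is a hypothetical helper name. The only detail taken from the commit message is that a `deprecation_error` field exists and that the server fails immediately with its message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProviderSpec:
    # Simplified stand-in (assumption): the real ProviderSpec carries more fields.
    provider_type: str
    # When set, the provider is deprecated and this text tells the user what to do.
    deprecation_error: Optional[str] = None


def check_not_deprecated(spec: ProviderSpec) -> None:
    """Hypothetical helper: fail fast, surfacing the provider's own message."""
    if spec.deprecation_error:
        raise RuntimeError(
            f"Provider '{spec.provider_type}' is deprecated: {spec.deprecation_error}"
        )


# A current provider passes the check silently.
check_not_deprecated(ProviderSpec(provider_type="inline::llama-guard"))
```

Failing at startup with a targeted message stands in for the automatic config upgrade that the commit message says was not implemented.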
2024-11-11 09:29:18 -08:00
Ashwin Bharambe
4986e46188
Distributions updates (slight updates to ollama, add inline-vllm and remote-vllm) ( #408 )
...
* remote vllm distro
* add inline-vllm details, fix things
* Write some docs
2024-11-08 18:09:39 -08:00
Xi Yan
ba82021d4b
precommit
2024-11-08 17:58:58 -08:00
Xi Yan
1ebf6447c5
add missing inits
2024-11-08 17:54:24 -08:00
Xi Yan
89c3129f0b
add missing inits
2024-11-08 17:49:29 -08:00
Dinesh Yeduguru
ec644d3418
migrate model to Resource and new registration signature ( #410 )
...
* resource oriented object design for models
* add back llama_model field
* working tests
* register signature fix
* address feedback
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-08 16:12:57 -08:00
Dalton Flanagan
5625aef48a
Add pip install helper for test and direct scenarios ( #404 )
...
* initial branch commit
* pip install helptext
* remove print
* pre-commit
2024-11-08 15:18:21 -05:00
Dinesh Yeduguru
d800a16acd
Resource oriented design for shields ( #399 )
...
* init
* working bedrock tests
* bedrock test for inference fixes
* use env vars for bedrock guardrail vars
* add register in meta reference
* use correct shield impl in meta ref
* dont add together fixture
* right naming
* minor updates
* improved registration flow
* address feedback
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-08 12:16:11 -08:00
Xi Yan
6192bf43a4
[Evals API][10/n] API updates for EvalTaskDef + new test migration ( #379 )
...
* wip
* scoring fn api
* eval api
* eval task
* evaluate api update
* pre commit
* unwrap context -> config
* config field doc
* typo
* naming fix
* separate benchmark / app eval
* api name
* rename
* wip tests
* wip
* datasetio test
* delete unused
* fixture
* scoring resolve
* fix scoring register
* scoring test pass
* score batch
* scoring fix
* fix eval
* test eval works
* remove type ignore
* api refactor
* add default task_eval_id for routing
* add eval_id for jobs
* remove type ignore
* only keep 1 run_eval
* fix optional
* register task required
* register task required
* delete old tests
* delete old tests
* fixture return impl
2024-11-07 21:24:12 -08:00
Dalton Flanagan
345ae07317
Factor out create_dist_registry ( #398 )
2024-11-07 16:13:19 -05:00
Ashwin Bharambe
694c142b89
Add provider deprecation support; change directory structure ( #397 )
...
* Add provider deprecation support; change directory structure
* fix a couple dangling imports
* move the meta_reference safety dir also
2024-11-07 13:04:53 -08:00
Xi Yan
36e2538eb0
fix together inference validator ( #393 )
2024-11-07 11:31:53 -08:00
Yufei (Benny) Chen
31c5fbda5e
[LlamaStack][Fireworks] Update client and add unittest ( #390 )
2024-11-07 10:11:28 -08:00
Ashwin Bharambe
489f74a70b
Allow simpler initialization of RemoteProviderConfig; fix issue in httpx client
2024-11-06 19:19:26 -08:00
Ashwin Bharambe
064d2a5287
Remove the safety adapter for Together; we can just use "meta-reference" ( #387 )
2024-11-06 17:36:57 -08:00
Xi Yan
8fc2d212a2
fix safety signature mismatch ( #388 )
...
* fix safety sig
* shield_type->identifier
2024-11-06 16:30:47 -08:00
Ashwin Bharambe
7c340f0236
rename test_inference -> test_text_inference
2024-11-06 16:12:50 -08:00
Ashwin Bharambe
3b54ce3499
remote::vllm now works with vision models
2024-11-06 16:07:17 -08:00
Ashwin Bharambe
994732e2e0
impls -> inline, adapters -> remote ( #381 )
2024-11-06 14:54:05 -08:00
Ashwin Bharambe
b10e9f46bb
Enable remote::vllm ( #384 )
...
* Enable remote::vllm
* Kill the giant list of hard coded models
2024-11-06 14:42:44 -08:00
Dinesh Yeduguru
093c9f1987
add bedrock distribution code ( #358 )
...
* add bedrock distribution code
* fix linter error
* add bedrock shields support
* linter fixes
* working bedrock safety
* change to return only one violation
* remove env var reading
* refreshable boto credentials
* remove env vars
* address raghu's feedback
* fix session_ttl passing
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-06 14:39:11 -08:00
Dinesh Yeduguru
6ebd553da5
fix routing tables look up key for memory bank ( #383 )
...
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-06 13:32:46 -08:00
Xi Yan
748606195b
Kill llama stack configure ( #371 )
...
* remove configure
* build msg
* wip
* build->run
* delete prints
* docs
* fix docs, kill configure
* precommit
* update fireworks build
* docs
* clean up build
* comments
* fix
* test
* remove baking build.yaml into docker
* fix msg, urls
* configure msg
2024-11-06 13:32:10 -08:00
Ashwin Bharambe
d289afdbde
Fix exception in server when client SSE connection closes
2024-11-06 11:00:34 -08:00
Ashwin Bharambe
cde9bc1388
Enable vision models for (Together, Fireworks, Meta-Reference, Ollama) ( #376 )
...
* Enable vision models for Together and Fireworks
* Works with ollama 0.4.0 pre-release with the vision model
* localize media for meta_reference inference
* Fix
2024-11-05 16:22:33 -08:00
Dinesh Yeduguru
4dd01eeaa1
fix postgres config validation ( #380 )
...
* fix postgres config validation
* dont remove types
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-05 15:09:04 -08:00
Dinesh Yeduguru
a2351bf2e9
add ability to persist memory banks created for faiss ( #375 )
...
* init
* add tests
* fix tests
* more fixes
* add tests
* make the default path more faiss specific
* fix linter
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-05 14:50:23 -08:00
Dinesh Yeduguru
dcd8cfe0f3
add postgres kvstoreimpl ( #374 )
...
* add postgres kvstoreimpl
* make table name configurable
* add validator for table name
* linter fix
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-05 11:42:21 -08:00
Steve Grubb
122793ab92
Correct a traceback in vllm ( #366 )
...
File "/usr/local/lib/python3.10/site-packages/llama_stack/providers/adapters/inference/vllm/vllm.py", line 136, in _stream_chat_completion
async for chunk in process_chat_completion_stream_response(
TypeError: process_chat_completion_stream_response() takes 2 positional arguments but 3 were given
This corrects the error by deleting the request variable
2024-11-04 20:49:35 -08:00
Ashwin Bharambe
a81178f1f5
The server now depends on SQLite by default
2024-11-04 20:35:53 -08:00
Ashwin Bharambe
9a57a009ee
Need to await get_object_from_identifier() now
2024-11-04 20:33:12 -08:00
Ashwin Bharambe
7cf4c905f3
add support for remote providers in tests
2024-11-04 20:30:46 -08:00
Ashwin Bharambe
0763a0b85f
Fix for the fix!
2024-11-04 20:06:01 -08:00
Ashwin Bharambe
fb2678b134
Fix shield_type and routing table breakage
2024-11-04 19:57:15 -08:00