llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-06-27 18:50:41 +00:00

Author	SHA1	Message	Date
Dinesh Yeduguru	0a3b3d5fb6	migrate scoring fns to resource (#422 ) * fix after rebase * remove print --------- Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-11 17:28:48 -08:00
Dinesh Yeduguru	3802edfc50	migrate evals to resource (#421 ) * migrate evals to resource * remove listing of providers's evals * change the order of params in register * fix after rebase * linter fix --------- Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-11 17:24:03 -08:00
Dinesh Yeduguru	b95cb5308f	migrate dataset to resource (#420 ) * migrate dataset to resource * remove auto discovery * remove listing of providers's datasets * fix after rebase --------- Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-11 17:14:41 -08:00
Dinesh Yeduguru	38cce97597	migrate memory banks to Resource and new registration (#411 ) * migrate memory banks to Resource and new registration * address feedback * address feedback * fix tests * pgvector fix * pgvector fix v2 * remove auto discovery * change register signature to make params required * update client * client fix * use annotated union to parse * remove base MemoryBank inheritence --------- Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-11 17:10:44 -08:00
Xi Yan	b4416b72fd	Folder restructure for evals/datasets/scoring (#419 ) * rename evals related stuff * fix datasetio * fix scoring test * localfs -> LocalFS * refactor scoring * refactor scoring * remove 8b_correctness scoring_fn from tests * tests w/ eval params * scoring fn braintrust fixture * import	2024-11-11 17:35:40 -05:00
Xi Yan	2b7d70ba86	[Evals API][11/n] huggingface dataset provider + mmlu scoring fn (#392 ) * wip * scoring fn api * eval api * eval task * evaluate api update * pre commit * unwrap context -> config * config field doc * typo * naming fix * separate benchmark / app eval * api name * rename * wip tests * wip * datasetio test * delete unused * fixture * scoring resolve * fix scoring register * scoring test pass * score batch * scoring fix * fix eval * test eval works * huggingface provider * datasetdef files * mmlu scoring fn * test wip * remove type ignore * api refactor * add default task_eval_id for routing * add eval_id for jobs * remove type ignore * huggingface provider * wip huggingface register * only keep 1 run_eval * fix optional * register task required * register task required * delete old tests * fix * mmlu loose * refactor * msg * fix tests * move benchmark task def to file * msg * gen openapi * openapi gen * move dataset to hf llamastack repo * remove todo * refactor * add register model to unit test * rename * register to client * delete preregistered dataset/eval task * comments * huggingface -> remote adapter * openapi gen	2024-11-11 14:49:50 -05:00
Ashwin Bharambe	c1f7ba3aed	Split safety into (llama-guard, prompt-guard, code-scanner) (#400 ) Splits the meta-reference safety implementation into three distinct providers: - inline::llama-guard - inline::prompt-guard - inline::code-scanner Note that this PR is a backward incompatible change to the llama stack server. I have added deprecation_error field to ProviderSpec -- the server reads it and immediately barfs. This is used to direct the user with a specific message on what action to perform. An automagical "config upgrade" is a bit too much work to implement right now :/ (Note that we will be gradually prefixing all inline providers with inline:: -- I am only doing this for this set of new providers because otherwise existing configuration files will break even more badly.)	2024-11-11 09:29:18 -08:00
Ashwin Bharambe	4986e46188	Distributions updates (slight updates to ollama, add inline-vllm and remote-vllm) (#408 ) * remote vllm distro * add inline-vllm details, fix things * Write some docs	2024-11-08 18:09:39 -08:00
Xi Yan	ba82021d4b	precommit	2024-11-08 17:58:58 -08:00
Xi Yan	1ebf6447c5	add missing inits	2024-11-08 17:54:24 -08:00
Xi Yan	89c3129f0b	add missing inits	2024-11-08 17:49:29 -08:00
Dinesh Yeduguru	ec644d3418	migrate model to Resource and new registration signature (#410 ) * resource oriented object design for models * add back llama_model field * working tests * register singature fix * address feedback --------- Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-08 16:12:57 -08:00
Dalton Flanagan	5625aef48a	Add pip install helper for test and direct scenarios (#404 ) * initial branch commit * pip install helptext * remove print * pre-commit	2024-11-08 15:18:21 -05:00
Dinesh Yeduguru	d800a16acd	Resource oriented design for shields (#399 ) * init * working bedrock tests * bedrock test for inference fixes * use env vars for bedrock guardrail vars * add register in meta reference * use correct shield impl in meta ref * dont add together fixture * right naming * minor updates * improved registration flow * address feedback --------- Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-08 12:16:11 -08:00
Xi Yan	6192bf43a4	[Evals API][10/n] API updates for EvalTaskDef + new test migration (#379 ) * wip * scoring fn api * eval api * eval task * evaluate api update * pre commit * unwrap context -> config * config field doc * typo * naming fix * separate benchmark / app eval * api name * rename * wip tests * wip * datasetio test * delete unused * fixture * scoring resolve * fix scoring register * scoring test pass * score batch * scoring fix * fix eval * test eval works * remove type ignore * api refactor * add default task_eval_id for routing * add eval_id for jobs * remove type ignore * only keep 1 run_eval * fix optional * register task required * register task required * delete old tests * delete old tests * fixture return impl	2024-11-07 21:24:12 -08:00
Dalton Flanagan	345ae07317	Factor out create_dist_registry (#398 )	2024-11-07 16:13:19 -05:00
Ashwin Bharambe	694c142b89	Add provider deprecation support; change directory structure (#397 ) * Add provider deprecation support; change directory structure * fix a couple dangling imports * move the meta_reference safety dir also	2024-11-07 13:04:53 -08:00
Xi Yan	36e2538eb0	fix together inference validator (#393 )	2024-11-07 11:31:53 -08:00
Yufei (Benny) Chen	31c5fbda5e	[LlamaStack][Fireworks] Update client and add unittest (#390 )	2024-11-07 10:11:28 -08:00
Ashwin Bharambe	489f74a70b	Allow simpler initialization of `RemoteProviderConfig`; fix issue in httpx client	2024-11-06 19:19:26 -08:00
Ashwin Bharambe	064d2a5287	Remove the safety adapter for Together; we can just use "meta-reference" (#387 )	2024-11-06 17:36:57 -08:00
Xi Yan	8fc2d212a2	fix safety signature mismatch (#388 ) * fix safety sig * shield_type->identifier	2024-11-06 16:30:47 -08:00
Ashwin Bharambe	7c340f0236	rename test_inference -> test_text_inference	2024-11-06 16:12:50 -08:00
Ashwin Bharambe	3b54ce3499	remote::vllm now works with vision models	2024-11-06 16:07:17 -08:00
Ashwin Bharambe	994732e2e0	`impls` -> `inline`, `adapters` -> `remote` (#381 )	2024-11-06 14:54:05 -08:00
Ashwin Bharambe	b10e9f46bb	Enable remote::vllm (#384 ) * Enable remote::vllm * Kill the giant list of hard coded models	2024-11-06 14:42:44 -08:00
Dinesh Yeduguru	093c9f1987	add bedrock distribution code (#358 ) * add bedrock distribution code * fix linter error * add bedrock shields support * linter fixes * working bedrock safety * change to return only one violation * remove env var reading * refereshable boto credentials * remove env vars * address raghu's feedback * fix session_ttl passing --------- Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-06 14:39:11 -08:00
Dinesh Yeduguru	6ebd553da5	fix routing tables look up key for memory bank (#383 ) Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-06 13:32:46 -08:00
Xi Yan	748606195b	Kill `llama stack configure` (#371 ) * remove configure * build msg * wip * build->run * delete prints * docs * fix docs, kill configure * precommit * update fireworks build * docs * clean up build * comments * fix * test * remove baking build.yaml into docker * fix msg, urls * configure msg	2024-11-06 13:32:10 -08:00
Ashwin Bharambe	d289afdbde	Fix exception in server when client SSE connection closes	2024-11-06 11:00:34 -08:00
Ashwin Bharambe	cde9bc1388	Enable vision models for (Together, Fireworks, Meta-Reference, Ollama) (#376 ) * Enable vision models for Together and Fireworks * Works with ollama 0.4.0 pre-release with the vision model * localize media for meta_reference inference * Fix	2024-11-05 16:22:33 -08:00
Dinesh Yeduguru	4dd01eeaa1	fix postgres config validation (#380 ) * fix postgres config validation * dont remove types --------- Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-05 15:09:04 -08:00
Dinesh Yeduguru	a2351bf2e9	add ability to persist memory banks created for faiss (#375 ) * init * add tests * fix tests' * more fixes * add tests * make the default path more faiss specific * fix linter --------- Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-05 14:50:23 -08:00
Dinesh Yeduguru	dcd8cfe0f3	add postgres kvstoreimpl (#374 ) * add postgres kvstoreimpl * make table name configurable * add validator for table name * linter fix --------- Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-05 11:42:21 -08:00
Steve Grubb	122793ab92	Correct a traceback in vllm (#366 ) File "/usr/local/lib/python3.10/site-packages/llama_stack/providers/adapters/inference/vllm/vllm.py", line 136, in _stream_chat_completion async for chunk in process_chat_completion_stream_response( TypeError: process_chat_completion_stream_response() takes 2 positional arguments but 3 were given This corrects the error by deleting the request variable	2024-11-04 20:49:35 -08:00
Ashwin Bharambe	a81178f1f5	The server now depends on SQLite by default	2024-11-04 20:35:53 -08:00
Ashwin Bharambe	9a57a009ee	Need to await for get_object_from_identifier() now	2024-11-04 20:33:12 -08:00
Ashwin Bharambe	7cf4c905f3	add support for remote providers in tests	2024-11-04 20:30:46 -08:00
Ashwin Bharambe	0763a0b85f	Fix for the fix!	2024-11-04 20:06:01 -08:00
Ashwin Bharambe	fb2678b134	Fix shield_type and routing table breakage	2024-11-04 19:57:15 -08:00
Ashwin Bharambe	ffedb81c11	Significantly simpler and malleable test setup (#360 ) * Significantly simpler and malleable test setup * convert memory tests * refactor fixtures and add support for composable fixtures * Fix memory to use the newer fixture organization * Get agents tests working * Safety tests work * yet another refactor to make this more general now it accepts --inference-model, --safety-model options also * get multiple providers working for meta-reference (for inference + safety) * Add README.md --------- Co-authored-by: Ashwin Bharambe <ashwin@meta.com>	2024-11-04 17:36:43 -08:00
Dinesh Yeduguru	663883cc29	persist registered objects with distribution (#354 ) * persist registered objects with distribution * linter fixes * comment * use annotate and field discriminator * workign tests * donot use global state * precommit failures fixed * add back Any * fix imports * remove unnecessary changes in ollama * precommit failures fixed * make kvstore configurable for dist and rename registry * add comment about registry list return * fix linter errors * use registry to hydrate * remove debug print * linter fixes * remove kvstore.db * rename distribution_registry_store --------- Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-04 17:25:06 -08:00
Dinesh Yeduguru	c9bf1d7d0b	pgvector fixes (#369 ) Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>	2024-11-04 17:01:09 -08:00
Xi Yan	c810a4184d	[docs] update documentations (#356 ) * move docs -> source * Add files via upload * mv image * Add files via upload * colocate iOS setup doc * delete image * Add files via upload * fix * delete image * Add files via upload * Update developer_cookbook.md * toctree * wip subfolder * docs update * subfolder * updates * name * updates * index * updates * refactor structure * depth * docs * content * docs * getting started * distributions * fireworks * fireworks * update * theme * theme * theme * pdj theme * pytorch theme * css * theme * agents example * format * index * headers * copy button * test tabs * test tabs * fix * tabs * tab * tabs * sphinx_design * quick start commands * size * width * css * css * download models * asthetic fix * tab format * update * css * width * css * docs * tab based * tab * tabs * docs * style * image * css * color * typo * update docs * missing links * list templates * links * links update * troubleshooting * fix * distributions * docs * fix table * kill llamastack-local-gpu/cpu * Update index.md * Update index.md * mv ios_setup.md * Update ios_setup.md * Add remote_or_local.gif * Update ios_setup.md * release notes * typos * Add ios_setup to index * nav bar * hide torctree * ios image * links update * rename * rename * docs * rename * links * distributions * distributions * distributions * distributions * remove release * remote --------- Co-authored-by: dltn <6599399+dltn@users.noreply.github.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2024-11-04 16:52:38 -08:00
Dinesh Yeduguru	ac93dd89cf	fix bedrock impl (#359 ) * fix bedrock impl * fix linter errors * fix return type and remove debug print	2024-11-03 07:32:30 -08:00
Ashwin Bharambe	bf4f97a2e1	Fix vLLM adapter chat_completion signature	2024-11-01 13:09:03 -07:00
Dalton Flanagan	adecb2a2d3	update for message parsing on ios	2024-11-01 14:37:19 -04:00
Ashwin Bharambe	37b330b4ef	add dynamic clients for all APIs (#348 ) * add dynamic clients for all APIs * fix openapi generator * inference + memory + agents tests now pass with "remote" providers * Add docstring which fixes openapi generator :/	2024-10-31 14:46:25 -07:00
Steve Grubb	f04b566c5c	Do not cache pip (#349 ) Pip has a 3.3GB cache of torch and friends. Do not keep this in the image.	2024-10-31 09:52:40 -07:00
Ashwin Bharambe	4aa1bf6a60	Kill --name from llama stack build (#340 )	2024-10-28 23:07:32 -07:00


1278 commits