llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-04 02:03:44 +00:00

Author	SHA1	Message	Date
Ashwin Bharambe	eccd7dc4a9	Avoid warnings from pydantic for overriding schema Also fix structured output in completions	2024-10-28 21:39:48 -07:00
Xi Yan	ed833bb758	[Evals API][7/n] braintrust scoring provider (#333 ) * wip scoring refactor * llm as judge, move folders * test full generation + eval * extract score regex to llm context * remove prints, cleanup braintrust in this branch * braintrust skeleton * datasetio test fix * braintrust provider * remove prints * dependencies * change json -> class * json -> class * remove initialize * address nits * check identifier prefix * braintrust scoring identifier check, rebase * udpate MANIFEST * manifest * remove braintrust scoring_fn * remove comments * tests * imports fix	2024-10-28 18:59:35 -07:00
Xi Yan	ae671eaf7a	distro readmes with model serving instructions (#339 ) * readme updates * quantied compose * dell tgi * config update * readme * update model serving readmes * update * update * config	2024-10-28 17:47:14 -07:00
Xi Yan	a70a4706fc	update distributions compose/readme (#338 ) * readme updates * quantied compose * dell tgi * config update	2024-10-28 16:34:43 -07:00
Xi Yan	985ff4d6ce	update distributions/readmes	2024-10-28 15:10:40 -07:00
Xi Yan	7b8748c53e	[Evals API][6/n] meta-reference llm as judge, registration for ScoringFnDefs (#330 ) * wip scoring refactor * llm as judge, move folders * test full generation + eval * extract score regex to llm context * remove prints, cleanup braintrust in this branch * change json -> class * remove initialize * address nits * check identifier prefix * udpate MANIFEST	2024-10-28 14:08:42 -07:00
Xi Yan	04a4784287	Update README.md	2024-10-28 13:25:44 -07:00
Xi Yan	3fa1eaf37d	Update README.md	2024-10-28 13:18:55 -07:00
Xi Yan	0d4215e125	Update README.md	2024-10-28 13:18:34 -07:00
Xi Yan	8f5a850de9	Update README.md	2024-10-28 13:16:23 -07:00
Xi Yan	ffb3965ade	remove Field for return_type	2024-10-28 13:04:41 -07:00
Ashwin Bharambe	b7d2b83d55	Allow passing provider_registry to resolve_impls()	2024-10-28 11:58:16 -07:00
Ashwin Bharambe	8a3b64d1be	Bump version to 0.0.47	2024-10-27 22:30:38 -07:00
Xi Yan	46bb8884a7	distributions readme typos	2024-10-27 11:57:21 -07:00
Dalton Flanagan	44c05c6e7d	add vision instruct models for fireworks	2024-10-27 17:54:54 +00:00
Dinesh Yeduguru	9b85d9a841	completion() for fireworks (#329)	2024-10-25 16:12:10 -07:00
Dinesh Yeduguru	7ec79f3b9d	completion() for together (#324) * completion() for together * test fixes * fix client building	2024-10-25 14:21:12 -07:00
Xi Yan	8a74e400d6	Update getting_started.md	2024-10-25 13:30:33 -07:00
Xi Yan	f168752bba	Update getting_started.md	2024-10-25 13:27:43 -07:00
Xi Yan	abdf7cddf3	[Evals API][4/n] evals with generation meta-reference impl (#303 ) * wip * dataset validation * test_scoring * cleanup * clean up test * comments * error checking * dataset client * test client: * datasetio client * clean up * basic scoring function works * scorer wip * equality scorer * score batch impl * score batch * update scoring test * refactor * validate scorer input * address comments * evals with generation * add all rows scores to ScoringResult * minor typing * bugfix * scoring function def rename * rebase name * refactor * address comments * Update iOS inference instructions for new quantization * Small updates to quantization config * Fix score threshold in faiss * Bump version to 0.0.45 * Handle both ipv6 and ipv4 interfaces together * update manifest for build templates * Update getting_started.md * chatcompletion & completion input type validation * inclusion->subsetof * error checking * scoring_function -> scoring_fn rename, scorer -> scoring_fn rename * address comments * [Evals API][5/n] fixes to generate openapi spec (#323) * generate openapi * typing comment, dataset -> dataset_id * remove custom type * sample eval run.yaml --------- Co-authored-by: Dalton Flanagan <6599399+dltn@users.noreply.github.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2024-10-25 13:12:39 -07:00
Ashwin Bharambe	426d821e7f	Bump version to 0.0.46	2024-10-25 13:10:55 -07:00
Sachin Mehta	c05fbf14b3	Added hadamard transform for spinquant (#326 ) * Added hadamard transform for spinquant * Changed from config to model_args * Added an assertion for model args * Use enum.value to check against str * pre-commit --------- Co-authored-by: Sachin Mehta <sacmehta@fb.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2024-10-25 12:58:48 -07:00
Xi Yan	07f9bf723f	fix broken --list-templates with adding build.yaml files for packaging (#327 ) * add build files to templates * fix templates * manifest * symlink * symlink * precommit * change everything to docker build.yaml * remove image_type in templates * fix build from templates CLI * fix readmes	2024-10-25 12:51:22 -07:00
Ashwin Bharambe	afae4e3d8e	Update docker build flow a little	2024-10-25 10:06:21 -07:00
Ashwin Bharambe	5bed6c276c	Move function around	2024-10-25 09:18:22 -07:00
Ashwin Bharambe	a387ca22e2	Update docker_base for meta-reference-gpu	2024-10-25 09:13:33 -07:00
Ashwin Bharambe	70d59b0f5d	Make vllm inference better Tests still don't pass completely (some hang) so I think there are some potential threading issues maybe	2024-10-24 22:52:47 -07:00
Xi Yan	cb43caa2c3	start_container.sh prefix llamastack->distribution name	2024-10-24 21:29:17 -07:00
Sarthak Deshpande	df141b6ef3	Fix for get_agents_session (#300)	2024-10-24 18:36:27 -07:00
Justin Lee	b6d8246b82	added templates and enhanced readme (#307 ) Co-authored-by: Justin Lee <justinai@fb.com>	2024-10-24 17:07:06 -07:00
Dinesh Yeduguru	3e1c3fdb3f	completion() for tgi (#295)	2024-10-24 16:02:41 -07:00
Xi Yan	cb84034567	[Evals API][3/n] scoring_functions / scoring meta-reference implementations (#296 ) * wip * dataset validation * test_scoring * cleanup * clean up test * comments * error checking * dataset client * test client: * datasetio client * clean up * basic scoring function works * scorer wip * equality scorer * score batch impl * score batch * update scoring test * refactor * validate scorer input * address comments * add all rows scores to ScoringResult * bugfix * scoring function def rename	2024-10-24 14:52:30 -07:00
Xi Yan	e70420a06e	Update getting_started.md	2024-10-24 14:19:35 -07:00
Xi Yan	8615bc9e08	update manifest for build templates	2024-10-24 14:04:13 -07:00
Ashwin Bharambe	94728d6983	Handle both ipv6 and ipv4 interfaces together	2024-10-24 13:59:01 -07:00
Ashwin Bharambe	0538cc297e	Bump version to 0.0.45	2024-10-24 12:14:18 -07:00
Ashwin Bharambe	205bcfdd4e	Fix score threshold in faiss	2024-10-24 12:11:58 -07:00
Ashwin Bharambe	161aef0aae	Small updates to quantization config	2024-10-24 12:08:56 -07:00
Dalton Flanagan	8eceebec98	Update iOS inference instructions for new quantization	2024-10-24 14:47:27 -04:00
Ashwin Bharambe	8aa8847b4a	Bump version to 0.0.44	2024-10-24 08:41:39 -07:00
Ashwin Bharambe	7afe51c84d	New quantized models (#301)	2024-10-24 08:38:56 -07:00
Ashwin Bharambe	05a8d47b98	Add a meta-reference-quantized-gpu distribution	2024-10-23 21:45:50 -07:00
Xi Yan	f5dcc03742	use pytorch/pytorch as base	2024-10-23 20:22:00 -07:00
Xi Yan	0cec86453b	Fix issue w/ routing_table api getting added when router api is not specified (#298 ) * fix issue w/ enforcing api * cleanup * inference only yaml	2024-10-23 15:27:22 -07:00
Dinesh Yeduguru	21f2e9adf5	dont set num_predict for all providers (#294)	2024-10-23 11:44:04 -07:00
Ashwin Bharambe	ffb561070d	Support structured output for Together (#289)	2024-10-22 22:36:38 -07:00
Sarthak Deshpande	2e5e46d896	Added tests for persistence (#274)	2024-10-22 19:41:46 -07:00
Xi Yan	821810657f	[Evals API][2/n] datasets / datasetio meta-reference implementation (#288 ) * skeleton dataset / datasetio * dataset datasetio * config * address comments * delete dataset_utils * address comments * naming fix	2024-10-22 16:12:16 -07:00
Sarthak Deshpande	8a01b9e40c	Added implementations for get_agents_session, delete_agents_session and delete_agents (#267)	2024-10-22 13:50:43 -07:00
Suraj Subramanian	b81a3bd46a	Fix import conflict for SamplingParams (#285 ) Conflict between llama_models.llama3.api.datatypes.SamplingParams and vllm.sampling_params.SamplingParams results in errors while processing VLLM engine requests	2024-10-22 12:56:00 -07:00


585 commits