llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-12 04:00:42 +00:00

Author	SHA1	Message	Date
Xi Yan	32a496ab0f	Merge branch 'evals_5' into evals_6	2024-10-24 12:01:41 -07:00
Xi Yan	a3a8f32541	add all rows scores to ScoringResult	2024-10-24 11:53:15 -07:00
Xi Yan	737fcb795f	evals with generation	2024-10-24 11:30:13 -07:00
Xi Yan	071dba8871	Merge branch 'main' into evals_5	2024-10-24 09:18:15 -07:00
Ashwin Bharambe	8aa8847b4a	Bump version to 0.0.44	2024-10-24 08:41:39 -07:00
Ashwin Bharambe	7afe51c84d	New quantized models (#301)	2024-10-24 08:38:56 -07:00
Xi Yan	afa0c2b146	address comments	2024-10-23 22:17:38 -07:00
Ashwin Bharambe	05a8d47b98	Add a meta-reference-quantized-gpu distribution	2024-10-23 21:45:50 -07:00
Xi Yan	f5dcc03742	use pytorch/pytorch as base	2024-10-23 20:22:00 -07:00
Xi Yan	59c93548bc	validate scorer input	2024-10-23 17:43:41 -07:00
Xi Yan	0ee82571a8	refactor	2024-10-23 17:30:10 -07:00
Xi Yan	7c803cef86	update scoring test	2024-10-23 17:22:48 -07:00
Xi Yan	3c6555c408	score batch	2024-10-23 16:38:00 -07:00
Xi Yan	eb572faf6f	score batch impl	2024-10-23 16:19:25 -07:00
Xi Yan	4b1d7da030	equality scorer	2024-10-23 16:07:17 -07:00
Xi Yan	cad8c8710b	Merge branch 'main' into evals_5	2024-10-23 15:33:36 -07:00
Xi Yan	caf253e08f	Merge branch 'main' into evals_5	2024-10-23 15:33:00 -07:00
Xi Yan	0cec86453b	Fix issue w/ routing_table api getting added when router api is not specified (#298) * fix issue w/ enforcing api * cleanup * inference only yaml	2024-10-23 15:27:22 -07:00
Xi Yan	35981a1a3b	scorer wip	2024-10-23 15:02:54 -07:00
Xi Yan	70c08e694d	basic scoring function works	2024-10-23 14:42:28 -07:00
Xi Yan	38e31ab525	clean up	2024-10-23 14:08:21 -07:00
Xi Yan	5930a92dc7	datasetio client	2024-10-23 14:04:51 -07:00
Xi Yan	51d5ad67c4	test client:	2024-10-23 13:55:55 -07:00
Xi Yan	bb43369521	dataset client	2024-10-23 13:53:58 -07:00
Xi Yan	c5db025320	error checking	2024-10-23 13:17:47 -07:00
Xi Yan	d8bbce6f7c	comments	2024-10-23 13:16:08 -07:00
Xi Yan	5e1323b5bf	clean up test	2024-10-23 13:08:42 -07:00
Xi Yan	555f6e1531	cleanup	2024-10-23 13:07:15 -07:00
Xi Yan	92e32f80ad	test_scoring	2024-10-23 13:01:49 -07:00
Xi Yan	7c280e18fb	dataset validation	2024-10-23 12:08:39 -07:00
Dinesh Yeduguru	21f2e9adf5	dont set num_predict for all providers (#294)	2024-10-23 11:44:04 -07:00
Ashwin Bharambe	ffb561070d	Support structured output for Together (#289)	2024-10-22 22:36:38 -07:00
Xi Yan	aefa84e70a	wip	2024-10-22 20:00:43 -07:00
Sarthak Deshpande	2e5e46d896	Added tests for persistence (#274)	2024-10-22 19:41:46 -07:00
Xi Yan	821810657f	[Evals API][2/n] datasets / datasetio meta-reference implementation (#288) * skeleton dataset / datasetio * dataset datasetio * config * address comments * delete dataset_utils * address comments * naming fix	2024-10-22 16:12:16 -07:00
Sarthak Deshpande	8a01b9e40c	Added implementations for get_agents_session, delete_agents_session and delete_agents (#267)	2024-10-22 13:50:43 -07:00
Suraj Subramanian	b81a3bd46a	Fix import conflict for SamplingParams (#285) Conflict between llama_models.llama3.api.datatypes.SamplingParams and vllm.sampling_params.SamplingParams results in errors while processing VLLM engine requests	2024-10-22 12:56:00 -07:00
Ashwin Bharambe	c06718fbd5	Add support for Structured Output / Guided decoding (#281) Added support for structured output in the API and added a reference implementation for meta-reference. A few notes: * Two formats are specified in the API: JSON schema and EBNF-based grammar * Implementation only supports JSON for now. We use lm-format-enforcer to provide the implementation right now but may change this, especially because BNF grammars aren't supported by that library. Fireworks has support for structured output and Together has limited support for it too. Subsequent PRs will add these changes. We would like all our inference providers to provide structured output for llama models since it is an extremely important and highly sought-after need by the developers.	2024-10-22 12:53:34 -07:00
Anush	4c3d33e6f4	feat: Qdrant Vector index support (#221) This PR adds support for Qdrant - https://qdrant.tech/ to be used as a vector memory. I've unit-tested the methods to confirm that they work as intended. To run Qdrant: `docker run -p 6333:6333 qdrant/qdrant`	2024-10-22 12:50:19 -07:00
Suraj Subramanian	668a495aba	Add REST api example for chat_completion (#286)	2024-10-22 10:35:20 -07:00
Xi Yan	e45f121c77	[Evals API] [1/n] Initial API (#287) * type system api * datasets api * fix * datasetio api * kill reward scoring * scoring functions + evals * move jobs, fix errors	2024-10-22 09:31:19 -07:00
Xi Yan	b279d3bc58	Update README.md	2024-10-22 08:01:33 -07:00
Dinesh Yeduguru	1d241bf3fe	add completion() for ollama (#280)	2024-10-21 22:26:33 -07:00
raghotham	e2a5a2e10d	first version of readthedocs (#278)	2024-10-22 10:15:58 +05:30
Xi Yan	dbb5ce43fc	Bump version to 0.0.43	2024-10-21 19:10:01 -07:00
Xi Yan	a2ff74a686	telemetry WARNING->WARN fix	2024-10-21 18:52:48 -07:00
Xi Yan	b1451afbc8	Update README.md	2024-10-21 18:21:30 -07:00
Xi Yan	4d2bd2d39e	add more distro templates (#279) * verify dockers * together distro verified * readme * fireworks distro * fireworks compose up * fireworks verified	2024-10-21 18:15:08 -07:00
Xi Yan	cf27d19dd5	fix sse_generator async	2024-10-21 14:03:42 -07:00
Ashwin Bharambe	1944405dca	Update new_api_provider.md	2024-10-21 14:02:51 -07:00


372 commits