llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-08 19:10:56 +00:00

Author	SHA1	Message	Date
Ashwin Bharambe	238e658cdf	Kill irrelevant (now) method	2024-10-09 22:11:18 -07:00
Ashwin Bharambe	77a486f176	added tool calling test	2024-10-09 22:01:28 -07:00
Ashwin Bharambe	ef4b74c935	Add a simple agents test case	2024-10-09 21:52:49 -07:00
Ashwin Bharambe	2d94ca71a9	Pass memory bank API to agent impl	2024-10-09 21:16:57 -07:00
Ashwin Bharambe	6788173ffc	re-gen openapi spec	2024-10-09 21:13:11 -07:00
Ashwin Bharambe	fcd22b6baa	Make Safety test work, other cleanup	2024-10-09 21:09:50 -07:00
Ashwin Bharambe	ba1f294cc6	Safety test placeholder	2024-10-09 19:35:48 -07:00
Ashwin Bharambe	b55034c0de	Another round of simplification and clarity for models/shields/memory_banks stuff	2024-10-09 19:19:26 -07:00
Ashwin Bharambe	73a0a34e39	Kill non-llama guard shields	2024-10-08 17:47:03 -07:00
Ashwin Bharambe	24c61403b7	Fixes	2024-10-08 17:43:25 -07:00
Ashwin Bharambe	a86f3ae07d	Update run.yaml	2024-10-08 17:41:06 -07:00
Ashwin Bharambe	924b1fba09	minor	2024-10-08 17:26:26 -07:00
Ashwin Bharambe	f40cd62306	Test fixes	2024-10-08 17:23:42 -07:00
Ashwin Bharambe	8eee5b9adc	Fix server conditional awaiting on coroutines	2024-10-08 17:23:42 -07:00
Ashwin Bharambe	216e7eb4d5	Move `async with SEMAPHORE` inside the async methods	2024-10-08 17:23:42 -07:00
Ashwin Bharambe	4540d8bd87	move codeshield into an independent safety provider	2024-10-08 17:23:42 -07:00
Ashwin Bharambe	380b9dab90	regen openapi specs	2024-10-08 17:23:42 -07:00
Ashwin Bharambe	7f1160296c	Updates to server.py to clean up streaming vs non-streaming stuff. Also make sure agent turn create is correctly marked	2024-10-08 17:23:42 -07:00
Ashwin Bharambe	640c5c54f7	rename augment_messages	2024-10-08 17:23:42 -07:00
Ashwin Bharambe	336cf7a674	update vllm; not quite tested yet	2024-10-08 17:23:42 -07:00
Ashwin Bharambe	ed899a5dec	Convert TGI to work with openai_compat	2024-10-08 17:23:42 -07:00
Ashwin Bharambe	05e73d12b3	introduce openai_compat with the completions (not chat-completions) API. This keeps the prompt encoding layer in our control (see `chat_completion_request_to_prompt()` method)	2024-10-08 17:23:42 -07:00
Ashwin Bharambe	0c9eb3341c	Separate chat_completion stream and non-stream implementations. This is a pretty important requirement. The streaming response type is an AsyncGenerator while the non-stream one is a single object. So far this has worked _sometimes_ due to various pre-existing hacks (and in some cases, just failed.)	2024-10-08 17:23:40 -07:00
Ashwin Bharambe	f8752ab8dc	weaviate fixes, test now passes	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	f21ad1173e	improve memory test, but it fails on chromadb :/	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	4ab6e1b81a	Add really basic testing for memory API. weaviate does not work; the cluster URL seems malformed	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	dba7caf1d0	Fix fireworks and update the test. Don't look for eom_id / eot_id sadly since providers don't return the last token	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	bbd3a02615	Make Together inference work using the raw completions API	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	3ae2b712e8	Add inference test. Run it as: `PROVIDER_ID=test-remote PROVIDER_CONFIG=$PWD/llama_stack/providers/tests/inference/provider_config_example.yaml pytest -s llama_stack/providers/tests/inference/test_inference.py --tb=auto --disable-warnings`	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	4fa467731e	Fix a bug in meta-reference inference when stream=False. Also introduce a gross hack (to cover grosser(?) hack) to ensure non-stream requests don't send back responses in SSE format. Not sure which of these hacks is grosser.	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	353c7dc82a	A few bug fixes for covering corner cases	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	a05599c67a	Weaviate "should" work (i.e., is code-complete) but not tested	2024-10-08 17:23:02 -07:00
Zain Hasan	118c0ef105	Partial cleanup of weaviate	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	862f8ddb8d	more memory related fixes; memory.client now works	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	3725e74906	memory bank registration fixes	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	099a95b614	slight upgrade to CLI	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	1550187cd8	cleanup	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	91e0063593	Introduce model_store, shield_store, memory_bank_store	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	e45a417543	more fixes, plug shutdown handlers still, FastAPIs sigint handler is not calling ours	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	60dead6196	apis_to_serve -> apis	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	59302a86df	inference registry updates	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	4215cc9331	Push registration methods onto the backing providers	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	5a7b01d292	Significantly upgrade the interactive configuration experience	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	8d157a8197	rename	2024-10-08 17:23:02 -07:00
Ashwin Bharambe	f3923e3f0b	Redo the { models, shields, memory_banks } typeset	2024-10-08 17:23:02 -07:00
Xi Yan	6b094b72d3	Update cli_reference.md	2024-10-08 15:32:06 -07:00
Xi Yan	ce70d21f65	Add files via upload	2024-10-08 15:29:19 -07:00
Dalton Flanagan	2d4f7d8acf	Create SECURITY.md	2024-10-08 13:30:40 -04:00
Yuan Tang	48d0d2001e	Add classifiers in setup.py (#217) * Add classifiers in setup.py * Update setup.py * Update setup.py	2024-10-08 06:55:16 -07:00
Xi Yan	4d5f7459aa	[bugfix] Fix logprobs on meta-reference impl (#213) * fix log probs * add back LogProbsConfig * error handling * bugfix	2024-10-07 19:42:39 -07:00
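
Commit 0c9eb3341c notes that the streaming response type is an AsyncGenerator while the non-stream one is a single object. The following is a minimal sketch of that distinction in plain asyncio, not the repository's actual code: the function names `chat_completion_non_stream` and `chat_completion_stream` and the echo behaviour are hypothetical and chosen only for illustration.

```python
import asyncio
from typing import AsyncGenerator, List, Tuple


async def chat_completion_non_stream(prompt: str) -> str:
    # Hypothetical non-streaming path: a coroutine that resolves to one object.
    return f"echo: {prompt}"


async def chat_completion_stream(prompt: str) -> AsyncGenerator[str, None]:
    # Hypothetical streaming path: an async generator that yields chunks.
    for word in f"echo: {prompt}".split():
        yield word


async def main() -> Tuple[str, List[str]]:
    # The non-stream result is awaited once.
    single = await chat_completion_non_stream("hello world")
    # The stream result is iterated with `async for`, not awaited.
    chunks = [chunk async for chunk in chat_completion_stream("hello world")]
    return single, chunks


single, chunks = asyncio.run(main())
```

Because the two call shapes differ (one is awaited, the other is iterated with `async for`), a caller that handles both through a single code path has to branch on the request type, which is presumably why the commit keeps the two implementations separate.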


325 commits