Steve Grubb 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								122793ab92 
								
							 
						 
						
							
							
								
								Correct a traceback in vllm ( #366 )  
							
							
							
							
							File "/usr/local/lib/python3.10/site-packages/llama_stack/providers/adapters/inference/vllm/vllm.py", line 136, in _stream_chat_completion
async for chunk in process_chat_completion_stream_response(
TypeError: process_chat_completion_stream_response() takes 2 positional arguments but 3 were given
This corrects the error by deleting the request variable 
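
The failure mode in this traceback can be reproduced in isolation. The sketch below uses a hypothetical stand-in, `process_stream`, with an assumed two-parameter signature; the parameter names `stream` and `formatter` are illustrative and are not taken from llama_stack. Passing a surplus third positional argument raises the same kind of TypeError, and dropping it lets the call succeed.

```python
import asyncio


async def process_stream(stream, formatter):
    """Hypothetical two-parameter async generator (a stand-in, not the llama_stack helper)."""
    async for item in stream:
        yield formatter(item)


async def numbers():
    for n in (1, 2, 3):
        yield n


async def collect(*args):
    # Calling an async generator function binds its arguments immediately,
    # so a wrong argument count raises TypeError before iteration starts.
    return [chunk async for chunk in process_stream(*args)]


def run_with_extra_argument():
    request = object()  # the surplus argument (hypothetical)
    try:
        asyncio.run(collect(request, numbers(), str))
    except TypeError as exc:
        return str(exc)
    return None


def run_without_extra_argument():
    return asyncio.run(collect(numbers(), str))
```

Calling `run_with_extra_argument()` returns the error text, while `run_without_extra_argument()` returns the formatted chunks.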
							
						 
						
							2024-11-04 20:49:35 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								a81178f1f5 
								
							 
						 
						
							
							
								
								The server now depends on SQLite by default  
							
							
							
						 
						
							2024-11-04 20:35:53 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								9a57a009ee 
								
							 
						 
						
							
							
								
								Need to await for get_object_from_identifier() now  
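
As a general illustration of why a call site needs `await` once a function becomes a coroutine, here is a minimal sketch. `get_object` is a hypothetical stand-in and does not reproduce the signature or behaviour of `get_object_from_identifier()`.

```python
import asyncio
import inspect


async def get_object(identifier):
    """Hypothetical async lookup (stand-in for a function that became a coroutine)."""
    await asyncio.sleep(0)
    return {"identifier": identifier}


async def main():
    pending = get_object("model-a")  # no await: this is a coroutine object, not the result
    forgot_await = inspect.iscoroutine(pending)
    result = await pending  # awaiting produces the actual value
    return forgot_await, result
```

Running `asyncio.run(main())` shows that the un-awaited call yields a coroutine object and that the value only appears after `await`.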
							
							
							
						 
						
							2024-11-04 20:33:12 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								7cf4c905f3 
								
							 
						 
						
							
							
								
								add support for remote providers in tests  
							
							
							
						 
						
							2024-11-04 20:30:46 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								0763a0b85f 
								
							 
						 
						
							
							
								
								Fix for the fix!  
							
							
							
						 
						
							2024-11-04 20:06:01 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								fb2678b134 
								
							 
						 
						
							
							
								
								Fix shield_type and routing table breakage  
							
							
							
						 
						
							2024-11-04 19:57:15 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ffedb81c11 
								
							 
						 
						
							
							
								
								Significantly simpler and malleable test setup ( #360 )  
							
							
							
							
							* Significantly simpler and malleable test setup
* convert memory tests
* refactor fixtures and add support for composable fixtures
* Fix memory to use the newer fixture organization
* Get agents tests working
* Safety tests work
* yet another refactor to make this more general
now it accepts --inference-model, --safety-model options also
* get multiple providers working for meta-reference (for inference + safety)
* Add README.md
---------
Co-authored-by: Ashwin Bharambe <ashwin@meta.com> 
							
						 
						
							2024-11-04 17:36:43 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								663883cc29 
								
							 
						 
						
							
							
								
								persist registered objects with distribution ( #354 )  
							
							
							
							
							* persist registered objects with distribution
* linter fixes
* comment
* use annotate and field discriminator
* working tests
* do not use global state
* precommit failures fixed
* add back Any
* fix imports
* remove unnecessary changes in ollama
* precommit failures fixed
* make kvstore configurable for dist and rename registry
* add comment about registry list return
* fix linter errors
* use registry to hydrate
* remove debug print
* linter fixes
* remove kvstore.db
* rename distribution_registry_store
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com> 
							
						 
						
							2024-11-04 17:25:06 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c9bf1d7d0b 
								
							 
						 
						
							
							
								
								pgvector fixes ( #369 )  
							
							
							
							
							Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com> 
							
						 
						
							2024-11-04 17:01:09 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xi Yan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c810a4184d 
								
							 
						 
						
							
							
								
								[docs] update documentation ( #356 )  
							
							
							
							
							* move docs -> source
* Add files via upload
* mv image
* Add files via upload
* colocate iOS setup doc
* delete image
* Add files via upload
* fix
* delete image
* Add files via upload
* Update developer_cookbook.md
* toctree
* wip subfolder
* docs update
* subfolder
* updates
* name
* updates
* index
* updates
* refactor structure
* depth
* docs
* content
* docs
* getting started
* distributions
* fireworks
* fireworks
* update
* theme
* theme
* theme
* pdj theme
* pytorch theme
* css
* theme
* agents example
* format
* index
* headers
* copy button
* test tabs
* test tabs
* fix
* tabs
* tab
* tabs
* sphinx_design
* quick start commands
* size
* width
* css
* css
* download models
* aesthetic fix
* tab format
* update
* css
* width
* css
* docs
* tab based
* tab
* tabs
* docs
* style
* image
* css
* color
* typo
* update docs
* missing links
* list templates
* links
* links update
* troubleshooting
* fix
* distributions
* docs
* fix table
* kill llamastack-local-gpu/cpu
* Update index.md
* Update index.md
* mv ios_setup.md
* Update ios_setup.md
* Add remote_or_local.gif
* Update ios_setup.md
* release notes
* typos
* Add ios_setup to index
* nav bar
* hide toctree
* ios image
* links update
* rename
* rename
* docs
* rename
* links
* distributions
* distributions
* distributions
* distributions
* remove release
* remote
---------
Co-authored-by: dltn <6599399+dltn@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com> 
							
						 
						
							2024-11-04 16:52:38 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ac93dd89cf 
								
							 
						 
						
							
							
								
								fix bedrock impl ( #359 )  
							
							
							
							
							* fix bedrock impl
* fix linter errors
* fix return type and remove debug print 
							
						 
						
							2024-11-03 07:32:30 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								bf4f97a2e1 
								
							 
						 
						
							
							
								
								Fix vLLM adapter chat_completion signature  
							
							
							
						 
						
							2024-11-01 13:09:03 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dalton Flanagan 
								
							 
						 
						
							
							
							
							
								
							
							
								adecb2a2d3 
								
							 
						 
						
							
							
								
								update for message parsing on ios  
							
							
							
						 
						
							2024-11-01 14:37:19 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								37b330b4ef 
								
							 
						 
						
							
							
								
								add dynamic clients for all APIs ( #348 )  
							
							
							
							
							* add dynamic clients for all APIs
* fix openapi generator
* inference + memory + agents tests now pass with "remote" providers
* Add docstring which fixes openapi generator :/ 
							
						 
						
							2024-10-31 14:46:25 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Steve Grubb 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f04b566c5c 
								
							 
						 
						
							
							
								
								Do not cache pip ( #349 )  
							
							
							
							
							Pip has a 3.3GB cache of torch and friends. Do not keep this in the image. 
							
						 
						
							2024-10-31 09:52:40 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4aa1bf6a60 
								
							 
						 
						
							
							
								
								Kill --name from llama stack build ( #340 )  
							
							
							
						 
						
							2024-10-28 23:07:32 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								26d1668f7d 
								
							 
						 
						
							
							
								
								Revert "remove Field for return_type"  
							
							
							
							
							This reverts commit ffb3965ade 
							
						 
						
							2024-10-28 21:39:48 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								eccd7dc4a9 
								
							 
						 
						
							
							
								
								Avoid warnings from pydantic for overriding schema  
							
							
							
							
							Also fix structured output in completions 
							
						 
						
							2024-10-28 21:39:48 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xi Yan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ed833bb758 
								
							 
						 
						
							
							
								
								[Evals API][7/n] braintrust scoring provider ( #333 )  
							
							
							
							
							* wip scoring refactor
* llm as judge, move folders
* test full generation + eval
* extract score regex to llm context
* remove prints, cleanup braintrust in this branch
* braintrust skeleton
* datasetio test fix
* braintrust provider
* remove prints
* dependencies
* change json -> class
* json -> class
* remove initialize
* address nits
* check identifier prefix
* braintrust scoring identifier check, rebase
* update MANIFEST
* manifest
* remove braintrust scoring_fn
* remove comments
* tests
* imports fix 
							
						 
						
							2024-10-28 18:59:35 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xi Yan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a70a4706fc 
								
							 
						 
						
							
							
								
								update distributions compose/readme ( #338 )  
							
							
							
							
							* readme updates
* quantized compose
* dell tgi
* config update 
							
						 
						
							2024-10-28 16:34:43 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xi Yan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7b8748c53e 
								
							 
						 
						
							
							
								
								[Evals API][6/n] meta-reference llm as judge, registration for ScoringFnDefs ( #330 )  
							
							
							
							
							* wip scoring refactor
* llm as judge, move folders
* test full generation + eval
* extract score regex to llm context
* remove prints, cleanup braintrust in this branch
* change json -> class
* remove initialize
* address nits
* check identifier prefix
* update MANIFEST 
							
						 
						
							2024-10-28 14:08:42 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xi Yan 
								
							 
						 
						
							
							
							
							
								
							
							
								ffb3965ade 
								
							 
						 
						
							
							
								
								remove Field for return_type  
							
							
							
						 
						
							2024-10-28 13:04:41 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								b7d2b83d55 
								
							 
						 
						
							
							
								
								Allow passing provider_registry to resolve_impls()  
							
							
							
						 
						
							2024-10-28 11:58:16 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dalton Flanagan 
								
							 
						 
						
							
							
							
							
								
							
							
								44c05c6e7d 
								
							 
						 
						
							
							
								
								add vision instruct models for fireworks  
							
							
							
						 
						
							2024-10-27 17:54:54 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9b85d9a841 
								
							 
						 
						
							
							
								
								completion() for fireworks ( #329 )  
							
							
							
						 
						
							2024-10-25 16:12:10 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7ec79f3b9d 
								
							 
						 
						
							
							
								
								completion() for together ( #324 )  
							
							... 
							
							
							
							* completion() for together
* test fixes
* fix client building 
							
						 
						
							2024-10-25 14:21:12 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xi Yan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								abdf7cddf3 
								
							 
						 
						
							
							
								
								[Evals API][4/n] evals with generation meta-reference impl ( #303 )  
							
							
							
							
							* wip
* dataset validation
* test_scoring
* cleanup
* clean up test
* comments
* error checking
* dataset client
* test client:
* datasetio client
* clean up
* basic scoring function works
* scorer wip
* equality scorer
* score batch impl
* score batch
* update scoring test
* refactor
* validate scorer input
* address comments
* evals with generation
* add all rows scores to ScoringResult
* minor typing
* bugfix
* scoring function def rename
* rebase name
* refactor
* address comments
* Update iOS inference instructions for new quantization
* Small updates to quantization config
* Fix score threshold in faiss
* Bump version to 0.0.45
* Handle both ipv6 and ipv4 interfaces together
* update manifest for build templates
* Update getting_started.md
* chatcompletion & completion input type validation
* inclusion->subsetof
* error checking
* scoring_function -> scoring_fn rename, scorer -> scoring_fn rename
* address comments
* [Evals API][5/n] fixes to generate openapi spec (#323)
* generate openapi
* typing comment, dataset -> dataset_id
* remove custom type
* sample eval run.yaml
---------
Co-authored-by: Dalton Flanagan <6599399+dltn@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com> 
							
						 
						
							2024-10-25 13:12:39 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Sachin Mehta 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c05fbf14b3 
								
							 
						 
						
							
							
								
								Added hadamard transform for spinquant ( #326 )  
							
							
							
							
							* Added hadamard transform for spinquant
* Changed from config to model_args
* Added an assertion for model args
* Use enum.value to check against str
* pre-commit
---------
Co-authored-by: Sachin Mehta <sacmehta@fb.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com> 
							
						 
						
							2024-10-25 12:58:48 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xi Yan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								07f9bf723f 
								
							 
						 
						
							
							
								
								fix broken --list-templates with adding build.yaml files for packaging ( #327 )  
							
							
							
							
							* add build files to templates
* fix templates
* manifest
* symlink
* symlink
* precommit
* change everything to docker build.yaml
* remove image_type in templates
* fix build from templates CLI
* fix readmes 
							
						 
						
							2024-10-25 12:51:22 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								afae4e3d8e 
								
							 
						 
						
							
							
								
								Update docker build flow a little  
							
							
							
						 
						
							2024-10-25 10:06:21 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								5bed6c276c 
								
							 
						 
						
							
							
								
								Move function around  
							
							
							
						 
						
							2024-10-25 09:18:22 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								70d59b0f5d 
								
							 
						 
						
							
							
								
								Make vllm inference better  
							
							
							
							
							Tests still don't pass completely (some hang), so there may be
threading issues 
							
						 
						
							2024-10-24 22:52:47 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xi Yan 
								
							 
						 
						
							
							
							
							
								
							
							
								cb43caa2c3 
								
							 
						 
						
							
							
								
								start_container.sh prefix llamastack->distribution name  
							
							
							
						 
						
							2024-10-24 21:29:17 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Sarthak Deshpande 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								df141b6ef3 
								
							 
						 
						
							
							
								
								Fix for get_agents_session ( #300 )  
							
							
							
						 
						
							2024-10-24 18:36:27 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3e1c3fdb3f 
								
							 
						 
						
							
							
								
								completion() for tgi ( #295 )  
							
							
							
						 
						
							2024-10-24 16:02:41 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xi Yan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cb84034567 
								
							 
						 
						
							
							
								
								[Evals API][3/n] scoring_functions / scoring meta-reference implementations ( #296 )  
							
							
							
							
							* wip
* dataset validation
* test_scoring
* cleanup
* clean up test
* comments
* error checking
* dataset client
* test client:
* datasetio client
* clean up
* basic scoring function works
* scorer wip
* equality scorer
* score batch impl
* score batch
* update scoring test
* refactor
* validate scorer input
* address comments
* add all rows scores to ScoringResult
* bugfix
* scoring function def rename 
							
						 
						
							2024-10-24 14:52:30 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								94728d6983 
								
							 
						 
						
							
							
								
								Handle both ipv6 and ipv4 interfaces together  
							
							
							
						 
						
							2024-10-24 13:59:01 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								205bcfdd4e 
								
							 
						 
						
							
							
								
								Fix score threshold in faiss  
							
							
							
						 
						
							2024-10-24 12:11:58 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								161aef0aae 
								
							 
						 
						
							
							
								
								Small updates to quantization config  
							
							
							
						 
						
							2024-10-24 12:08:56 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dalton Flanagan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8eceebec98 
								
							 
						 
						
							
							
								
								Update iOS inference instructions for new quantization  
							
							
							
						 
						
							2024-10-24 14:47:27 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7afe51c84d 
								
							 
						 
						
							
							
								
								New quantized models ( #301 )  
							
							
							
						 
						
							2024-10-24 08:38:56 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
							
							
								
							
							
								05a8d47b98 
								
							 
						 
						
							
							
								
								Add a meta-reference-quantized-gpu distribution  
							
							
							
						 
						
							2024-10-23 21:45:50 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xi Yan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0cec86453b 
								
							 
						 
						
							
							
								
								Fix issue w/ routing_table api getting added when router api is not specified ( #298 )  
							
							
							
							
							* fix issue w/ enforcing api
* cleanup
* inference only yaml 
							
						 
						
							2024-10-23 15:27:22 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								21f2e9adf5 
								
							 
						 
						
							
							
								
								dont set num_predict for all providers ( #294 )  
							
							
							
						 
						
							2024-10-23 11:44:04 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ffb561070d 
								
							 
						 
						
							
							
								
								Support structured output for Together ( #289 )  
							
							
							
						 
						
							2024-10-22 22:36:38 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Sarthak Deshpande 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2e5e46d896 
								
							 
						 
						
							
							
								
								Added tests for persistence ( #274 )  
							
							
							
						 
						
							2024-10-22 19:41:46 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xi Yan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								821810657f 
								
							 
						 
						
							
							
								
								[Evals API][2/n] datasets / datasetio meta-reference implementation ( #288 )  
							
							
							
							
							* skeleton dataset / datasetio
* dataset datasetio
* config
* address comments
* delete dataset_utils
* address comments
* naming fix 
							
						 
						
							2024-10-22 16:12:16 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Sarthak Deshpande 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8a01b9e40c 
								
							 
						 
						
							
							
								
								Added implementations for get_agents_session, delete_agents_session and delete_agents ( #267 )  
							
							
							
						 
						
							2024-10-22 13:50:43 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Suraj Subramanian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b81a3bd46a 
								
							 
						 
						
							
							
								
								Fix import conflict for SamplingParams ( #285 )  
							
							
							
							
							A conflict between llama_models.llama3.api.datatypes.SamplingParams and vllm.sampling_params.SamplingParams causes errors while processing vLLM engine requests 
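
The kind of name clash described here can be shown without either library installed. The sketch below registers two hypothetical stand-in modules (`fake_api_datatypes` and `fake_engine_params`, both invented for illustration) that each export a class named `SamplingParams`. It then shows how the second import silently rebinds the name, and how an import alias keeps both classes reachable. Aliasing is one common remedy; the commit does not say which approach it took.

```python
import sys
import types

# Two hypothetical stand-in modules that each export a class named SamplingParams.
api_mod = types.ModuleType("fake_api_datatypes")
engine_mod = types.ModuleType("fake_engine_params")


class _ApiSamplingParams:
    origin = "api"


class _EngineSamplingParams:
    origin = "engine"


api_mod.SamplingParams = _ApiSamplingParams
engine_mod.SamplingParams = _EngineSamplingParams
sys.modules["fake_api_datatypes"] = api_mod
sys.modules["fake_engine_params"] = engine_mod

# Conflict: the second import rebinds the name without any warning.
from fake_api_datatypes import SamplingParams
from fake_engine_params import SamplingParams

# The API class is no longer reachable under this name.
shadowed_origin = SamplingParams.origin

# One way to resolve it: give one of the imports an alias.
from fake_api_datatypes import SamplingParams
from fake_engine_params import SamplingParams as EngineSamplingParams
```

After the aliased imports, `SamplingParams` refers to the API class again and `EngineSamplingParams` refers to the engine class.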
							
						 
						
							2024-10-22 12:56:00 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c06718fbd5 
								
							 
						 
						
							
							
								
								Add support for Structured Output / Guided decoding ( #281 )  
							
							
							
							
							Added support for structured output in the API and added a reference implementation for meta-reference.
A few notes:
* Two formats are specified in the API: JSON schema and EBNF-based grammar
* The implementation only supports JSON for now
We use lm-format-enforcer to provide the implementation right now, but this may change, especially because BNF grammars aren't supported by that library.
Fireworks supports structured output and Together has limited support for it too. Subsequent PRs will add these changes. We would like all our inference providers to provide structured output for Llama models, since it is an extremely important and highly sought-after capability among developers. 
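
To make the JSON schema format more concrete, here is a minimal sketch of a schema and a check that a model's text output conforms to it. Everything in it is an assumption for illustration: the schema fields, the `conforms` helper, and the tiny subset of JSON schema it understands are not part of the API described above. Guided decoding constrains tokens during generation, whereas this sketch only validates the text afterwards.

```python
import json

# Hypothetical JSON schema describing the desired output shape.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "year": {"type": "integer"},
    },
    "required": ["name", "year"],
}

# Mapping from a few JSON schema primitive types to Python types.
_PY_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def conforms(text, schema):
    """Check a flat JSON object against a tiny subset of JSON schema."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    if any(key not in data for key in schema.get("required", [])):
        return False
    for key, spec in schema.get("properties", {}).items():
        if key in data:
            value = data[key]
            # bool is a subclass of int in Python; reject it for non-boolean fields.
            if isinstance(value, bool) and spec["type"] != "boolean":
                return False
            if not isinstance(value, _PY_TYPES[spec["type"]]):
                return False
    return True
```

For example, `conforms('{"name": "Llama", "year": 2024}', schema)` accepts the output, while a missing `year` or a string-valued `year` is rejected.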
							
						 
						
							2024-10-22 12:53:34 -07:00