Ashwin Bharambe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e84d4436b5 
								
							 
						 
						
							
							
								
Since we are pushing for HF repos, we should accept them in inference configs (#497)
							
							
							
							
							# What does this PR do?
As the title says. 
## Test Plan
This needs
8752149f58 
							
						 
						
							2024-11-20 16:14:37 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								57a9b4d57f 
								
							 
						 
						
							
							
								
Allow models to be registered as long as llama model is provided (#472)
							
							
							
							
This PR allows a model to be registered with a provider as long as the user
specifies a llama model, even if the model does not match our
prebuilt provider-specific mapping.
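The rule described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the PR: `PROVIDER_MODEL_ALIASES`, `KNOWN_LLAMA_MODELS`, `Model`, and `register_model` are hypothetical names, and the mapping contents are placeholders.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical prebuilt mapping from a llama model name to a provider model id.
PROVIDER_MODEL_ALIASES = {
    "Llama3.1-8B-Instruct": "fireworks/llama-v3p1-8b-instruct",
}

# Hypothetical set of llama model names the stack recognizes.
KNOWN_LLAMA_MODELS = {"Llama3.1-8B-Instruct", "Llama3.1-70B-Instruct"}


@dataclass
class Model:
    identifier: str
    provider_resource_id: str
    llama_model: Optional[str] = None


def register_model(
    identifier: str,
    provider_model_id: Optional[str] = None,
    llama_model: Optional[str] = None,
) -> Model:
    """Register a model, falling back to a user-supplied llama model."""
    # Case 1: the identifier is covered by the prebuilt provider mapping.
    if identifier in PROVIDER_MODEL_ALIASES:
        return Model(identifier, PROVIDER_MODEL_ALIASES[identifier], identifier)

    # Case 2: no mapping entry, so the user must name a known llama model.
    if llama_model is None:
        raise ValueError(
            f"'{identifier}' is not in the provider mapping; specify a llama model"
        )
    if llama_model not in KNOWN_LLAMA_MODELS:
        raise ValueError(f"'{llama_model}' is not a recognized llama model")

    # Use the identifier itself as the provider id when none is given.
    return Model(identifier, provider_model_id or identifier, llama_model)
```

With this sketch, a custom identifier is rejected on its own but accepted once a recognized llama model accompanies it.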
Test:
pytest -v -s llama_stack/providers/tests/inference/test_model_registration.py -m "together" --env TOGETHER_API_KEY=<KEY>
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com> 
							
						 
						
							2024-11-18 15:05:29 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0850ad656a 
								
							 
						 
						
							
							
								
unregister for memory banks and remove update API (#458)
							
							
							
							
The semantics of an update on a resource are very tricky to reason about,
especially for memory banks and models. The best way forward is for the
user to unregister the resource and register a new one. We don't have
a compelling reason to support update APIs.
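A minimal sketch of the unregister-then-register flow, assuming a simple in-memory store. `MemoryBankRegistry`, `replace_bank`, and the config keys are hypothetical and are not the llama-stack API.

```python
from typing import Any, Dict


class MemoryBankRegistry:
    """Hypothetical in-memory registry with no update operation."""

    def __init__(self) -> None:
        self._banks: Dict[str, Dict[str, Any]] = {}

    def register(self, bank_id: str, config: Dict[str, Any]) -> None:
        # Refuse to overwrite silently: callers must unregister first.
        if bank_id in self._banks:
            raise ValueError(f"memory bank '{bank_id}' is already registered")
        self._banks[bank_id] = dict(config)

    def unregister(self, bank_id: str) -> None:
        if bank_id not in self._banks:
            raise KeyError(f"memory bank '{bank_id}' is not registered")
        del self._banks[bank_id]

    def get(self, bank_id: str) -> Dict[str, Any]:
        return self._banks[bank_id]


def replace_bank(
    registry: MemoryBankRegistry, bank_id: str, new_config: Dict[str, Any]
) -> None:
    """Swap a bank's configuration by removing it and registering it again."""
    registry.unregister(bank_id)
    registry.register(bank_id, new_config)
```

Because `register` rejects duplicates, a changed configuration always goes through an explicit removal, so there is no partially updated state to reason about.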
Tests:
pytest -v -s llama_stack/providers/tests/memory/test_memory.py -m "chroma" --env CHROMA_HOST=localhost --env CHROMA_PORT=8000
pytest -v -s llama_stack/providers/tests/memory/test_memory.py -m "pgvector" --env PGVECTOR_DB=postgres --env PGVECTOR_USER=postgres --env PGVECTOR_PASSWORD=mysecretpassword --env PGVECTOR_HOST=0.0.0.0
$CONDA_PREFIX/bin/pytest -v -s -m "ollama" llama_stack/providers/tests/inference/test_model_registration.py
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com> 
							
						 
						
							2024-11-14 17:12:11 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								efe791bab7 
								
							 
						 
						
							
							
								
Support model resource updates and deletes (#452)
							
							
							
							
							# What does this PR do?
* Changes the registry to store only one RoutableObject per identifier.
Previously it stored a list per identifier, which is not needed.
* Adds an implementation for updates and deletes
* Updates the routing table to handle updates correctly
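The registry change in the list above can be pictured with a small sketch. It assumes a plain dictionary keyed by identifier; the `Registry` class and the fields on this `RoutableObject` are simplified stand-ins, not the actual llama-stack types.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class RoutableObject:
    # Simplified stand-in carrying only the columns shown by `models list`.
    identifier: str
    provider_id: str
    provider_resource_id: str


class Registry:
    """Hypothetical registry holding exactly one object per identifier."""

    def __init__(self) -> None:
        self._objects: Dict[str, RoutableObject] = {}

    def register(self, obj: RoutableObject) -> None:
        self._objects[obj.identifier] = obj

    def get(self, identifier: str) -> Optional[RoutableObject]:
        return self._objects.get(identifier)

    def update(self, identifier: str, **changes: str) -> RoutableObject:
        # Raises KeyError if the identifier was never registered.
        current = self._objects[identifier]
        updated = replace(current, **changes)
        self._objects[identifier] = updated
        return updated

    def delete(self, identifier: str) -> None:
        # Raises KeyError if the identifier was never registered.
        del self._objects[identifier]
```

Keying by identifier makes update and delete single dictionary operations, with no list to search or deduplicate.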
## Test Plan
```
❯ llama-stack-client models list
+------------------------+---------------+------------------------------------+------------+
| identifier             | provider_id   | provider_resource_id               | metadata   |
+========================+===============+====================================+============+
| Llama3.1-405B-Instruct | fireworks-0   | fireworks/llama-v3p1-405b-instruct | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.1-8B-Instruct   | fireworks-0   | fireworks/llama-v3p1-8b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.2-3B-Instruct   | fireworks-0   | fireworks/llama-v3p2-1b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
❯ llama-stack-client models register dineshyv-model --provider-model-id=fireworks/llama-v3p1-70b-instruct
Successfully registered model dineshyv-model
❯ llama-stack-client models list
+------------------------+---------------+------------------------------------+------------+
| identifier             | provider_id   | provider_resource_id               | metadata   |
+========================+===============+====================================+============+
| Llama3.1-405B-Instruct | fireworks-0   | fireworks/llama-v3p1-405b-instruct | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.1-8B-Instruct   | fireworks-0   | fireworks/llama-v3p1-8b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.2-3B-Instruct   | fireworks-0   | fireworks/llama-v3p2-1b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
| dineshyv-model         | fireworks-0   | fireworks/llama-v3p1-70b-instruct  | {}         |
+------------------------+---------------+------------------------------------+------------+
❯ llama-stack-client models update dineshyv-model --provider-model-id=fireworks/llama-v3p1-405b-instruct
Successfully updated model dineshyv-model
❯ llama-stack-client models list
+------------------------+---------------+------------------------------------+------------+
| identifier             | provider_id   | provider_resource_id               | metadata   |
+========================+===============+====================================+============+
| Llama3.1-405B-Instruct | fireworks-0   | fireworks/llama-v3p1-405b-instruct | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.1-8B-Instruct   | fireworks-0   | fireworks/llama-v3p1-8b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.2-3B-Instruct   | fireworks-0   | fireworks/llama-v3p2-1b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
| dineshyv-model         | fireworks-0   | fireworks/llama-v3p1-405b-instruct | {}         |
+------------------------+---------------+------------------------------------+------------+
❯ llama-stack-client models delete dineshyv-model
❯ llama-stack-client models list
+------------------------+---------------+------------------------------------+------------+
| identifier             | provider_id   | provider_resource_id               | metadata   |
+========================+===============+====================================+============+
| Llama3.1-405B-Instruct | fireworks-0   | fireworks/llama-v3p1-405b-instruct | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.1-8B-Instruct   | fireworks-0   | fireworks/llama-v3p1-8b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
| Llama3.2-3B-Instruct   | fireworks-0   | fireworks/llama-v3p2-1b-instruct   | {}         |
+------------------------+---------------+------------------------------------+------------+
```
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com> 
							
						 
						
							2024-11-13 21:55:41 -08:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dinesh Yeduguru 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								787e2034b7 
								
							 
						 
						
							
							
								
model registration in ollama and vllm check against the available models in the provider (#446)
							
							
							
							
tests:
pytest -v -s -m "ollama" llama_stack/providers/tests/inference/test_text_inference.py
pytest -v -s -m vllm_remote llama_stack/providers/tests/inference/test_text_inference.py --env VLLM_URL="http://localhost:9798/v1"
--------- 
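The check named in the commit title can be illustrated with a short sketch. It assumes the provider can report the models it currently serves as a collection of ids; `validate_model_available` is a hypothetical helper, and how ollama or vllm actually report their models is not shown here.

```python
from typing import Iterable


def validate_model_available(
    provider_model_id: str, available_models: Iterable[str]
) -> str:
    """Return the model id if the provider serves it, otherwise raise."""
    available = set(available_models)
    if provider_model_id not in available:
        raise ValueError(
            f"model '{provider_model_id}' is not available in the provider; "
            f"available models: {sorted(available)}"
        )
    return provider_model_id
```

Failing at registration time surfaces a missing model immediately instead of at the first inference request.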
							
						 
						
							2024-11-13 13:04:06 -08:00