Xi Yan
a6091fa158
server
2024-12-26 18:35:06 -08:00
Xi Yan
74de9bebd1
registry
2024-12-26 18:34:00 -08:00
Xi Yan
27da763af9
more fixes
2024-12-26 18:30:42 -08:00
Xi Yan
6596caed55
vllm
2024-12-26 18:25:28 -08:00
Xi Yan
206554e853
stack imports
2024-12-26 18:23:40 -08:00
Xi Yan
3c84f491ec
imports
2024-12-26 18:21:53 -08:00
Xi Yan
7c12cda244
llama guard
2024-12-26 18:18:01 -08:00
Xi Yan
f58e92f8d3
prompt guard
2024-12-26 18:15:55 -08:00
Xi Yan
61be406b49
scoring
2024-12-26 18:14:53 -08:00
Xi Yan
fcac7cfafa
braintrust
2024-12-26 18:13:43 -08:00
Xi Yan
71d50ab368
telemetry & sample
2024-12-26 18:12:51 -08:00
Xi Yan
c4b9b3cb52
huggingface
2024-12-26 18:11:10 -08:00
Xi Yan
d40e527471
bedrock
2024-12-26 18:10:23 -08:00
Xi Yan
28428c320a
databricks
2024-12-26 18:08:50 -08:00
Xi Yan
6f7f02fbad
fireworks
2024-12-26 18:08:08 -08:00
Xi Yan
f97638a323
ollama import remove
2024-12-26 18:07:18 -08:00
Xi Yan
165777a181
impls imports remove
2024-12-26 18:05:19 -08:00
Xi Yan
b641902bfa
impls imports remove
2024-12-26 18:01:45 -08:00
Xi Yan
c1ef055f39
test prompt adapter
2024-12-26 17:49:17 -08:00
Xi Yan
2fe4acd64d
text inference
2024-12-26 17:45:25 -08:00
Xi Yan
16cfe1014e
vision inference
2024-12-26 17:31:42 -08:00
Xi Yan
3b1f20ac00
memory tests fix
2024-12-26 17:27:01 -08:00
Xi Yan
3f86c19150
builds
2024-12-26 17:21:23 -08:00
Xi Yan
8a8550fe9b
cli imports
2024-12-26 17:19:40 -08:00
Xi Yan
21a6bd57ea
fix imports
2024-12-26 17:17:03 -08:00
Xi Yan
c6d3fc6fb6
datatypes
2024-12-26 17:00:56 -08:00
Xi Yan
6c6b5fb091
openai_compat
2024-12-26 16:59:06 -08:00
Xi Yan
9ab0730294
kvstore
2024-12-26 16:55:40 -08:00
Xi Yan
30fee82407
vector_store
2024-12-26 16:54:33 -08:00
Xi Yan
b7bc1c6297
telemetry
2024-12-26 16:48:54 -08:00
Xi Yan
bb0a3f5c8e
remove more imports
2024-12-26 16:43:30 -08:00
Xi Yan
93ed8aa814
remove more imports
2024-12-26 16:39:31 -08:00
Xi Yan
0a0c01fbc2
test agents imports
2024-12-26 16:32:23 -08:00
Xi Yan
9bdb7236b2
Merge branch 'main' into remove_import_stars
2024-12-26 15:50:12 -08:00
Xi Yan
88c967a3e2
fix client-sdk memory/safety test
2024-12-26 15:49:15 -08:00
Xi Yan
b05d8fd956
fix client-sdk agents/inference test
2024-12-26 15:49:14 -08:00
Xi Yan
19c99e36a0
update playground doc video
2024-12-26 15:49:14 -08:00
Xi Yan
70db039ff4
fix client-sdk memory/safety test
2024-12-26 15:48:28 -08:00
Xi Yan
b6aca4c8bb
fix client-sdk agents/inference test
2024-12-26 15:44:34 -08:00
Xi Yan
da26d22f90
remove imports 1/n
2024-12-26 15:19:06 -08:00
Xi Yan
4e1d0a2fc5
update playground doc video
2024-12-26 14:50:19 -08:00
Xi Yan
28ce511986
fix --endpoint docs
2024-12-26 14:32:07 -08:00
Ikko Eltociear Ashimine
7ba95a8e74
docs: update evals_reference/index.md (#675)
# What does this PR do?
minor fix
## Sources
Please link relevant resources if necessary.
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-12-26 11:32:37 -08:00
Aidan Do
21fb92d7cf
Add 3.3 70B to Ollama inference provider (#681)
# What does this PR do?
Adds 3.3 70B support to Ollama inference provider
## Test Plan
<details>
<summary>Manual</summary>
```bash
# 42GB to download
ollama pull llama3.3:70b
ollama run llama3.3:70b --keepalive 60m
export LLAMA_STACK_PORT=5000
pip install -e . \
&& llama stack build --template ollama --image-type conda \
&& llama stack run ./distributions/ollama/run.yaml \
--port $LLAMA_STACK_PORT \
--env INFERENCE_MODEL=Llama3.3-70B-Instruct \
--env OLLAMA_URL=http://localhost:11434
export LLAMA_STACK_PORT=5000
llama-stack-client --endpoint http://localhost:$LLAMA_STACK_PORT \
inference chat-completion \
--model-id Llama3.3-70B-Instruct \
--message "hello, what model are you?"
```
<img width="1221" alt="image"
src="https://github.com/user-attachments/assets/dcffbdd9-94c8-4d47-9f95-4ef6c3756294"
/>
</details>
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-12-25 22:15:58 -08:00
Yuan Tang
fa371fdc9e
Removed unnecessary CONDA_PREFIX env var in installation guide (#683)
This is not needed since `conda activate stack` has already been
executed.
2024-12-23 13:17:30 -08:00
Yuan Tang
987e651755
Add missing venv option in --image-type (#677)
"venv" option is supported but not mentioned in the prompt.
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2024-12-21 21:10:13 -08:00
Botao Chen
bae197c37e
Fix post training apis broken by torchtune release (#674)
There was a torchtune release this morning
(https://github.com/pytorch/torchtune/releases/tag/v0.5.0) that breaks the
post-training APIs.
## Test
Spun up the server; post training works again after the fix.
<img width="1314" alt="Screenshot 2024-12-20 at 4 08 54 PM"
src="https://github.com/user-attachments/assets/dfae724d-ebf0-4846-9715-096efa060cee"
/>
## Note
We need to think hard about how to avoid this happening again, and do a fast
follow-up on this after the holidays.
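The Note above asks how to avoid this happening again. One common mitigation (an illustration only, not something this PR adds) is to pin the provider's dependency below the breaking release in a requirements constraint:

```
# Illustrative requirements pin (hypothetical file and line, not the repo's actual constraint)
torchtune>=0.4.0,<0.5.0  # hold back the breaking v0.5.0 release until the APIs are adapted
```

A pin like this trades automatic upgrades for stability; it needs to be revisited once the new release is supported.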
2024-12-20 16:12:02 -08:00
Botao Chen
06cb0c837e
[torchtune integration] post training + eval (#670)
## What does this PR do?
- Add the related APIs to the experimental-post-training template to enable
eval on the finetuned checkpoint
- A small bug fix on meta reference eval
- A small error-handling improvement on post training
## Test Plan
From the client side, issued an E2E post-training request
(https://github.com/meta-llama/llama-stack-client-python/pull/70) and got
eval results successfully
<img width="1315" alt="Screenshot 2024-12-20 at 12 06 59 PM"
src="https://github.com/user-attachments/assets/a09bd524-59ae-490c-908f-2e36ccf27c0a"
/>
2024-12-20 13:43:13 -08:00
Dinesh Yeduguru
c8be0bf1c9
Tools API with brave and MCP providers (#639)
This PR adds a new Tools API and two tool runtime providers: Brave and MCP.
Test plan:
```
curl -X POST 'http://localhost:5000/alpha/toolgroups/register' \
-H 'Content-Type: application/json' \
-d '{ "tool_group_id": "simple_tool",
"tool_group": {
"type": "model_context_protocol",
"endpoint": {"uri": "http://localhost:56000/sse"}
},
"provider_id": "model-context-protocol"
}'
curl -X POST 'http://localhost:5000/alpha/toolgroups/register' \
-H 'Content-Type: application/json' \
-d '{
"tool_group_id": "search", "provider_id": "brave-search",
"tool_group": {
"type": "user_defined",
"tools": [
{
"name": "brave_search",
"description": "A web search tool",
"parameters": [
{
"name": "query",
"parameter_type": "string",
"description": "The query to search"
}
],
"metadata": {},
"tool_prompt_format": "json"
}
]
}
}'
curl -X GET http://localhost:5000/alpha/tools/list | jq .
[
{
"identifier": "brave_search",
"provider_resource_id": "brave_search",
"provider_id": "brave-search",
"type": "tool",
"tool_group": "search",
"description": "A web search tool",
"parameters": [
{
"name": "query",
"parameter_type": "string",
"description": "The query to search"
}
],
"metadata": {},
"tool_prompt_format": "json"
},
{
"identifier": "fetch",
"provider_resource_id": "fetch",
"provider_id": "model-context-protocol",
"type": "tool",
"tool_group": "simple_tool",
"description": "Fetches a website and returns its content",
"parameters": [
{
"name": "url",
"parameter_type": "string",
"description": "URL to fetch"
}
],
"metadata": {
"endpoint": "http://localhost:56000/sse"
},
"tool_prompt_format": "json"
}
]
curl -X POST 'http://localhost:5000/alpha/tool-runtime/invoke' \
-H 'Content-Type: application/json' \
-d '{
"tool_name": "fetch",
"args": {
"url": "http://google.com/ "
}
}'
curl -X POST 'http://localhost:5000/alpha/tool-runtime/invoke' \
-H 'Content-Type: application/json' -H 'X-LlamaStack-ProviderData: {"api_key": "<KEY>"}' \
-d '{
"tool_name": "brave_search",
"args": {
"query": "who is meta ceo"
}
}'
```
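For programmatic use, the registration payload from the first curl command above can be built in Python. This is an illustrative sketch, not a client SDK call: the endpoint URI, ids, and field names are taken verbatim from the test plan.

```python
import json

# Registration body for the MCP tool group, mirroring the first curl call
# above. The URI and ids come straight from the test plan.
mcp_registration = {
    "tool_group_id": "simple_tool",
    "tool_group": {
        "type": "model_context_protocol",
        "endpoint": {"uri": "http://localhost:56000/sse"},
    },
    "provider_id": "model-context-protocol",
}

# Serialize exactly as curl's -d flag would send it.
body = json.dumps(mcp_registration)
print(body)
```

POSTing `body` to `/alpha/toolgroups/register` with a `Content-Type: application/json` header reproduces the first request in the test plan.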
2024-12-19 21:25:17 -08:00
Aidan Do
17fdb47e5e
Add Llama 3.3 70B to fireworks (#654)
# What does this PR do?
- Makes Llama 3.3 70B available on Fireworks
## Test Plan
```shell
pip install -e . \
&& llama stack build --config distributions/fireworks/build.yaml --image-type conda \
&& llama stack run distributions/fireworks/run.yaml \
--port 5000
```
```python
response = client.inference.chat_completion(
model_id="Llama3.3-70B-Instruct",
messages=[
{"role": "user", "content": "hello world"},
],
)
```
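The snippet above assumes a `client` object has already been constructed. As an illustrative sketch (the SDK import shown in comments is an assumption about the llama-stack-client package, not part of this PR; the field names come from the snippet itself), the call corresponds to a JSON request body like:

```python
import json

# Hypothetical client setup (an assumption, not shown in the PR):
#   from llama_stack_client import LlamaStackClient
#   client = LlamaStackClient(base_url="http://localhost:5000")

# The chat_completion call above maps to a request body of this shape:
request_body = {
    "model_id": "Llama3.3-70B-Instruct",
    "messages": [{"role": "user", "content": "hello world"}],
}
print(json.dumps(request_body))
```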
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-12-19 17:32:49 -08:00