Commit graph

813 commits

Author SHA1 Message Date
Xi Yan
db49fc8ad0 more robust agent test 2024-12-27 11:20:56 -08:00
Xi Yan
52d1e4f85e import 2024-12-27 11:11:14 -08:00
Xi Yan
e337e8f742 more robust agent test 2024-12-27 10:46:47 -08:00
Xi Yan
562ef41ff8 fix tests 2024-12-26 18:56:23 -08:00
Xi Yan
50764d76a7 agents remove imports 2024-12-26 18:47:46 -08:00
Xi Yan
b936503784 inspect 2024-12-26 18:42:57 -08:00
Xi Yan
a6091fa158 server 2024-12-26 18:35:06 -08:00
Xi Yan
74de9bebd1 registry 2024-12-26 18:34:00 -08:00
Xi Yan
27da763af9 more fixes 2024-12-26 18:30:42 -08:00
Xi Yan
6596caed55 vllm 2024-12-26 18:25:28 -08:00
Xi Yan
206554e853 stack imports 2024-12-26 18:23:40 -08:00
Xi Yan
3c84f491ec imports 2024-12-26 18:21:53 -08:00
Xi Yan
7c12cda244 llama guard 2024-12-26 18:18:01 -08:00
Xi Yan
f58e92f8d3 prompt guard 2024-12-26 18:15:55 -08:00
Xi Yan
61be406b49 scoring 2024-12-26 18:14:53 -08:00
Xi Yan
fcac7cfafa braintrust 2024-12-26 18:13:43 -08:00
Xi Yan
71d50ab368 telemetry & sample 2024-12-26 18:12:51 -08:00
Xi Yan
c4b9b3cb52 huggingface 2024-12-26 18:11:10 -08:00
Xi Yan
d40e527471 bedrock 2024-12-26 18:10:23 -08:00
Xi Yan
28428c320a databricks 2024-12-26 18:08:50 -08:00
Xi Yan
6f7f02fbad fireworks 2024-12-26 18:08:08 -08:00
Xi Yan
f97638a323 ollama import remove 2024-12-26 18:07:18 -08:00
Xi Yan
165777a181 impls imports remove 2024-12-26 18:05:19 -08:00
Xi Yan
b641902bfa impls imports remove 2024-12-26 18:01:45 -08:00
Xi Yan
c1ef055f39 test prompt adapter 2024-12-26 17:49:17 -08:00
Xi Yan
2fe4acd64d text inference 2024-12-26 17:45:25 -08:00
Xi Yan
16cfe1014e vision inference 2024-12-26 17:31:42 -08:00
Xi Yan
3b1f20ac00 memory tests fix 2024-12-26 17:27:01 -08:00
Xi Yan
3f86c19150 builds 2024-12-26 17:21:23 -08:00
Xi Yan
8a8550fe9b cli imports 2024-12-26 17:19:40 -08:00
Xi Yan
21a6bd57ea fix imports 2024-12-26 17:17:03 -08:00
Xi Yan
c6d3fc6fb6 datatypes 2024-12-26 17:00:56 -08:00
Xi Yan
6c6b5fb091 openai_compat 2024-12-26 16:59:06 -08:00
Xi Yan
9ab0730294 kvstore 2024-12-26 16:55:40 -08:00
Xi Yan
30fee82407 vector_store 2024-12-26 16:54:33 -08:00
Xi Yan
b7bc1c6297 telemetry 2024-12-26 16:48:54 -08:00
Xi Yan
bb0a3f5c8e remove more imports 2024-12-26 16:43:30 -08:00
Xi Yan
93ed8aa814 remove more imports 2024-12-26 16:39:31 -08:00
Xi Yan
0a0c01fbc2 test agents imports 2024-12-26 16:32:23 -08:00
Xi Yan
9bdb7236b2 Merge branch 'main' into remove_import_stars 2024-12-26 15:50:12 -08:00
Xi Yan
88c967a3e2 fix client-sdk memory/safety test 2024-12-26 15:49:15 -08:00
Xi Yan
b05d8fd956 fix client-sdk agents/inference test 2024-12-26 15:49:14 -08:00
Xi Yan
19c99e36a0 update playground doc video 2024-12-26 15:49:14 -08:00
Xi Yan
70db039ff4 fix client-sdk memory/safety test 2024-12-26 15:48:28 -08:00
Xi Yan
b6aca4c8bb fix client-sdk agents/inference test 2024-12-26 15:44:34 -08:00
Xi Yan
da26d22f90 remove imports 1/n 2024-12-26 15:19:06 -08:00
Xi Yan
4e1d0a2fc5 update playground doc video 2024-12-26 14:50:19 -08:00
Xi Yan
28ce511986 fix --endpoint docs 2024-12-26 14:32:07 -08:00
Ikko Eltociear Ashimine
7ba95a8e74
docs: update evals_reference/index.md (#675)
# What does this PR do?

minor fix




## Sources

Please link relevant resources if necessary.


## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-12-26 11:32:37 -08:00
Aidan Do
21fb92d7cf
Add 3.3 70B to Ollama inference provider (#681)
# What does this PR do?

Adds 3.3 70B support to Ollama inference provider

## Test Plan

<details>
<summary>Manual</summary>

```bash
# Download the model weights (about 42GB)
ollama pull llama3.3:70b

# Load the model and keep it in memory for 60 minutes
ollama run llama3.3:70b --keepalive 60m

# Install llama-stack from source, then build and start the Ollama distribution
export LLAMA_STACK_PORT=5000
pip install -e . \
  && llama stack build --template ollama --image-type conda \
  && llama stack run ./distributions/ollama/run.yaml \
  --port "$LLAMA_STACK_PORT" \
  --env INFERENCE_MODEL=Llama3.3-70B-Instruct \
  --env OLLAMA_URL=http://localhost:11434

# Query the running server (the port is exported again in case this is a new shell)
export LLAMA_STACK_PORT=5000
llama-stack-client --endpoint "http://localhost:$LLAMA_STACK_PORT" \
  inference chat-completion \
  --model-id Llama3.3-70B-Instruct \
  --message "hello, what model are you?"
```

<img width="1221" alt="image"
src="https://github.com/user-attachments/assets/dcffbdd9-94c8-4d47-9f95-4ef6c3756294"
/>

</details>

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-12-25 22:15:58 -08:00