mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-03 18:00:36 +00:00
chore(docs): Remove Llama 4 support details from README (#4178)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
Python Package Build Test / build (3.12) (push) Failing after 4s
Python Package Build Test / build (3.13) (push) Failing after 4s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 11s
API Conformance Tests / check-schema-compatibility (push) Successful in 14s
Test External API and Providers / test-external (venv) (push) Failing after 45s
UI Tests / ui-tests (22) (push) Successful in 48s
Vector IO Integration Tests / test-matrix (push) Failing after 1m6s
Unit Tests / unit-tests (3.13) (push) Failing after 1m28s
Unit Tests / unit-tests (3.12) (push) Failing after 1m29s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 2m58s
Pre-commit / pre-commit (push) Successful in 3m42s
This commit is contained in:
parent
29f1fa6abd
commit
7093978754
1 changed file with 0 additions and 77 deletions
README.md (77 deletions)

@@ -10,83 +10,6 @@
[**Quick Start**](https://llamastack.github.io/docs/getting_started/quickstart) | [**Documentation**](https://llamastack.github.io/docs) | [**Colab Notebook**](./docs/getting_started.ipynb) | [**Discord**](https://discord.gg/llama-stack)

### ✨🎉 Llama 4 Support 🎉✨

We released [Version 0.2.0](https://github.com/meta-llama/llama-stack/releases/tag/v0.2.0) with support for the Llama 4 herd of models released by Meta.

<details>

<summary>👋 Click here to see how to run Llama 4 models on Llama Stack</summary>

*Note: you need an 8xH100 GPU host to run these models.*

```bash
pip install -U llama_stack

MODEL="Llama-4-Scout-17B-16E-Instruct"
# get the Meta download URL from llama.com
huggingface-cli download meta-llama/$MODEL --local-dir ~/.llama/$MODEL

# install dependencies for the distribution
llama stack list-deps meta-reference-gpu | xargs -L1 uv pip install

# start a Llama Stack server
INFERENCE_MODEL=meta-llama/$MODEL llama stack run meta-reference-gpu

# install the client to interact with the server
pip install llama-stack-client
```
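
Once the server is running, a quick sanity check is to list the models it has registered; the `INFERENCE_MODEL` you exported above should appear. A minimal sketch, assuming the server is on the default port 8321 and that `llama-stack-client` exposes `models.list()` as in recent releases:

```python
from llama_stack_client import LlamaStackClient

# assumption: the server started above listens on the default port 8321
client = LlamaStackClient(base_url="http://localhost:8321")

# list registered models; meta-llama/Llama-4-Scout-17B-16E-Instruct
# should show up here if the server came up cleanly
for model in client.models.list():
    print(model.identifier)
```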
### CLI
```bash
# Run a chat completion
MODEL="Llama-4-Scout-17B-16E-Instruct"

llama-stack-client --endpoint http://localhost:8321 \
  inference chat-completion \
  --model-id meta-llama/$MODEL \
  --message "write a haiku for meta's llama 4 models"

OpenAIChatCompletion(
    ...
    choices=[
        OpenAIChatCompletionChoice(
            finish_reason='stop',
            index=0,
            message=OpenAIChatCompletionChoiceMessageOpenAIAssistantMessageParam(
                role='assistant',
                content='...**Silent minds awaken,** \n**Whispers of billions of words,** \n**Reasoning breaks the night.** \n\n— \n*This haiku blends the essence of LLaMA 4\'s capabilities with nature-inspired metaphor, evoking its vast training data and transformative potential.*',
                ...
            ),
            ...
        )
    ],
    ...
)
```
### Python SDK
```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"
prompt = "Write a haiku about coding"

print(f"User> {prompt}")
response = client.chat.completions.create(
    model=model_id,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ],
)
print(f"Assistant> {response.choices[0].message.content}")
```
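
For longer generations you can print tokens as they arrive instead of waiting for the full response. A hedged sketch of the streaming variant, reusing `client`, `model_id`, and `prompt` from the example above and assuming the client follows the OpenAI-compatible `stream=True` chunk shape:

```python
# assumption: stream=True yields OpenAI-style chunks with a choices[0].delta field
stream = client.chat.completions.create(
    model=model_id,
    messages=[{"role": "user", "content": prompt}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
print()
```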
As more providers start supporting Llama 4, you can use them in Llama Stack as well. We are adding to the list. Stay tuned!

</details>

### 🚀 One-Line Installer 🚀

To try Llama Stack locally, run: