mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-07-29 15:23:51 +00:00
format
This commit is contained in:
parent
980f2ae039
commit
acefea7821
4 changed files with 27 additions and 30 deletions
|
@ -1,4 +1,6 @@
|
|||
# Building a Llama Stack Distribution
|
||||
# Developer Guide: Assemble a Llama Stack Distribution
|
||||
|
||||
> NOTE: This doc is out-of-date.
|
||||
|
||||
This guide walks you through building a Llama Stack distribution from scratch with your choice of API providers. Please see the [Getting Started Guide](./getting_started.md) if you just want the basic steps to start a Llama Stack distribution.
|
||||
|
||||
|
@ -237,27 +239,3 @@ INFO: Uvicorn running on http://[::]:5000 (Press CTRL+C to quit)
|
|||
> You might need to use the flag `--disable-ipv6` to disable IPv6 support.
|
||||
|
||||
This server is running a Llama model locally.
|
||||
|
||||
## Step 4. Test with Client
|
||||
Once the server is set up, we can test it with a client and inspect the example output.
|
||||
|
||||
```
curl http://localhost:5000/inference/chat_completion \
-H "Content-Type: application/json" \
-d '{
    "model": "Llama3.1-8B-Instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write me a 2 sentence poem about the moon"}
    ],
    "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
}'

Output:
{'completion_message': {'role': 'assistant',
  'content': 'The moon glows softly in the midnight sky, \nA beacon of wonder, as it catches the eye.',
  'stop_reason': 'out_of_tokens',
  'tool_calls': []},
 'logprobs': null}
```
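
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; the endpoint path, port, and payload fields are copied from the curl example, while the helper names `build_chat_request` and `chat_completion` are hypothetical and not part of Llama Stack. The response is returned as raw text because the exact response encoding is an assumption this sketch does not make.

```python
import json
import urllib.request

# Base URL taken from the curl example; adjust if your server listens elsewhere.
BASE_URL = "http://localhost:5000"


def build_chat_request(user_message, model="Llama3.1-8B-Instruct"):
    """Return the request body for /inference/chat_completion (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512},
    }


def chat_completion(user_message, base_url=BASE_URL, timeout=60):
    """POST the request body and return the raw response text (hypothetical helper)."""
    body = json.dumps(build_chat_request(user_message)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/inference/chat_completion",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server from the previous step running, `print(chat_completion("Write me a 2 sentence poem about the moon"))` should print a response similar to the output above.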
|
||||
|
|
19
docs/source/distribution_dev/index.md
Normal file
|
@ -0,0 +1,19 @@
|
|||
# Llama Stack Developer Guide
|
||||
|
||||
## Key Concepts
|
||||
|
||||
### API Provider
|
||||
A Provider is what makes the API real -- it supplies the actual implementation backing the API.
|
||||
|
||||
For example, an Inference implementation could be backed by open source libraries such as `[ torch | vLLM | TensorRT ]`.
|
||||
|
||||
A provider can also be just a pointer to a remote REST service -- for example, cloud providers or dedicated inference providers could serve these APIs.
|
||||
|
||||
### Distribution
|
||||
A Distribution is where APIs and Providers are assembled to give the end application developer a consistent whole. You can mix and match providers -- some backed by local code, some remote. As a hobbyist, you can serve a small model locally but choose a cloud provider for a large model. Either way, the higher-level APIs your app works with don't need to change. You can even imagine moving across the server / mobile-device boundary while still using the same uniform set of APIs for developing Generative AI applications.
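
To make the mix-and-match idea concrete, here is a minimal Python sketch. It is purely illustrative: the `Provider` and `Distribution` classes, their fields, and the example URL are hypothetical and do not reflect the actual Llama Stack data model. It only shows that application code can ask for an API by name without caring whether the backing provider is local or remote.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Provider:
    """Hypothetical description of one implementation of an API."""
    name: str
    remote_url: Optional[str] = None  # None means the provider runs locally

    @property
    def is_remote(self) -> bool:
        return self.remote_url is not None


@dataclass
class Distribution:
    """Hypothetical container mapping each API name to exactly one provider."""
    name: str
    providers: Dict[str, Provider] = field(default_factory=dict)

    def provider_for(self, api: str) -> Provider:
        if api not in self.providers:
            raise KeyError(f"distribution {self.name!r} has no provider for API {api!r}")
        return self.providers[api]


def describe(distro: Distribution, api: str) -> str:
    """Application-side code: identical for local and remote providers."""
    provider = distro.provider_for(api)
    location = "remote" if provider.is_remote else "local"
    return f"{api} -> {provider.name} ({location})"


# A hobbyist setup serving a small model locally (names are made up).
local_distro = Distribution("hobbyist", {"inference": Provider("local-torch")})

# The same API backed by a cloud service for a larger model (URL is a placeholder).
cloud_distro = Distribution(
    "cloud",
    {"inference": Provider("hosted-inference", remote_url="https://example.invalid/v1")},
)

print(describe(local_distro, "inference"))
print(describe(cloud_distro, "inference"))
```

Swapping `local_distro` for `cloud_distro` changes only the configuration; the `describe` call that stands in for application code stays the same.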
|
||||
|
||||
```{toctree}
|
||||
:maxdepth: 1
|
||||
|
||||
building_distro
|
||||
```
|
|
@ -13,13 +13,13 @@ Based on your developer needs, below are references to guides to help you get st
|
|||
* Developer Need: I want to start a local Llama Stack server with my GPU using meta-reference implementations.
|
||||
* Effort: 5min
|
||||
* Guide:
|
||||
- Please see our [Getting Started Guide](./getting_started.md) on starting up a meta-reference Llama Stack server.
|
||||
- Please see our [meta-reference-gpu](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/meta-reference-gpu.html) guide on starting up a meta-reference Llama Stack server.
|
||||
|
||||
### Llama Stack Server with Remote Providers
|
||||
* Developer need: I want a Llama Stack distribution with a remote provider.
|
||||
* Effort: 10min
|
||||
* Guide
|
||||
- Please see our [Distributions Guide](../../../distributions/) on starting up distributions with remote providers.
|
||||
- Please see our [Distributions Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/index.html) on starting up distributions with remote providers.
|
||||
|
||||
|
||||
### On-Device (iOS) Llama Stack
|
||||
|
@ -38,4 +38,4 @@ Based on your developer needs, below are references to guides to help you get st
|
|||
* Developer Need: I want to add a new API provider to Llama Stack.
|
||||
* Effort: 3hr
|
||||
* Guide
|
||||
- Please see our [Adding a New API Provider](./new_api_provider.md) guide for adding a new API provider.
|
||||
- Please see our [Adding a New API Provider](https://llama-stack.readthedocs.io/en/latest/api_providers/new_api_provider.html) guide for adding a new API provider.
|
||||
|
|
|
@ -52,8 +52,8 @@ If so, we suggest:
|
|||
- [Docker](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/together.html#docker-start-the-distribution-single-node-cpu)
|
||||
- [Conda](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/together.html#conda-llama-stack-run-single-node-cpu)
|
||||
- `distribution-fireworks`:
|
||||
- [Docker]()
|
||||
- [Conda]()
|
||||
- [Docker](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/fireworks.html)
|
||||
- [Conda](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/fireworks.html#conda-llama-stack-run-single-node-cpu)
|
||||
|
||||
|
||||
## Build Your Llama Stack App
|
||||
|
|