More Updates to Read the Docs (#856)
This commit is contained in:
parent 8a686270e9
commit 74e933cbfd

8 changed files with 405 additions and 730 deletions
````diff
@@ -1,11 +1,20 @@
 # Using Llama Stack as a Library

-If you are planning to use an external service for Inference (even Ollama or TGI counts as external), it is often easier to use Llama Stack as a library. This avoids the overhead of setting up a server. For [example](https://github.com/meta-llama/llama-stack-client-python/blob/main/src/llama_stack_client/lib/direct/test.py):
+If you are planning to use an external service for Inference (even Ollama or TGI counts as external), it is often easier to use Llama Stack as a library. This avoids the overhead of setting up a server.
+
+```python
+# setup
+pip install llama-stack
+llama stack build --template together --image-type venv
+```

 ```python
-from llama_stack_client.lib.direct.direct import LlamaStackDirectClient
+from llama_stack.distribution.library_client import LlamaStackAsLibraryClient

-client = await LlamaStackDirectClient.from_template('ollama')
+client = LlamaStackAsLibraryClient(
+    "ollama",
+    # provider_data is optional, but if you need to pass in any provider specific data, you can do so here.
+    provider_data = {"tavily_search_api_key": os.environ['TAVILY_SEARCH_API_KEY']}
+)
 await client.initialize()
 ```
````
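Read together, the added lines in this hunk replace the old `LlamaStackDirectClient.from_template(...)` call with direct construction of `LlamaStackAsLibraryClient`. The following is a consolidated sketch of the post-change flow, not an official snippet: it assumes the template you name has already been built into a venv (the docs' setup snippet builds `together` while the client example uses `ollama`), that `TAVILY_SEARCH_API_KEY` only matters if you actually need Tavily search, and it uses the non-awaited `initialize()` form that the second hunk below uses.

```python
# Consolidated sketch of the post-change initialization flow (assumptions noted above).
import os

from llama_stack.distribution.library_client import LlamaStackAsLibraryClient

client = LlamaStackAsLibraryClient(
    "ollama",  # template name; adjust to whichever template you actually built
    # provider_data is optional; pass provider-specific settings here only if a provider needs them.
    provider_data={"tavily_search_api_key": os.environ.get("TAVILY_SEARCH_API_KEY", "")},
)
client.initialize()  # the diff shows both `await client.initialize()` and `client.initialize()`; the sync form is assumed here
```

If `initialize()` is asynchronous in your installed version, call it with `await` inside an async function instead.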
````diff
@@ -14,23 +23,12 @@ This will parse your config and set up any inline implementations and remote cli
 Then, you can access the APIs like `models` and `inference` on the client and call their methods directly:

 ```python
-response = await client.models.list()
-print(response)
-```
-
-```python
-response = await client.inference.chat_completion(
-    messages=[UserMessage(content="What is the capital of France?", role="user")],
-    model_id="Llama3.1-8B-Instruct",
-    stream=False,
-)
-print("\nChat completion response:")
-print(response)
+response = client.models.list()
 ```

 If you've created a [custom distribution](https://llama-stack.readthedocs.io/en/latest/distributions/building_distro.html), you can also use the run.yaml configuration file directly:

 ```python
-client = await LlamaStackDirectClient.from_config(config_path)
-await client.initialize()
+client = LlamaStackAsLibraryClient(config_path)
+client.initialize()
 ```
````
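For a custom distribution, the same client is constructed from a `run.yaml` path instead of a template name. Below is a minimal sketch combining the two post-change snippets from this hunk; `config_path` is a placeholder, and the calls (`initialize()`, `models.list()`) are the ones shown in the diff.

```python
# Sketch: drive the library client from a custom distribution's run.yaml (placeholder path).
from llama_stack.distribution.library_client import LlamaStackAsLibraryClient

config_path = "./run.yaml"  # placeholder: path to your custom distribution's run.yaml

client = LlamaStackAsLibraryClient(config_path)
client.initialize()

# APIs are available directly on the client, e.g. list the models the distribution serves.
response = client.models.list()
print(response)
```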