18 commits

Author SHA1 Message Date
Dinesh Yeduguru
314806cde3
Add provider data passing for library client (#750)
# What does this PR do?

This PR adds provider data passing for the library client and
changes the providers' API keys to be unique.


## Test Plan

LLAMA_STACK_CONFIG="/Users/dineshyv/.llama/distributions/llamastack-fireworks/fireworks-run.yaml"
pytest -v tests/client-sdk/agents/test_agents.py

run.yaml:
https://gist.github.com/dineshyv/0c10b5c7d0a2fb7ba4f0ecc8dcf860d1
2025-01-13 15:12:10 -08:00
Dinesh Yeduguru
a5c57cd381
agents to use tools api (#673)
# What does this PR do?

PR #639 introduced the notion of a Tools API and the ability to invoke
tools through the API just like any other resource. This PR changes the
Agents to start using the Tools API to invoke tools. Major changes include:
1) Ability to specify tool groups with AgentConfig
2) The agent gets the corresponding tool definitions for the specified
tools and passes them along to the model
3) Attachments are now named Documents and their behavior is mostly
unchanged from the user's perspective
4) You can specify args that can be injected into a tool call through
the Agent config. This is especially useful in the case of the memory
tool, where you want the tool to operate on a specific memory bank.
5) You can also register tool groups with args, which lets the agent
inject these as well into the tool call.
6) All tests, including the client SDK tests, have been migrated to use
the new Tools API and fixtures
7) Telemetry just works with the Tools API because of our trace protocol
decorator
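
Items 4 and 5 can be pictured as a merge of argument dictionaries. The following is a minimal sketch, not the code from this PR: `merge_tool_args`, the parameter names, and the precedence order (model-produced arguments first, then tool-group args, then agent-config args) are all assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def merge_tool_args(
    model_args: Dict[str, Any],
    group_args: Optional[Dict[str, Any]] = None,
    agent_args: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: combine the arguments for a single tool call.

    Assumed precedence (lowest to highest): arguments produced by the model,
    args registered with the tool group, args given in the agent config.
    """
    merged = dict(model_args)          # copy so the caller's dict is untouched
    merged.update(group_args or {})    # args registered with the tool group
    merged.update(agent_args or {})    # args from the agent config win
    return merged


call_args = merge_tool_args(
    model_args={"query": "what is in my notes?"},
    group_args={"max_chunks": 5},              # hypothetical parameter name
    agent_args={"memory_bank_id": "my-bank"},  # hypothetical parameter name
)
```

With this ordering, an injected memory bank identifier would override whatever the model supplied, which matches the memory-tool use case described in item 4.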


## Test Plan
```
pytest -s -v -k fireworks llama_stack/providers/tests/agents/test_agents.py  \
   --safety-shield=meta-llama/Llama-Guard-3-8B \
   --inference-model=meta-llama/Llama-3.1-8B-Instruct

pytest -s -v -k together  llama_stack/providers/tests/tools/test_tools.py \
   --safety-shield=meta-llama/Llama-Guard-3-8B \
   --inference-model=meta-llama/Llama-3.1-8B-Instruct

LLAMA_STACK_CONFIG="/Users/dineshyv/.llama/distributions/llamastack-together/together-run.yaml" pytest -v tests/client-sdk/agents/test_agents.py
```
run.yaml:
https://gist.github.com/dineshyv/0365845ad325e1c2cab755788ccc5994

Notebook:
https://colab.research.google.com/drive/1ck7hXQxRl6UvT-ijNRZ-gMZxH1G3cN2d?usp=sharing
2025-01-08 19:01:00 -08:00
Dinesh Yeduguru
0bc5d05243
remove default logger handlers when using libcli with notebook (#718)
# What does this PR do?

Remove the default log handlers when running in a notebook to avoid polluting the logs
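
As a rough illustration of the idea, handlers can be detached from a logger with the standard `logging` module. This is a sketch under assumptions: the commit message does not say which logger is affected, so the logger name and the `remove_default_handlers` helper below are hypothetical.

```python
import logging


def remove_default_handlers(logger: logging.Logger) -> int:
    """Detach every handler currently attached to `logger`; return the count."""
    removed = 0
    for handler in list(logger.handlers):  # copy: the list is mutated below
        logger.removeHandler(handler)
        removed += 1
    return removed


demo_logger = logging.getLogger("demo_notebook_logger")  # hypothetical name
demo_logger.addHandler(logging.StreamHandler())
removed_count = remove_default_handlers(demo_logger)
```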
2025-01-06 13:06:22 -08:00
Ashwin Bharambe
e3f187fb83 Redact sensitive information from configs when printing, etc. 2025-01-02 13:54:02 -08:00
Dinesh Yeduguru
8b8d1c1ef4
fix trace starting in library client (#655)
# What does this PR do?

Because of the way the library client sets up async IO boundaries, tracing
was broken with streaming. This PR fixes tracing so that it starts at the
right point and correctly captures the lifetime of async generator functions.
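
The general pattern for keeping a span open across the whole lifetime of an async generator can be sketched as follows. This is not the code from this PR: the `traced` wrapper and the `events` list (a stand-in for a real trace backend) are hypothetical names used only to show where the span starts and ends.

```python
import asyncio
from typing import AsyncIterator, List

events: List[str] = []  # stand-in for a real trace backend


async def stream_chunks() -> AsyncIterator[str]:
    for chunk in ("a", "b", "c"):
        await asyncio.sleep(0)  # simulate an async boundary between chunks
        yield chunk


async def traced(name: str, gen: AsyncIterator[str]) -> AsyncIterator[str]:
    """Hypothetical wrapper: keep the span open while `gen` is being consumed."""
    events.append(f"start:{name}")
    try:
        async for item in gen:
            events.append(f"chunk:{item}")
            yield item
    finally:
        # Runs when the stream is exhausted or the consumer stops early.
        events.append(f"end:{name}")


async def main() -> List[str]:
    return [c async for c in traced("stream", stream_chunks())]


chunks = asyncio.run(main())
```

The point of the `try`/`finally` inside the generator is that the span ends only after the last chunk has been delivered, rather than when the function that created the generator returns.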

Test plan:
Script ran:
https://gist.github.com/yanxi0830/f6645129e55ab12de3cd6ec71564c69e

Before: No spans returned for a session


Now: We see spans
<img width="1678" alt="Screenshot 2024-12-18 at 9 50 46 PM"
src="https://github.com/user-attachments/assets/58a3b0dd-a41c-489a-b89a-075e698a2c03"
/>
2024-12-19 16:13:52 -08:00
Ashwin Bharambe
8de8eb03c8
Update the "InterleavedTextMedia" type (#635)
## What does this PR do?

This is a long-pending change and particularly important to get done
now.

Specifically:
- we cannot "localize" (aka download) any URLs from media attachments
anywhere near our modeling code. it must be done within llama-stack.
- `PIL.Image` is infesting all our APIs via `ImageMedia ->
InterleavedTextMedia` and that cannot be right at all. Anything in the
API surface must be "naturally serializable". We need a standard `{
type: "image", image_url: "<...>" }` which is more extensible
- `UserMessage`, `SystemMessage`, etc. are moved completely to
llama-stack from the llama-models repository.
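
To make "naturally serializable" concrete, here is a minimal sketch of content items that round-trip through JSON without any `PIL.Image` objects. The class names `TextContentItem` and `ImageContentItem` are hypothetical and the example URL is a placeholder; only the `{ type: "image", image_url: ... }` shape comes from the description above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ImageContentItem:  # hypothetical name
    image_url: str
    type: str = "image"


@dataclass
class TextContentItem:  # hypothetical name
    text: str
    type: str = "text"


content = [
    TextContentItem(text="Describe this picture."),
    ImageContentItem(image_url="https://example.com/cat.png"),  # placeholder URL
]

# Every field is a plain string, so the whole list serializes directly.
payload = json.dumps([asdict(item) for item in content])
```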

See https://github.com/meta-llama/llama-models/pull/244 for the
corresponding PR in llama-models.

## Test Plan

```bash
cd llama_stack/providers/tests

pytest -s -v -k "fireworks or ollama or together" inference/test_vision_inference.py
pytest -s -v -k "(fireworks or ollama or together) and llama_3b" inference/test_text_inference.py
pytest -s -v -k chroma memory/test_memory.py \
  --env EMBEDDING_DIMENSION=384 --env CHROMA_DB_PATH=/tmp/foobar

pytest -s -v -k fireworks agents/test_agents.py  \
   --safety-shield=meta-llama/Llama-Guard-3-8B \
   --inference-model=meta-llama/Llama-3.1-8B-Instruct
```

Updated the client sdk (see PR ...), installed the SDK in the same
environment and then ran the SDK tests:

```bash
cd tests/client-sdk
LLAMA_STACK_CONFIG=together pytest -s -v agents/test_agents.py
LLAMA_STACK_CONFIG=ollama pytest -s -v memory/test_memory.py

# this one needed a bit of hacking in the run.yaml to ensure I could register the vision model correctly
INFERENCE_MODEL=llama3.2-vision:latest LLAMA_STACK_CONFIG=ollama pytest -s -v inference/test_inference.py
```
2024-12-17 11:18:31 -08:00
Ashwin Bharambe
eb37fba9da Small fix to library client 2024-12-16 14:08:30 -08:00
Dinesh Yeduguru
e128f2547a
add tracing back to the lib cli (#595)
Adds back all the tracing logic removed from the library client. Also adds
back the logging to agent_instance.
2024-12-11 08:44:20 -08:00
Dinesh Yeduguru
2e3d3a62a5 Revert "add tracing to library client (#591)"
This reverts commit bc1fddf1df.
2024-12-10 08:50:20 -08:00
Dinesh Yeduguru
16d103842a Revert "await end_trace in libcli"
This reverts commit 7615da78b8.
2024-12-10 08:47:32 -08:00
Dinesh Yeduguru
f969b561ea Revert "Disable telemetry in library client for now"
This reverts commit 176ebddf47.
2024-12-10 08:47:18 -08:00
Ashwin Bharambe
176ebddf47 Disable telemetry in library client for now 2024-12-09 22:17:25 -08:00
Ashwin Bharambe
a4d8a6009a
Fixes for library client (#587)
The library client used _server_ side types, which was no bueno. The fix
here is not the completely correct fix, but it is good enough for the
demo notebook.
2024-12-09 17:14:37 -08:00
Dinesh Yeduguru
7615da78b8 await end_trace in libcli 2024-12-09 15:54:42 -08:00
Dinesh Yeduguru
bc1fddf1df
add tracing to library client (#591) 2024-12-09 15:46:26 -08:00
Ashwin Bharambe
a2170353af better detection for jupyter 2024-12-09 09:38:11 -08:00
Ashwin Bharambe
e951852848 Miscellaneous fixes around telemetry, library client and run yaml autogen
Also add a `venv` image-type for llama stack build
2024-12-08 20:40:22 -08:00
Ashwin Bharambe
14f973a64f
Make LlamaStackLibraryClient work correctly (#581)
This PR does a few things:

- it moves "direct client" to llama-stack repo instead of being in the
llama-stack-client-python repo
- renames it to `LlamaStackLibraryClient`
- actually makes synchronous generators work 
- makes streaming and non-streaming work properly
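
One common way to expose an async generator as a synchronous generator is to drive it step by step from a dedicated event loop. The sketch below illustrates that technique only; it is not taken from this PR, `iterate_sync` is a hypothetical helper, and it assumes no event loop is already running in the calling thread.

```python
import asyncio
from typing import AsyncIterator, Iterator, TypeVar

T = TypeVar("T")


async def async_stream() -> AsyncIterator[int]:
    for i in range(3):
        await asyncio.sleep(0)  # simulate an async boundary between items
        yield i


def iterate_sync(agen: AsyncIterator[T]) -> Iterator[T]:
    """Hypothetical bridge: drive an async generator from synchronous code."""
    loop = asyncio.new_event_loop()
    try:
        while True:
            try:
                # Advance the async generator by exactly one item.
                yield loop.run_until_complete(agen.__anext__())
            except StopAsyncIteration:
                break
    finally:
        loop.close()


results = list(iterate_sync(async_stream()))
```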

In many ways, this PR makes things finally "work"

## Test Plan

See a `library_client_test.py` I added. This isn't really quite a test
yet but it demonstrates that this mode now works. Here's the invocation
and the response:

```
INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct python llama_stack/distribution/tests/library_client_test.py ollama
```


![image](https://github.com/user-attachments/assets/17d4e116-4457-4755-a14e-d9a668801fe0)
2024-12-07 14:59:36 -08:00