Commit graph

13 commits

Author SHA1 Message Date
Ashwin Bharambe
8de8eb03c8
Update the "InterleavedTextMedia" type (#635)
## What does this PR do?

This is a long-pending change and particularly important to get done
now.

Specifically:
- We cannot "localize" (i.e., download) URLs from media attachments
anywhere near our modeling code; that must be done within llama-stack.
- `PIL.Image` is infesting all our APIs via `ImageMedia ->
InterleavedTextMedia`, and that cannot be right at all. Anything in the
API surface must be "naturally serializable". We need a standard `{
type: "image", image_url: "<...>" }` shape, which is also more
extensible (see the sketch below).
- `UserMessage`, `SystemMessage`, etc. are moved entirely into
llama-stack from the llama-models repository.

See https://github.com/meta-llama/llama-models/pull/244 for the
corresponding PR in llama-models.
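
For orientation, a minimal sketch of what such a naturally serializable content type could look like; the class and field names below are illustrative assumptions, not the exact types introduced by this PR.

```python
from typing import List, Literal, Union

from pydantic import BaseModel


class ImageContentItem(BaseModel):
    # Illustrative stand-in for a serializable image item:
    # { "type": "image", "image_url": "<...>" }
    type: Literal["image"] = "image"
    image_url: str


class TextContentItem(BaseModel):
    # Illustrative stand-in for a plain text item.
    type: Literal["text"] = "text"
    text: str


# Interleaved content: a plain string, a single item, or a list of items.
# Everything here serializes cleanly to JSON -- no PIL.Image in the API surface.
InterleavedContent = Union[
    str,
    ImageContentItem,
    TextContentItem,
    List[Union[ImageContentItem, TextContentItem]],
]
```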

## Test Plan

```bash
cd llama_stack/providers/tests

pytest -s -v -k "fireworks or ollama or together" inference/test_vision_inference.py
pytest -s -v -k "(fireworks or ollama or together) and llama_3b" inference/test_text_inference.py
pytest -s -v -k chroma memory/test_memory.py \
  --env EMBEDDING_DIMENSION=384 --env CHROMA_DB_PATH=/tmp/foobar

pytest -s -v -k fireworks agents/test_agents.py  \
   --safety-shield=meta-llama/Llama-Guard-3-8B \
   --inference-model=meta-llama/Llama-3.1-8B-Instruct
```

Updated the client SDK (see PR ...), installed the SDK in the same
environment, and then ran the SDK tests:

```bash
cd tests/client-sdk
LLAMA_STACK_CONFIG=together pytest -s -v agents/test_agents.py
LLAMA_STACK_CONFIG=ollama pytest -s -v memory/test_memory.py

# this one needed a bit of hacking in the run.yaml to ensure I could register the vision model correctly
INFERENCE_MODEL=llama3.2-vision:latest LLAMA_STACK_CONFIG=ollama pytest -s -v inference/test_inference.py
```
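
For illustration, a rough sketch of what a vision request using the new serializable image shape might look like against a running stack; the endpoint path, port, model name, and field names here are assumptions for illustration, not the actual llama-stack API.

```python
import requests  # assumes a llama-stack server is running locally

# Hypothetical route and payload shape, shown only to illustrate the
# serializable { "type": "image", "image_url": ... } content format.
payload = {
    "model": "meta-llama/Llama-3.2-11B-Vision-Instruct",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "image", "image_url": "https://example.com/cat.png"},
                {"type": "text", "text": "What is in this image?"},
            ],
        }
    ],
}

response = requests.post(
    "http://localhost:5000/inference/chat-completion",  # hypothetical endpoint
    json=payload,
)
print(response.json())
```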
2024-12-17 11:18:31 -08:00
Ashwin Bharambe
eb37fba9da Small fix to library client 2024-12-16 14:08:30 -08:00
Dinesh Yeduguru
e128f2547a
add tracing back to the lib cli (#595)
Adds back all the tracing logic that was removed from the library
client. Also adds back the logging to agent_instance.
2024-12-11 08:44:20 -08:00
Dinesh Yeduguru
2e3d3a62a5 Revert "add tracing to library client (#591)"
This reverts commit bc1fddf1df.
2024-12-10 08:50:20 -08:00
Dinesh Yeduguru
16d103842a Revert "await end_trace in libcli"
This reverts commit 7615da78b8.
2024-12-10 08:47:32 -08:00
Dinesh Yeduguru
f969b561ea Revert "Disable telemetry in library client for now"
This reverts commit 176ebddf47.
2024-12-10 08:47:18 -08:00
Ashwin Bharambe
176ebddf47 Disable telemetry in library client for now 2024-12-09 22:17:25 -08:00
Ashwin Bharambe
a4d8a6009a
Fixes for library client (#587)
The library client used _server_-side types, which was no bueno. The
fix here is not the fully correct fix, but it is good enough for now
and for the demo notebook.
2024-12-09 17:14:37 -08:00
Dinesh Yeduguru
7615da78b8 await end_trace in libcli 2024-12-09 15:54:42 -08:00
Dinesh Yeduguru
bc1fddf1df
add tracing to library client (#591) 2024-12-09 15:46:26 -08:00
Ashwin Bharambe
a2170353af better detection for jupyter 2024-12-09 09:38:11 -08:00
Ashwin Bharambe
e951852848 Miscellaneous fixes around telemetry, the library client, and run.yaml autogen
Also add a `venv` image-type for `llama stack build`
2024-12-08 20:40:22 -08:00
Ashwin Bharambe
14f973a64f
Make LlamaStackLibraryClient work correctly (#581)
This PR does a few things:

- it moves the "direct client" into the llama-stack repo instead of the
llama-stack-client-python repo
- renames it to `LlamaStackLibraryClient`
- actually makes synchronous generators work 
- makes streaming and non-streaming work properly

In many ways, this PR makes things finally "work".
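
For context, a minimal sketch of how the library client might be driven in-process; the import path, constructor arguments, and method names below are assumptions based on the description above, not the exact interface.

```python
# Hypothetical usage of the in-process library client described above.
# Import path and signatures are assumptions for illustration; consult
# the repo for the actual interface.
from llama_stack.distribution.library_client import LlamaStackLibraryClient

client = LlamaStackLibraryClient("ollama")  # name of a built distribution
client.initialize()

# Non-streaming call; with streaming enabled, the client would instead
# return a synchronous generator of response chunks.
response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response)
```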

## Test Plan

See the `library_client_test.py` I added. It isn't really quite a test
yet, but it demonstrates that this mode now works. Here's the invocation
and the response:

```
INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct python llama_stack/distribution/tests/library_client_test.py ollama
```


![image](https://github.com/user-attachments/assets/17d4e116-4457-4755-a14e-d9a668801fe0)
2024-12-07 14:59:36 -08:00