phoenix-oss/llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-06-28 19:04:19 +00:00

Commit graph

Author	SHA1	Message	Date

Ashwin Bharambe

5335393fe3

Avoid deleting temp directory between agent turns

This raises an interesting point -- we need to maintain session-level
tempdir state (!), since the model was told that a resource exists at a
given location, and that location has to stay valid across turns.

2024-12-08 22:25:37 -08:00
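
The session-level tempdir idea in commit 5335393fe3 can be illustrated with a small sketch. This is not the code from the commit: `SessionTempDirs`, its methods, and the session id `"s1"` are hypothetical names chosen for illustration. It assumes only that each agent session has a stable identifier and that cleanup should happen when the session ends rather than after each turn.

```python
import shutil
import tempfile
from pathlib import Path


class SessionTempDirs:
    """Hypothetical registry: one temp directory per session, kept across turns."""

    def __init__(self) -> None:
        self._dirs: dict[str, Path] = {}

    def get(self, session_id: str) -> Path:
        # Create the directory on the first turn and reuse it on later turns.
        if session_id not in self._dirs:
            self._dirs[session_id] = Path(
                tempfile.mkdtemp(prefix=f"session-{session_id}-")
            )
        return self._dirs[session_id]

    def close(self, session_id: str) -> None:
        # Delete only when the session ends, not between turns.
        path = self._dirs.pop(session_id, None)
        if path is not None:
            shutil.rmtree(path, ignore_errors=True)


registry = SessionTempDirs()

# Turn 1: a file is written and the model is told where it lives.
turn1_dir = registry.get("s1")
(turn1_dir / "data.csv").write_text("a,b\n1,2\n")

# Turn 2: the same directory comes back, so the earlier path is still valid.
turn2_dir = registry.get("s1")
```

Keying the directory on the session instead of the turn is what keeps a path mentioned in an earlier turn usable later. The cost is that something has to call `close` when the session is finished.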

Ashwin Bharambe

e951852848

Miscellaneous fixes around telemetry, library client and run yaml autogen

Also add a `venv` image-type for llama stack build

2024-12-08 20:40:22 -08:00

Ashwin Bharambe

14f973a64f

Make LlamaStackLibraryClient work correctly (#581)

This PR does a few things:

- it moves the "direct client" into the llama-stack repo instead of the llama-stack-client-python repo
- renames it to `LlamaStackLibraryClient`
- makes synchronous generators actually work
- makes streaming and non-streaming work properly

In many ways, this PR makes things finally "work"

## Test Plan

See the `library_client_test.py` I added. It isn't quite a test yet, but
it demonstrates that this mode now works. Here are the invocation and
the response:

```
INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct python llama_stack/distribution/tests/library_client_test.py ollama
```


![image](https://github.com/user-attachments/assets/17d4e116-4457-4755-a14e-d9a668801fe0)

2024-12-07 14:59:36 -08:00
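
Commit 14f973a64f mentions making synchronous generators work. One common way to expose an async stream through a synchronous generator is sketched below. It is not the `LlamaStackLibraryClient` implementation: `fake_stream`, `iterate_sync`, and `_step` are hypothetical names, and the sketch assumes the caller is not already inside a running event loop.

```python
import asyncio
from typing import AsyncIterator, Iterator

_DONE = object()  # sentinel marking the end of the async stream


async def fake_stream(n: int) -> AsyncIterator[str]:
    # Hypothetical stand-in for an async streaming response.
    for i in range(n):
        await asyncio.sleep(0)
        yield f"chunk-{i}"


async def _step(agen: AsyncIterator[str]) -> object:
    # Fetch one item, translating exhaustion into the sentinel.
    try:
        return await agen.__anext__()
    except StopAsyncIteration:
        return _DONE


def iterate_sync(agen: AsyncIterator[str]) -> Iterator[str]:
    # Drive the async generator one item at a time on a private event loop.
    loop = asyncio.new_event_loop()
    try:
        while True:
            item = loop.run_until_complete(_step(agen))
            if item is _DONE:
                break
            yield item
    finally:
        loop.close()


chunks = list(iterate_sync(fake_stream(3)))
```

Pulling one item per `run_until_complete` call lets the synchronous caller see chunks as they arrive instead of waiting for the whole stream.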

3 commits