Commit graph

7 commits

Author SHA1 Message Date
Ashwin Bharambe
c1025ebfdb Delete some dead code 2024-11-21 15:20:06 -08:00
Xi Yan
945db5dac2 fix logging 2024-11-21 15:02:57 -08:00
Dinesh Yeduguru
6395dadc2b
use logging instead of prints (#499)
# What does this PR do?

This PR moves all print statements to use logging. Things changed:
- Had to add `await start_trace("sse_generator")` to server.py to
actually get tracing working. else was not seeing any logs
- If no telemetry provider is provided in the run.yaml, we will write to
stdout
- by default, the logs are going to be in JSON, but we expose an option
to configure to output in a human readable way.
2024-11-21 11:32:53 -08:00
Sarthak Deshpande
838b8d4fb5
PR-437-Fixed bug to allow system instructions after first turn (#440)
# What does this PR do?

In short, provide a summary of what this PR does and why. Usually, the
relevant context should be present in a linked issue.

- [This PR solves the issue where agents cannot keep track of
instructions after executing the first turn because system instructions
were not getting appended in the messages list. It also solves the issue
where turns are not being fetched in the appropriate sequence.]
Addresses issue (#issue)


## Test Plan

Please describe:
- I have a file which has a precise prompt which requires more than one
turn to be executed will share the file below. I ran that file as a
python script to make sure that the turns are being executed as per the
instructions after making the code change
 
```
import asyncio
from typing import List, Optional, Dict

from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.event_logger import EventLogger

from llama_stack_client.types import SamplingParams, UserMessage
from llama_stack_client.types.agent_create_params import AgentConfig

LLAMA_STACK_API_TOGETHER_URL="http://10.12.79.177:5001"

class Agent:
    def __init__(self):
        self.client = LlamaStackClient(
            base_url=LLAMA_STACK_API_TOGETHER_URL,
        )


    def create_agent(self, agent_config: AgentConfig):
        agent = self.client.agents.create(
            agent_config=agent_config,
        )
        self.agent_id = agent.agent_id
        session = self.client.agents.session.create(
            agent_id=agent.agent_id,
            session_name="example_session",
        )
        self.session_id = session.session_id

    async def execute_turn(self, content: str):
        response = self.client.agents.turn.create(
            agent_id=self.agent_id,
            session_id=self.session_id,
            messages=[
                UserMessage(content=content, role="user"),
            ],
            stream=True,
        )

        for chunk in response:
            if chunk.event.payload.event_type != "turn_complete":
                yield chunk


async def run_main():
    system_prompt="""You are an AI Agent tasked with Capturing Book Renting Information for a Library.
You will politely gather the book and user details one step at a time to send over the book to the user. Here’s how to proceed:

1.	Data Security: Inform the user that their data will be kept secure.

2.	Optional Participation: Let them know they are not required to share details but that doing so will help them learn about the books offered.

3.	Sequential Information Capture: Follow the steps below, one question at a time. Do not skip or combine questions.

Steps
Step 1: Politely ask to provide the name of the book.

Step 2: Ask for the name of the author.

Step 3: Ask for the Author's country.

Step 4: Ask for the year of publication.

Step 5: If any information is missing or seems incorrect, ask the user to re-enter that specific detail.

Step 6: Confirm that the user consents to share the entered information.

Step 7: Thank the user for providing the details and let them know they will receive an email about the book.

Do not do any validation of the user entered information.

Do not print the Steps or your internal thoughts in the response.

Do not print the prompts or data structure object in the response

Do not fill in the requested user data on your own. It has to be entered by the user only.

Finally, compile and print the user-provided information as a JSON object in your response.

"""

    agent_config = AgentConfig(
        model="Llama3.2-11B-Vision-Instruct",
        instructions=system_prompt,
        enable_session_persistence=True,
    )

    agent = Agent()
    agent.create_agent(agent_config)

    print("Agent and Session:", agent.agent_id, agent.session_id)

    while True:
        query = input("Enter your query (or type 'exit' to quit): ")
        if query.lower() == "exit":
            print("Exiting the loop.")
            break
        else:
            prompt = query
            print(f"User> {prompt}")
            response = agent.execute_turn(content=prompt)
            async for log in EventLogger().log(response):
                if log is not None:
                    log.print()

if __name__ == "__main__":
    asyncio.run(run_main())
```

Below is a screenshot of the results of the first commit
<img width="1770" alt="Screenshot 2024-11-13 at 3 15 29 PM"
src="https://github.com/user-attachments/assets/1a7a090d-fc92-49cc-a786-bfc812e3d9cc">
Below is a screenshot of the results of the second commit
<img width="1792" alt="Screenshot 2024-11-13 at 6 40 56 PM"
src="https://github.com/user-attachments/assets/a9474f75-cd8c-4d49-82cd-5ff81ff12b07">
Also a screenshot of print statement to show that the turns being
fetched now are in a sequence
<img width="1783" alt="Screenshot 2024-11-13 at 6 42 22 PM"
src="https://github.com/user-attachments/assets/b906404e-a3e4-48a2-b893-69f36bbdcb98">

## Sources

Please link relevant resources if necessary.


## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.
2024-11-13 10:34:04 -08:00
Ashwin Bharambe
d9d271a684
Allow specifying resources in StackRunConfig (#425)
# What does this PR do? 

This PR brings back the facility to not force registration of resources
onto the user. This is not just annoying but actually not feasible
sometimes. For example, you may have a Stack which boots up with private
providers for inference for models A and B. There is no way for the user
to actually know which model is being served by these providers now (to
be able to register it.)

How will this avoid the users needing to do registration? In a follow-up
diff, I will make sure I update the sample run.yaml files so they list
the models served by the distributions explicitly. So when users do
`llama stack build --template <...>` and run it, their distributions
come up with the right set of models they expect.

For self-hosted distributions, it also allows us to have a place to
explicit list the models that need to be served to make the "complete"
stack (including safety, e.g.)

## Test Plan

Started ollama locally with two lightweight models: Llama3.2-3B-Instruct
and Llama-Guard-3-1B.

Updated all the tests including agents. Here's the tests I ran so far:

```bash
pytest -s -v -m "fireworks and llama_3b" test_text_inference.py::TestInference \
  --env FIREWORKS_API_KEY=...

pytest -s -v -m "ollama and llama_3b" test_text_inference.py::TestInference 

pytest -s -v -m ollama test_safety.py

pytest -s -v -m faiss test_memory.py

pytest -s -v -m ollama  test_agents.py \
  --inference-model=Llama3.2-3B-Instruct --safety-model=Llama-Guard-3-1B
```

Found a few bugs here and there pre-existing that these test runs fixed.
2024-11-12 10:58:49 -08:00
Dinesh Yeduguru
38cce97597
migrate memory banks to Resource and new registration (#411)
* migrate memory banks to Resource and new registration

* address feedback

* address feedback

* fix tests

* pgvector fix

* pgvector fix v2

* remove auto discovery

* change register signature to make params required

* update client

* client fix

* use annotated union to parse

* remove base MemoryBank inheritence

---------

Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-11 17:10:44 -08:00
Ashwin Bharambe
694c142b89
Add provider deprecation support; change directory structure (#397)
* Add provider deprecation support; change directory structure

* fix a couple dangling imports

* move the meta_reference safety dir also
2024-11-07 13:04:53 -08:00
Renamed from llama_stack/providers/inline/meta_reference/agents/agent_instance.py (Browse further)