llama-stack-mirror/tests/client-sdk/inference/test_inference.py
Xi Yan 78e2bfbe7a
[tests] add client-sdk pytests & delete client.py (#638)
# What does this PR do?

**Why**
- Clean up examples that we will not maintain; reduce the surface area
  to a minimal set of showcases

**What**
- Delete `client.py` in `/apis/*`
- Move all the scripts to unit tests
  - Future SDK syncs will only require running the pytest suite

**Side notes**
- `bwrap` is not available on macOS, so `code_interpreter` will not work there

## Test Plan

```
LLAMA_STACK_BASE_URL=http://localhost:5000 pytest -v ./tests/client-sdk
```
<img width="725" alt="image"
src="https://github.com/user-attachments/assets/36bfe537-628d-43c3-8479-dcfcfe2e4035"
/>
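
The tests added here receive a `llama_stack_client` pytest fixture. As a rough sketch, such a fixture could live in `tests/client-sdk/conftest.py` and build a client from the `LLAMA_STACK_BASE_URL` environment variable used above; the actual fixture in the repo may differ:

```python
# Hypothetical conftest.py sketch -- the actual fixture in the repo may differ.
import os

import pytest
from llama_stack_client import LlamaStackClient


@pytest.fixture(scope="session")
def llama_stack_client():
    # Point the SDK at a running Llama Stack server, e.g. http://localhost:5000
    return LlamaStackClient(base_url=os.environ["LLAMA_STACK_BASE_URL"])
```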


## Sources

Please link relevant resources if necessary.


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-12-16 12:04:56 -08:00


# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.

import pytest
from llama_stack_client.lib.inference.event_logger import EventLogger


def test_text_chat_completion(llama_stack_client):
    # non-streaming
    available_models = [
        model.identifier
        for model in llama_stack_client.models.list()
        if model.identifier.startswith("meta-llama")
    ]
    assert len(available_models) > 0
    model_id = available_models[0]
    response = llama_stack_client.inference.chat_completion(
        model_id=model_id,
        messages=[
            {
                "role": "user",
                "content": "Hello, world!",
            }
        ],
        stream=False,
    )
    assert len(response.completion_message.content) > 0

    # streaming
    response = llama_stack_client.inference.chat_completion(
        model_id=model_id,
        messages=[{"role": "user", "content": "Hello, world!"}],
        stream=True,
    )
    logs = [str(log.content) for log in EventLogger().log(response) if log is not None]
    assert len(logs) > 0
    assert "Assistant> " in logs[0]


def test_image_chat_completion(llama_stack_client):
    available_models = [
        model.identifier
        for model in llama_stack_client.models.list()
        if "vision" in model.identifier.lower()
    ]
    if len(available_models) == 0:
        pytest.skip("No vision models available")
    model_id = available_models[0]
    # non-streaming
    message = {
        "role": "user",
        "content": [
            {
                "image": {
                    "uri": "https://www.healthypawspetinsurance.com/Images/V3/DogAndPuppyInsurance/Dog_CTA_Desktop_HeroImage.jpg"
                }
            },
            "Describe what is in this image.",
        ],
    }
    response = llama_stack_client.inference.chat_completion(
        model_id=model_id,
        messages=[message],
        stream=False,
    )
    assert len(response.completion_message.content) > 0
    assert (
        "dog" in response.completion_message.content.lower()
        or "puppy" in response.completion_message.content.lower()
    )
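
For reference, a single test from this file can also be selected with pytest's `-k` flag, using the same `LLAMA_STACK_BASE_URL` setup as in the Test Plan above:

```
LLAMA_STACK_BASE_URL=http://localhost:5000 pytest -v ./tests/client-sdk/inference/test_inference.py -k test_text_chat_completion
```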