# What does this PR do?
We need to change
```yaml
/v1/inference/chat-completion:
  post:
    responses:
      '200':
        description: >-
          If stream=False, returns a ChatCompletionResponse with the full completion.
          If stream=True, returns an SSE event stream of ChatCompletionResponseStreamChunk
        content:
          text/event-stream:
            schema:
              oneOf:
                - $ref: '#/components/schemas/ChatCompletionResponse'
                - $ref: '#/components/schemas/ChatCompletionResponseStreamChunk'
```
into
```yaml
/v1/inference/chat-completion:
  post:
    responses:
      '200':
        description: >-
          If stream=False, returns a ChatCompletionResponse with the full completion.
          If stream=True, returns an SSE event stream of ChatCompletionResponseStreamChunk
        content:
          text/event-stream:
            schema:
              $ref: '#/components/schemas/ChatCompletionResponseStreamChunk'
          application/json:
            schema:
              $ref: '#/components/schemas/ChatCompletionResponse'
```
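With one schema per media type, a client can choose its parser from the response's `Content-Type` header. The following is a minimal sketch of that dispatch, not the SDK's actual implementation: the function names are hypothetical, and the SSE handling only covers single-line `data:` fields.

```python
import json
from typing import Any, Dict, Iterator, List, Union


def iter_sse_data(body: str) -> Iterator[Dict[str, Any]]:
    """Yield the JSON payload of each `data:` line in an SSE body."""
    for line in body.splitlines():
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())


def parse_chat_completion_body(
    content_type: str, body: str
) -> Union[Dict[str, Any], List[Dict[str, Any]]]:
    """Parse a response body according to its media type (hypothetical helper)."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/event-stream":
        # Streaming: a sequence of ChatCompletionResponseStreamChunk payloads.
        return list(iter_sse_data(body))
    if media_type == "application/json":
        # Non-streaming: a single ChatCompletionResponse payload.
        return json.loads(body)
    raise ValueError(f"unexpected content type: {content_type!r}")
```

The point of the split is that each media type maps to exactly one schema, so generated clients no longer need to guess between the two `oneOf` branches.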
## Test Plan
**Python**
- tested in SDK sync:
https://github.com/meta-llama/llama-stack-client-python/pull/108
**Node**
- tested with
https://gist.github.com/yanxi0830/b782f4b91e21dcccdfef8898ce55157e (SDK
update to follow)
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
A test exists for image.url content, but not for image.data content. This
PR adds the latter.
## Test Plan
`LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -v
tests/client-sdk/inference/test_inference.py`
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.
# What does this PR do?
Fixes a bug where agents were not working when both rag and
code-interpreter were added as tools.
## Test Plan
Added a new client_sdk test which tests for this scenario
```
LLAMA_STACK_CONFIG=together pytest -s -v tests/client-sdk -k 'test_rag_and_code_agent'
```
---------
Co-authored-by: Hardik Shah <hjshah@fb.com>
# What does this PR do?
- Discussion in
https://github.com/meta-llama/llama-stack/pull/906#discussion_r1936260819
- image.data should accept a base64-encoded string as input instead of raw
bytes; prompt_adapter is changed to account for that.
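As a rough illustration of the base64 convention described above, the sketch below encodes raw image bytes to a string on the client side and decodes them back on the server side. The helper names and the shape of `image_item` are assumptions for illustration only, not the actual llama-stack types.

```python
import base64


def encode_image_data(raw: bytes) -> str:
    """Encode raw image bytes as an ASCII base64 string (client side)."""
    return base64.b64encode(raw).decode("ascii")


def decode_image_data(data: str) -> bytes:
    """Decode a base64 string back to raw bytes (server side)."""
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(data, validate=True)


# The 8-byte PNG signature stands in for a real image file here.
png_signature = b"\x89PNG\r\n\x1a\n"

# Hypothetical content item; the real field layout may differ.
image_item = {"type": "image", "image": {"data": encode_image_data(png_signature)}}
```

Sending a string rather than bytes keeps the payload JSON-serializable without any custom encoder.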
## Test Plan
```
pytest -v tests/client-sdk/inference/test_inference.py
```
with test in https://github.com/meta-llama/llama-stack/pull/906
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
- Fix typo
- Support Llama 3.3 70B
## Test Plan
Run the following scripts and obtain the test results
Script
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_streaming --env SAMBANOVA_API_KEY={API_KEY}
```
Result
```
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_streaming[-sambanova] PASSED
=========================================== 1 passed, 1 warning in 1.26s ============================================
```
Script
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_non_streaming --env SAMBANOVA_API_KEY={API_KEY}
```
Result
```
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_non_streaming[-sambanova] PASSED
=========================================== 1 passed, 1 warning in 0.52s ============================================
```
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [x] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
1) As per @mattf's suggestion, we want to mark pytest tests as xfail for
providers that do not support the functionality. In this diff, we xfail
the logProbs inference tests for providers that do not support log
probs (log probs are only supported by together, fireworks and vllm).
2) Added logProbs support for together according to their developer
[doc](https://docs.together.ai/docs/logprobs).
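The xfail gating in item 1 could look roughly like the sketch below. It is not the code from this diff: the helper names are hypothetical, and the `remote::<name>` provider type strings are assumed from the `remote::ollama` label that appears in the XFAIL output of the test plan.

```python
from typing import Optional

# Assumption: provider types follow the "remote::<name>" pattern.
PROVIDERS_SUPPORTING_LOG_PROBS = {
    "remote::together",
    "remote::fireworks",
    "remote::vllm",
}


def log_probs_xfail_reason(provider_type: str) -> Optional[str]:
    """Return an xfail reason, or None if the provider supports log probs."""
    if provider_type in PROVIDERS_SUPPORTING_LOG_PROBS:
        return None
    return f"{provider_type} doesn't support log probs yet"


def xfail_if_log_probs_unsupported(provider_type: str) -> None:
    """Call at the top of a log-probs test to xfail on unsupported providers."""
    reason = log_probs_xfail_reason(provider_type)
    if reason is not None:
        # Imported lazily so the helper above has no pytest dependency.
        import pytest

        pytest.xfail(reason)
```

Using xfail rather than skip keeps the unsupported cases visible in the report, which is what the Ollama output in the test plan shows.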
## Test Plan
1) Together & Fireworks
```
export LLAMA_STACK_CONFIG=/Users/sxyi/llama-stack/llama_stack/templates/together/run.yaml
/opt/miniconda3/envs/stack/bin/pytest -s -v /Users/sxyi/llama-stack/tests/client-sdk/inference/test_inference.py
```
```
tests/client-sdk/inference/test_inference.py::test_text_completion_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED
tests/client-sdk/inference/test_inference.py::test_completion_log_probs_non_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED
tests/client-sdk/inference/test_inference.py::test_completion_log_probs_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED
tests/client-sdk/inference/test_inference.py::test_text_completion_structured_output[meta-llama/Llama-3.1-8B-Instruct] PASSED
tests/client-sdk/inference/test_inference.py::test_text_chat_completion_non_streaming[meta-llama/Llama-3.1-8B-Instruct-What are the names of planets in our solar system?-Earth] PASSED
tests/client-sdk/inference/test_inference.py::test_text_chat_completion_non_streaming[meta-llama/Llama-3.1-8B-Instruct-What are the names of the planets that have rings around them?-Saturn] PASSED
tests/client-sdk/inference/test_inference.py::test_text_chat_completion_streaming[meta-llama/Llama-3.1-8B-Instruct-What's the name of the Sun in latin?-Sol] PASSED
tests/client-sdk/inference/test_inference.py::test_text_chat_completion_streaming[meta-llama/Llama-3.1-8B-Instruct-What is the name of the US captial?-Washington] PASSED
tests/client-sdk/inference/test_inference.py::test_text_chat_completion_with_tool_calling_and_non_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED
tests/client-sdk/inference/test_inference.py::test_text_chat_completion_with_tool_calling_and_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED
tests/client-sdk/inference/test_inference.py::test_text_chat_completion_structured_output[meta-llama/Llama-3.1-8B-Instruct] PASSED
tests/client-sdk/inference/test_inference.py::test_image_chat_completion_non_streaming[meta-llama/Llama-3.2-11B-Vision-Instruct] PASSED
tests/client-sdk/inference/test_inference.py::test_image_chat_completion_streaming[meta-llama/Llama-3.2-11B-Vision-Instruct] PASSED
tests/client-sdk/inference/test_inference.py::test_image_chat_completion_base64_url[meta-llama/Llama-3.2-11B-Vision-Instruct] PASSED
========================================================================================== 15 passed, 2 warnings in 19.46s ===========================================================================================
```
```
export LLAMA_STACK_CONFIG=/Users/sxyi/llama-stack/llama_stack/templates/fireworks/run.yaml
/opt/miniconda3/envs/stack/bin/pytest -s -v /Users/sxyi/llama-stack/tests/client-sdk/inference/test_inference.py
```
All tests passed
2) Ollama - LogProbs tests are marked as xfailed.
```
tests/client-sdk/inference/test_inference.py::test_completion_log_probs_non_streaming[meta-llama/Llama-3.1-8B-Instruct] XFAIL (remote::ollama doesn't support log probs yet)
tests/client-sdk/inference/test_inference.py::test_completion_log_probs_streaming[meta-llama/Llama-3.1-8B-Instruct] XFAIL (remote::ollama doesn't support log probs yet)
```
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
Create a new GitHub Action that runs integration tests on the fireworks and
together distros for each new PR.
**Key features:**
1) Run inference client-sdk tests on the fireworks and together distros. Load
the distro as a library
2) Pull changes from the latest GitHub repos (llama-models) and
(llama-stack-client-python)
3) Output a test summary
**Next steps:**
- Expand the CI test action to the (llama-models) and
(llama-stack-client-python) repos to make sure the changes there do not
break the imports in llama-stack
## Test Plan
See [the job run triggered by this
PR](1292666319)
Fixes: #902
For testing, verified that llama stack can run if built:
* With the default "base" conda environment
* With a new custom conda environment using the `--image-name XXX` option
In both cases llama stack starts fine (it was failing with "base" before
this patch).
CC: @ashwinb
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
We desperately need to document our APIs. This is the basic requirement
of having a Spec :)
This PR updates the OpenAPI generator so documentation for request
parameters and object fields can be properly added to the OpenAPI specs.
From there, this should get picked up by Stainless, etc.
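To make the idea concrete, here is a small sketch of how parameter documentation could be lifted from docstrings into OpenAPI parameter entries. This is an assumption about the general approach, not the generator's actual code: the `:param name:` docstring style, the helper names, and the `"in": "query"` placement are all illustrative, and only single-line descriptions are handled.

```python
import inspect
import re
from typing import Any, Callable, Dict, List

_PARAM_RE = re.compile(r"^\s*:param\s+(\w+):\s*(.+)$")


def param_docs(func: Callable[..., Any]) -> Dict[str, str]:
    """Collect `:param name: description` lines from a function docstring."""
    docs: Dict[str, str] = {}
    for line in (inspect.getdoc(func) or "").splitlines():
        match = _PARAM_RE.match(line)
        if match:
            docs[match.group(1)] = match.group(2).strip()
    return docs


def openapi_parameters(func: Callable[..., Any]) -> List[Dict[str, str]]:
    """Build OpenAPI-style parameter entries, adding descriptions if documented."""
    docs = param_docs(func)
    entries: List[Dict[str, str]] = []
    for name in inspect.signature(func).parameters:
        if name == "self":
            continue
        entry = {"name": name, "in": "query"}
        if name in docs:
            entry["description"] = docs[name]
        entries.append(entry)
    return entries


def chat_completion(model_id: str, stream: bool = False) -> None:
    """Generate a chat completion (hypothetical endpoint for illustration).

    :param model_id: Identifier of the model to use.
    :param stream: If True, return an SSE event stream.
    """
```

Calling `openapi_parameters(chat_completion)` yields one entry per argument, each carrying the description written next to the code.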
## Test Plan:
Updated client-sdk (See
https://github.com/meta-llama/llama-stack-client-python/pull/104) and
then ran:
```bash
cd tests/client-sdk
LLAMA_STACK_CONFIG=../../llama_stack/templates/fireworks/run.yaml pytest -s -v inference/test_inference.py agents/test_agents.py
```
# What does this PR do?
Allows the NVIDIA template distribution to connect to a hosted or local NIM:
- use `--env NVIDIA_BASE_URL=http://localhost:8000` to connect to a local
NIM running at localhost:8000
- use `--env NVIDIA_API_KEY=blah` when connecting to a hosted NIM, e.g.
NVIDIA_BASE_URL=https://integrate.api.nvidia.com
## Test Plan
- `llama stack run ./llama_stack/templates/nvidia/run.yaml` -> error,
e.g. API key is required for hosted NVIDIA NIM
- `llama stack run ./llama_stack/templates/nvidia/run.yaml --env
NVIDIA_BASE_URL=https://integrate.api.nvidia.com` -> error, e.g. API key
is required for hosted NVIDIA NIM
- `llama stack run ./llama_stack/templates/nvidia/run.yaml --env
NVIDIA_API_KEY=REDACTED` -> successful connection to NIM on
https://integrate.api.nvidia.com
- `llama stack run ./llama_stack/templates/nvidia/run.yaml --env
NVIDIA_BASE_URL=https://integrate.api.nvidia.com --env
NVIDIA_API_KEY=REDACTED` -> successful connection to NIM running on
integrate.api.nvidia.com
- `llama stack run ./llama_stack/templates/nvidia/run.yaml --env
NVIDIA_BASE_URL=http://localhost:8000` -> successful connection to NIM
running on localhost:8000
- `llama stack run ./llama_stack/templates/nvidia/run.yaml --env
NVIDIA_BASE_URL=http://localhost:8000 --env NVIDIA_API_KEY=REDACTED` ->
successful connection to NIM running on http://localhost:8000
- `llama stack run ./llama_stack/templates/nvidia/run.yaml --env
NVIDIA_BASE_URL=http://bogus` -> runtime error, e.g. ConnectionError
(TODO: this should be a startup error)
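The behavior exercised by the cases above can be summarized in a short sketch. It is not the adapter's real code: the function name is hypothetical, and the default base URL is inferred from the cases where only `NVIDIA_API_KEY` is set.

```python
from typing import Mapping, Optional, Tuple
from urllib.parse import urlparse

# Assumption: the hosted endpoint is the default when no base URL is given.
HOSTED_NIM_URL = "https://integrate.api.nvidia.com"


def resolve_nim_settings(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return (base_url, api_key), requiring a key only for the hosted NIM."""
    base_url = env.get("NVIDIA_BASE_URL", HOSTED_NIM_URL)
    api_key = env.get("NVIDIA_API_KEY")
    is_hosted = urlparse(base_url).hostname == urlparse(HOSTED_NIM_URL).hostname
    if is_hosted and not api_key:
        raise ValueError("API key is required for hosted NVIDIA NIM")
    return base_url, api_key
```

A local NIM passes with or without a key, while the hosted endpoint fails fast when the key is missing. The bogus-URL case is not covered here, since it only surfaces at connection time.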
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
Fix type mismatch in /v1/inference/completion.
## Test Plan
`llama stack run ./llama_stack/templates/nvidia/run.yaml`
`LLAMA_STACK_BASE_URL="http://localhost:8321" pytest -v
tests/client-sdk/inference/test_inference.py`
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
ClientVersion: We don't need each SDK method to support this parameter
because you wouldn't be passing a different client version each time you
make an API call.
ProviderData: although in this case, you _could_ be passing different
API keys depending on which SDK call you make, it makes for a confusing
experience. It is best to initialize the LlamaStackClient with all the
keys which are then passed in each request.
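The design choice above, configuring provider data once on the client rather than per call, can be sketched as follows. This is a toy stand-in, not `LlamaStackClient`: the class, the header name, and the key name in the usage are all hypothetical.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical header name, used only for this sketch.
PROVIDER_DATA_HEADER = "X-Example-Provider-Data"


class SketchClient:
    """Toy client that attaches provider data to every request it builds."""

    def __init__(self, base_url: str, provider_data: Optional[Dict[str, Any]] = None) -> None:
        self.base_url = base_url.rstrip("/")
        self._default_headers: Dict[str, str] = {}
        if provider_data:
            # Serialized once at construction, then reused for each request.
            self._default_headers[PROVIDER_DATA_HEADER] = json.dumps(provider_data)

    def build_request(self, path: str, body: Dict[str, Any]) -> Dict[str, Any]:
        """Return a request description; no network call is made."""
        return {
            "url": f"{self.base_url}{path}",
            "headers": dict(self._default_headers),
            "json": body,
        }


client = SketchClient("http://localhost:8321/", provider_data={"example_api_key": "secret"})
request = client.build_request("/v1/inference/chat-completion", {"messages": []})
```

Individual methods then take no client version or provider data arguments, which keeps their signatures focused on the API call itself.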
Chroma method had the wrong signature.
## Test Plan
Start Chroma: `chroma run --path /tmp/foo/chroma2 --host localhost
--port 6001`
Modify run.yaml to include Chroma server pointing to localhost:6001 and
run `llama stack run`
Then:
```bash
LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -s -v agents/test_agents.py -k rag
```
passes
Stainless ends up reformatting the YAML when we paste it in the Studio.
We cannot have that happen if we are going to ever partially automate
stainless config updates.
Try ruamel.yaml, specifically `block_seq_indent` to avoid that.
# What does this PR do?
Add a run command for the stack on Windows.
- [x] Addresses issue (#issue)
## Test Plan
Please describe:
- tests you ran to verify your changes with result summaries.
- provide instructions so it can be reproduced.
## Sources
https://github.com/meta-llama/llama-stack/pull/889
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
This PR implements Windows support for running build_container.sh
from the terminal. Additionally, it resolves the "no support for
termios and PTY on Windows" issues.
- [x] Addresses issue (#issue)
Related issues: https://github.com/meta-llama/llama-stack/issues/826,
https://github.com/meta-llama/llama-stack/issues/726
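Python's `pty` module depends on `termios`, which is not available on Windows, so a portable runner needs a fallback. The sketch below shows one way to structure that; it is an illustration under that assumption rather than the code in this PR, and `run_command` with its `use_pty` parameter is a hypothetical helper.

```python
import os
import subprocess
import sys
from typing import List, Optional


def run_command(command: List[str], use_pty: Optional[bool] = None) -> int:
    """Run a command and return its exit code, using a PTY only where available."""
    if use_pty is None:
        # Default: try a PTY everywhere except Windows.
        use_pty = sys.platform != "win32"
    if use_pty:
        try:
            import pty  # Unix only; importing fails where termios is missing
        except ImportError:
            use_pty = False
    if use_pty:
        status = pty.spawn(command)
        return os.waitstatus_to_exitcode(status)  # Python 3.9+
    # Fallback path: plain subprocess without terminal emulation.
    return subprocess.run(command, check=False).returncode
```

For example, `run_command([sys.executable, "-c", "print('ok')"], use_pty=False)` exercises the same fallback path that Windows would take.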
## Test Plan
Changes were tested manually by executing standard scripts from the Llama
guide:
- llama stack build --template ollama --image-type container
- llama stack build --list-templates
- llama stack build
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
Add response format for agents structured output.
- [ ] Using structured output for agents (interior_design app as an
example) (#issue)
https://github.com/meta-llama/llama-stack-apps/issues/122
## Test Plan
E2E test plan with the llama-stack-apps interior_design example.
Start your distro:
`llama stack run llama_stack/templates/fireworks/run.yaml --env FIREWORKS_API_KEY=<API_KEY>`
Run the API test:
```PYTHONPATH=. python examples/interior_design_assistant/api.py localhost 5000 examples/interior_design_assistant/resources/documents/ examples/interior_design_assistant/resources/images/fireplaces```
## Sources
Results:
https://github.com/meta-llama/llama-stack-client-python/pull/72
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
Fixed report generation:
1) Do not initialize a new client in report.py; instead get it from the
pytest fixture
2) Add "provider" for the "safety" and "agents" sections
3) Add logprobs functionality in the "inference" section
## Test Plan
See the regenerated report
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
Fixing the bullets
# What does this PR do?
The bullets were not there as intended so I helped fix them.
- [x] Addresses issue (#issue)
## Test Plan
Please describe:
Ran the test, and the bullets are there now to be consistent with the
page.
## Sources
N/A
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
- Fix loading SambaNovaImpl issue
- Add LlamaGuard model support for inference
## Test Plan
Run the following unit test scripts; the results follow each command.
### Embedding
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_embeddings.py --inference-model meta-llama/Llama-3.2-11B-Vision-Instruct --env SAMBANOVA_API_KEY={SAMBANOVA_API_KEY}
```
```
llama_stack/providers/tests/inference/test_embeddings.py::TestEmbeddings::test_embeddings[-sambanova] SKIPPED (This test is only applicable for embedding models)
llama_stack/providers/tests/inference/test_embeddings.py::TestEmbeddings::test_batch_embeddings[-sambanova] SKIPPED (This test is only applicable for embedding models)
=================================================================================================================== 2 skipped, 1 warning in 0.32s ===================================================================================================================
```
### Vision
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_vision_inference.py --inference-model meta-llama/Llama-3.2-11B-Vision-Instruct --env SAMBANOVA_API_KEY={SAMBANOVA_API_KEY}
```
```
llama_stack/providers/tests/inference/test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_non_streaming[-sambanova-image0-expected_strings0] PASSED
llama_stack/providers/tests/inference/test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_non_streaming[-sambanova-image1-expected_strings1] PASSED
llama_stack/providers/tests/inference/test_vision_inference.py::TestVisionModelInference::test_vision_chat_completion_streaming[-sambanova] PASSED
=================================================================================================================== 3 passed, 1 warning in 2.68s ====================================================================================================================
```
### Text
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_streaming --env SAMBANOVA_API_KEY={SAMBANOVA_API_KEY}
```
```
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_streaming[-sambanova] PASSED
=================================================================================================================== 1 passed, 1 warning in 0.46s ====================================================================================================================
```
```
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_non_streaming --env SAMBANOVA_API_KEY={SAMBANOVA_API_KEY}
```
```
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_non_streaming[-sambanova] PASSED
=================================================================================================================== 1 passed, 1 warning in 0.48s ====================================================================================================================
```
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [x] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.
# What does this PR do?
When you re-initialize the library client in a notebook, we were seeing
this error:
```
Getting traces for session_id=5c8d1969-0957-49d2-b852-32cbb8ef8caf
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
[<ipython-input-11-d74bb6cdd3ab>](https://localhost:8080/#) in <cell line: 0>()
7 agent_logs = []
8
----> 9 for span in client.telemetry.query_spans(
10 attribute_filters=[
11 {"key": "session_id", "op": "eq", "value": session_id},
10 frames
[/usr/local/lib/python3.11/dist-packages/llama_stack/providers/inline/telemetry/meta_reference/telemetry.py](https://localhost:8080/#) in query_traces(self, attribute_filters, limit, offset, order_by)
246 ) -> QueryTracesResponse:
247 return QueryTracesResponse(
--> 248 data=await self.trace_store.query_traces(
249 attribute_filters=attribute_filters,
250 limit=limit,
AttributeError: 'TelemetryAdapter' object has no attribute 'trace_store'
```
This is happening because we were skipping some required steps for
the object state as part of the global _TRACE_PROVIDER check. This PR
moves the initialization of the object state out of the _TRACE_PROVIDER
init.
# What does this PR do?
In short, provide a summary of what this PR does and why. Usually, the
relevant context should be present in a linked issue.
- [ ] Addresses issue (#issue)
## Test Plan
Please describe:
- tests you ran to verify your changes with result summaries.
- provide instructions so it can be reproduced.
## Sources
Please link relevant resources if necessary.
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
Earlier, we relied on some undocumented magic to identify the path for
remote endpoints when testing client_sdk/tests.
That has been removed; you now have to pass a path explicitly.
The previous curl command was wrong and did not actually check for
version correctly (status code was always 200 regardless of what you
retrieved.)
Also added tagging latest. cc @wukaixingxp
# What does this PR do?
Fix documentation to reflect new API
## Test Plan
Before:
User> What are the top 5 topics that were explained? Only list succinct
bullet points.
inference> I'm ready to help, but we haven't discussed any topics yet!
This is the start of our conversation. What would you like to talk
about? I can summarize our discussion at the end if you'd like.
Run with the change, observe relevant response
<img width="1029" alt="image"
src="https://github.com/user-attachments/assets/a7dece3c-e8b4-4a60-9092-ba544c87dffd"
/>
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
Co-authored-by: Eric Huang (AI Platform) <erichuang@fb.com>
# What does this PR do?
Previously the tests hard-coded the tool prompt format to json, which
causes them to fail when using the 3.2/3.3 family of models. This change
makes the default none for the agent config and removes the
specification from the tests.
## Test Plan
LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -v
tests/client-sdk/agents/test_agents.py