llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-06-28 19:04:19 +00:00

Author	SHA1	Message	Date
Ashwin Bharambe	034ece0011	Ensure that deprecations for fields follow through to OpenAPI	2025-02-19 13:54:04 -08:00
Ashwin Bharambe	31a5ba5268	Add title to the json schemas	2025-02-19 13:26:39 -08:00
ehhuang	8de7cf103b	feat: support tool_choice = {required, none, <function>} (#1059 ) Summary: titled Test Plan: added tests and LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/ --safety-shield meta-llama/Llama-Guard-3-8B	2025-02-18 23:25:15 -05:00
Yuan Tang	64328bfe62	fix: enable_session_persistence in AgentConfig should be optional (#1012 ) # What does this PR do? This issue was discovered in https://github.com/meta-llama/llama-stack/pull/1009#discussion_r1947036518. ## Test Plan This field is no longer required after the change. [//]: # (## Documentation) [//]: # (- [ ] Added a Changelog entry if the change is significant) --------- Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-02-14 09:19:53 -08:00
Xi Yan	da53dc3f5f	fix: openapi for eval-task (#1085 ) # What does this PR do? - as title ## Test Plan - the deprecated endpoint need to obey what it was before [//]: # (## Documentation)	2025-02-13 17:10:45 -08:00
Xi Yan	8b655e3cd2	fix!: update eval-tasks -> benchmarks (#1032 ) # What does this PR do? - Update `/eval-tasks` to `/benchmarks` - ⚠️ Remove differentiation between `app` v.s. `benchmark` eval task config. Now we only have `BenchmarkConfig`. The overloaded `benchmark` is confusing and do not add any value. Backward compatibility is being kept as the "type" is not being used anywhere. [//]: # (If resolving an issue, uncomment and update the line below) [//]: # (Closes #[issue-number]) ## Test Plan - This change is backward compatible - Run notebook test with ``` pytest -v -s --nbval-lax ./docs/getting_started.ipynb pytest -v -s --nbval-lax ./docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb ``` <img width="846" alt="image" src="https://github.com/user-attachments/assets/d2fc06a7-593a-444f-bc1f-10ab9b0c843d" /> [//]: # (## Documentation) [//]: # (- [ ] Added a Changelog entry if the change is significant) --------- Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com> Signed-off-by: Ben Browning <bbrownin@redhat.com> Signed-off-by: Sébastien Han <seb@redhat.com> Signed-off-by: reidliu <reid201711@gmail.com> Co-authored-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com> Co-authored-by: Ben Browning <ben324@gmail.com> Co-authored-by: Sébastien Han <seb@redhat.com> Co-authored-by: Reid <61492567+reidliu41@users.noreply.github.com> Co-authored-by: reidliu <reid201711@gmail.com> Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>	2025-02-13 16:40:58 -08:00
Dinesh Yeduguru	d8a20e034b	feat: make telemetry attributes be dict[str,PrimitiveType] (#1055 ) # What does this PR do? Make attributes in telemetry be only primitive types and avoid arbitrary nesting. ## Test Plan ``` LLAMA_STACK_DISABLE_VERSION_CHECK=true llama stack run ~/.llama/distributions/fireworks/fireworks-run.yaml LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -v tests/client-sdk/agents/test_agents.py -k "test_builtin_tool_web_search" # Verified that attributes still show up correclty in jaeger ```	2025-02-11 15:10:17 -08:00
Dinesh Yeduguru	ab7f802698	feat: add MetricResponseMixin to chat completion response types (#1050 ) # What does this PR do? Defines a MetricResponseMixin which can be inherited by any response class. Adds it to chat completion response types. This is a short term solution to allow inference API to return metrics The ideal way to do this is to have a way for all response types to include metrics and all metric events logged to the telemetry API to be included with the response To do this, we will need to augment all response types with a metrics field. We have hit a blocker from stainless SDK that prevents us from doing this. The blocker is that if we were to augment the response types that have a data field in them like so class ListModelsResponse(BaseModel): metrics: Optional[List[MetricEvent]] = None data: List[Models] ... The client SDK will need to access the data by using a .data field, which is not ergonomic. Stainless SDK does support unwrapping the response type, but it requires that the response type to only have a single field. We will need a way in the client SDK to signal that the metrics are needed and if they are needed, the client SDK has to return the full response type without unwrapping it. ## Test Plan sh run_openapi_generator.sh ./ sh stainless_sync.sh dineshyv/dev add-metrics-to-resp-v4 LLAMA_STACK_CONFIG="/Users/dineshyv/.llama/distributions/fireworks/fireworks-run.yaml" pytest -v tests/client-sdk/agents/test_agents.py	2025-02-11 14:58:12 -08:00
ehhuang	3922999118	sys_prompt support in Agent (#938 ) # What does this PR do? The current default system prompt for llama3.2 tends to overindex on tool calling and doesn't work well when the prompt does not require tool calling. This PR adds an option to override the default system prompt, and organizes tool-related configs into a new config object. - [ ] Addresses issue (#issue) ## Test Plan LLAMA_STACK_CONFIG=together pytest \-\-inference\-model=meta\-llama/Llama\-3\.3\-70B\-Instruct -s -v tests/client-sdk/agents/test_agents.py::test_override_system_message_behavior ## Sources Please link relevant resources if necessary. ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [ ] Ran pre-commit to handle lint / formatting issues. - [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section? - [ ] Updated relevant documentation. - [ ] Wrote necessary unit or integration tests.	2025-02-05 21:11:32 -08:00
Ashwin Bharambe	474c4bdd7a	Make a couple properties optional (#963 )	2025-02-04 16:20:24 -08:00
Ashwin Bharambe	d67401c644	Several documentation fixes and fix link to API reference	2025-02-04 14:00:43 -08:00

1 2

61 commits