docs: Add changelog for v0.2.2 and v0.2.3

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Yuan Tang 2025-04-26 21:07:48 -04:00
parent 6cf6791de1
commit d840037a15

@@ -1,5 +1,33 @@
# Changelog
# v0.2.3
Published on: 2025-04-25T22:46:21Z
## Highlights
* OpenAI-compatible inference endpoints and client-SDK support; `client.chat.completions.create()` now works
* Significant improvements and new functionality added to the NVIDIA distribution
* Many improvements to the test verification suite
* New inference providers: Ramalama, IBM WatsonX
* Many improvements to the Playground UI
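The OpenAI-compatible endpoints can be called with any OpenAI-style client. As a dependency-free sketch of what such a request looks like on the wire, here is a minimal example; the base URL, path prefix, and model id below are assumptions about a local deployment, not values taken from this changelog:

```python
import json
import urllib.request

# Hypothetical base URL: the host, port, and OpenAI-compatible path prefix
# depend on how your Llama Stack distribution is deployed.
BASE_URL = "http://localhost:8321/v1/openai/v1"

def build_chat_request(model: str, messages: list) -> dict:
    """Build an OpenAI-style /chat/completions request body."""
    return {"model": model, "messages": messages}

def send_chat_request(body: dict) -> dict:
    """POST the request body to the server and decode the JSON response."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

body = build_chat_request(
    "meta-llama/Llama-3.3-70B-Instruct",  # example id; use a model registered with your server
    [{"role": "user", "content": "Say hello."}],
)
# send_chat_request(body)  # requires a running Llama Stack server
```

In practice you would point the official `openai` SDK at the same base URL instead of hand-rolling requests; the sketch only illustrates the request shape.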
---
# v0.2.2
Published on: 2025-04-13T01:19:49Z
## Main changes
- Bring Your Own Provider (@leseb) - run the distribution server with out-of-tree provider code
- OpenAI compatible inference API in progress (@bbrowning)
- Provider verifications (@ehhuang)
- Many updates and fixes to playground
- Several llama4 related fixes
---
# v0.2.1
Published on: 2025-04-05T23:13:00Z
@@ -10,7 +38,7 @@ Published on: 2025-04-05T23:13:00Z
# v0.2.0
Published on: 2025-04-05T19:04:29Z
## Llama 4 Support
Check out more at https://www.llama.com
@@ -85,23 +113,23 @@ Published on: 2025-03-14T22:30:51Z
### Build and Test Agents
* Inference: ImageType is now refactored to LlamaStackImageType
* Inference: Added tests to measure TTFT
* Inference: Bring back usage metrics
* Agents: Added endpoints for get agent, list agents, and list sessions
* Agents: Automated conversion of type hints in client tools to litellm format
* Agents: Deprecated ToolResponseMessage in agent.resume API
* Added Provider API for listing and inspecting provider info
### Agent Evals and Model Customization
* Eval: Added new eval benchmarks Math 500 and BFCL v3
### Deploy and Monitoring of Agents
* Telemetry: Fix tracing to work across coroutines
### Better Engineering
* Display code coverage for unit tests
* Updated call sites (inference, tool calls, agents) to move to async non-blocking calls
* Unit tests also run on Python 3.11, 3.12, and 3.13
* Added ollama inference to integration tests CI
* Improved documentation across examples, testing, and CLI, and updated the providers table
@@ -170,12 +198,12 @@ Published on: 2025-02-28T18:14:01Z
### Monitor agents
* More comprehensive logging of agent steps, including client tools
* Telemetry inputs/outputs are now structured and queryable
* Ability to retrieve agent sessions, turns, and steps by IDs
### Better Engineering
* Moved executorch Swift code out of this repo into the llama-stack-client-swift repo, similar to kotlin
* Moved most logging to use logger instead of prints
* Completed text /chat-completion and /completion tests
---
@@ -213,11 +241,11 @@ Here are the key changes coming as part of this release:
# v0.1.3
Published on: 2025-02-14T20:24:32Z
## v0.1.3 Release
Here are some key changes that are coming as part of this release.
### Build and Test Agents
Streamlined the initial development experience
- Added support for `llama stack run --image-type venv`
- Enhanced vector store options with the new sqlite-vec provider and improved Qdrant integration
@@ -227,7 +255,7 @@ Streamlined the initial development experience
### Agent Evals
Better benchmarking and Agent performance assessment
- Renamed eval API /eval-task to /benchmarks
- Improved documentation and notebooks for RAG and evals
### Deploy and Monitoring of Agents
Improved production readiness
@@ -240,7 +268,7 @@ Improved production readiness
### Better Engineering
Infrastructure and code quality improvements
- Faster text-based chat completion tests
- Improved testing for non-streaming agent APIs
- Standardized import formatting with the ruff linter
- Added conventional commits standard
- Fixed documentation parsing issues
@@ -254,9 +282,9 @@ Published on: 2025-02-07T22:06:49Z
# TL;DR
- Several stabilizations to development flows after the switch to `uv`
- Migrated CI workflows to the new OSS repo - [llama-stack-ops](https://github.com/meta-llama/llama-stack-ops)
- Added automated rebuilds for ReadTheDocs
- Llama Stack server now supports HTTPS
- Added support for system prompt overrides
- Several bug fixes and improvements to documentation (check out the Kubernetes deployment guide by @terrytangyuan)
@@ -273,10 +301,10 @@ A bunch of small / big improvements everywhere including support for Windows, sw
# v0.1.0
Published on: 2025-01-24T17:47:47Z
We are excited to announce a stable API release of Llama Stack, which enables developers to build RAG applications and Agents using tools and safety shields, monitor those agents with telemetry, and evaluate them with scoring functions.
## Context
GenAI application developers need more than just an LLM - they need to integrate tools, connect with their data sources, establish guardrails, and ground the LLM responses effectively. Currently, developers must piece together various tools and APIs, complicating the development lifecycle and increasing costs. The result is that developers spend more time on these integrations than on the application logic itself. The bespoke coupling of components also makes it challenging to adopt state-of-the-art solutions in the rapidly evolving GenAI space. This is particularly difficult for open models like Llama, as best practices are not widely established in the open.
Llama Stack was created to provide developers with a comprehensive and coherent interface that simplifies AI application development and codifies best practices across the Llama ecosystem. Since our launch in September 2024, we have seen a huge uptick in interest in Llama Stack APIs from both AI developers and partners building AI services with Llama models. Partners like Nvidia, Fireworks, and Ollama have collaborated with us to develop implementations across various APIs, including inference, memory, and safety.
@@ -410,3 +438,4 @@ Published on: 2024-11-20T22:18:00Z
---