llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-17 07:52:37 +00:00

Author	SHA1	Message	Date
Jash Gulabrai	4999c8f9cc	fix: missing key	2025-05-06 11:21:38 -04:00
Jash Gulabrai	b1d941e1f0	Merge branch 'main' into nvidia-e2e-notebook	2025-05-06 11:12:34 -04:00
Divya	3022f7b642	feat: Adding TLS support for Remote::Milvus vector_io (#2011 ) # What does this PR do? For the Issue :- #[2010](https://github.com/meta-llama/llama-stack/issues/2010) Currently, if we try to connect the Llama stack server to a remote Milvus instance that has TLS enabled, the connection fails because TLS support is not implemented in the Llama stack codebase. As a result, users are unable to use secured Milvus deployments out of the box. After adding this , the user will be able to connect to remote::Milvus which is TLS enabled . if TLS enabled :- ``` vector_io: - provider_id: milvus provider_type: remote::milvus config: uri: "http://<host>:<port>" token: "<user>:<password>" secure: True server_pem_path: "path/to/server.pem" ``` [//]: # (If resolving an issue, uncomment and update the line below) [//]: # (Closes #[issue-number]) ## Test Plan I have already tested it by connecting to a Milvus instance which is TLS enabled and i was able to start llama stack server .	2025-05-06 14:15:34 +02:00
Christina Xu	65cc971877	docs: Add TrustyAI LM-Eval to list of known external providers (#2020 ) # What does this PR do? Adds documentation for the remote [TrustyAI LM-Eval Eval Provider](https://github.com/trustyai-explainability/llama-stack-provider-lmeval). LM-Eval is a service for large language model evaluation based on the open source project [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) and is integrated into the [TrustyAI Kubernetes Operator](https://trustyai-explainability.github.io/trustyai-site/main/trustyai-operator.html).	2025-05-06 14:11:55 +02:00
Sébastien Han	a5d151e912	docs: fix typo mivus.md -> milvus.md (#2102 ) Signed-off-by: Sébastien Han <seb@redhat.com>	2025-05-05 09:48:38 -07:00
Ihar Hrachyshka	16e163da0e	docs: List external kubeflow pipelines provider prototype (#2100 ) # What does this PR do? Lists another external provider example (kfp). Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>	2025-05-05 10:24:52 +02:00
Christian Zaccaria	9f27578929	fix: improve Mermaid diagram visibility in dark mode (#2092 ) # What does this PR do? Closes #2078 Previously, the Agent Execution Loop diagram was barely visible in dark mode: ![image](https://github.com/user-attachments/assets/78567334-c57f-4cd0-ba93-290b20ed3aba) I experimented with styling individual classes, but ultimately found that adding an off-white background provides the best visibility in both dark and light modes: ![image](https://github.com/user-attachments/assets/419d153a-d870-410b-b635-02b95da67a3d) [//]: # (If resolving an issue, uncomment and update the line below) [//]: # (Closes #[issue-number]) ## Test Plan The documentation can be built locally by following the docs: https://llama-stack.readthedocs.io/en/latest/contributing/index.html#building-the-documentation [//]: # (## Documentation)	2025-05-02 13:09:45 -07:00
Ashwin Bharambe	272d3359ee	fix: remove code interpeter implementation (#2087 ) # What does this PR do? The builtin implementation of code interpreter is not robust and has a really weak sandboxing shell (the `bubblewrap` container). Given the availability of better MCP code interpreter servers coming up, we should use them instead of baking an implementation into the Stack and expanding the vulnerability surface to the rest of the Stack. This PR only does the removal. We will add examples with how to integrate with MCPs in subsequent ones. ## Test Plan Existing tests.	2025-05-01 14:35:08 -07:00
Ihar Hrachyshka	9e6561a1ec	chore: enable pyupgrade fixes (#1806 ) # What does this PR do? The goal of this PR is code base modernization. Schema reflection code needed a minor adjustment to handle UnionTypes and collections.abc.AsyncIterator. (Both are preferred for latest Python releases.) Note to reviewers: almost all changes here are automatically generated by pyupgrade. Some additional unused imports were cleaned up. The only change worth of note can be found under `docs/openapi_generator` and `llama_stack/strong_typing/schema.py` where reflection code was updated to deal with "newer" types. Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>	2025-05-01 14:23:50 -07:00
Derek Higgins	64829947d0	feat: Add temperature support to responses API (#2065 ) # What does this PR do? Add support for the temperature to the responses API ## Test Plan Manually tested simple case unit tests added for simple case and tool calls Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-05-01 11:47:58 -07:00
Sébastien Han	dc94433072	feat(pre-commit): enhance pre-commit hooks with additional checks (#2014 ) # What does this PR do? Add several new pre-commit hooks to improve code quality and security: - no-commit-to-branch: prevent direct commits to protected branches like `main` - check-yaml: validate YAML files - detect-private-key: prevent accidental commit of private keys - requirements-txt-fixer: maintain consistent requirements.txt format and sorting - mixed-line-ending: enforce LF line endings to avoid mixed line endings - check-executables-have-shebangs: ensure executable scripts have shebangs - check-json: validate JSON files - check-shebang-scripts-are-executable: verify shebang scripts are executable - check-symlinks: validate symlinks and report broken ones - check-toml: validate TOML files mainly for pyproject.toml The respective fixes have been included. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-04-30 11:35:49 -07:00
Nathan Weinberg	d897313e0b	feat: add additional logging to llama stack build (#1689 ) # What does this PR do? Partial revert of `fa68ded07c` this commit ensures users know where their new templates are generated and how to run the newly built distro locally discussion on Discord: `1351652390` ## Test Plan Did a local run - let me know if we want any unit testing covering this ![Screenshot from 2025-03-18 22-38-18](https://github.com/user-attachments/assets/6d5dac52-edad-4a84-992f-a3c23cda10c8) ## Documentation Updated "Zero to Hero" guide with new output --------- Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-04-30 11:06:24 -07:00
Jash Gulabrai	012dd6891f	Merge branch 'main' into nvidia-e2e-notebook	2025-04-30 12:05:11 -04:00
Jash Gulabrai	f8f59c8335	fix: Update datasets metadata field from provider to provider_id	2025-04-30 10:52:12 -04:00
Ashwin Bharambe	4d0bfbf984	feat: add api.llama provider, llama-guard-4 model (#2058 ) This PR adds a llama-stack inference provider for `api.llama.com`, as well as adds entries for Llama-Guard-4 and updated Prompt-Guard models.	2025-04-29 10:07:41 -07:00
Jash Gulabrai	96afc98b88	Add reference to notebook in docs	2025-04-29 13:06:43 -04:00
Jash Gulabrai	2f60f3c347	fix: Consistently prefix customized models with the namespace	2025-04-29 12:57:49 -04:00
Ben Browning	8dfce2f596	feat: OpenAI Responses API (#1989 ) # What does this PR do? This provides an initial [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses) implementation. The API is not yet complete, and this is more a proof-of-concept to show how we can store responses in our key-value stores and use them to support the Responses API concepts like `previous_response_id`. ## Test Plan I've added a new `tests/integration/openai_responses/test_openai_responses.py` as part of a test-driven development for this new API. I'm only testing this locally with the remote-vllm provider for now, but it should work with any of our inference providers since the only API it requires out of the inference provider is the `openai_chat_completion` endpoint. ``` VLLM_URL="http://localhost:8000/v1" \ INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \ llama stack build --template remote-vllm --image-type venv --run ``` ``` LLAMA_STACK_CONFIG="http://localhost:8321" \ python -m pytest -v \ tests/integration/openai_responses/test_openai_responses.py \ --text-model "meta-llama/Llama-3.2-3B-Instruct" ``` --------- Signed-off-by: Ben Browning <bbrownin@redhat.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-04-28 14:06:00 -07:00
Sébastien Han	79851d93aa	feat: Add Kubernetes authentication (#1778 ) # What does this PR do? This commit adds a new authentication system to the Llama Stack server with support for Kubernetes and custom authentication providers. Key changes include: - Implemented KubernetesAuthProvider for validating Kubernetes service account tokens - Implemented CustomAuthProvider for validating tokens against external endpoints - this is the same code that was already present. - Added test for Kubernetes - Updated server configuration to support authentication settings - Added documentation for authentication configuration and usage The authentication system supports: - Bearer token validation - Kubernetes service account token validation - Custom authentication endpoints ## Test Plan Setup a Kube cluster using Kind or Minikube. Run a server with: ``` server: port: 8321 auth: provider_type: kubernetes config: api_server_url: http://url ca_cert_path: path/to/cert (optional) ``` Run: ``` curl -s -L -H "Authorization: Bearer $(kubectl create token my-user)" http://127.0.0.1:8321/v1/providers ``` Or replace "my-user" with your service account. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-04-28 22:24:58 +02:00
Jash Gulabrai	29f57d528d	Remove unused env vars; change the other tmp folder name; fix examples	2025-04-28 13:08:36 -04:00
Rashmi Pawar	e6bbf8d20b	feat: Add NVIDIA NeMo datastore (#1852 ) # What does this PR do? Implemetation of NeMO Datastore register, unregister API. Open Issues: - provider_id gets set to `localfs` in client.datasets.register() as it is specified in routing_tables.py: DatasetsRoutingTable see: #1860 Currently I have passed `"provider_id":"nvidia"` in metadata and have parsed that in `DatasetsRoutingTable` (Not the best approach, but just a quick workaround to make it work for now.) ## Test Plan - Unit test cases: `pytest tests/unit/providers/nvidia/test_datastore.py` ```bash ========================================================== test session starts =========================================================== platform linux -- Python 3.10.0, pytest-8.3.5, pluggy-1.5.0 rootdir: /home/ubuntu/llama-stack configfile: pyproject.toml plugins: anyio-4.9.0, asyncio-0.26.0, nbval-0.11.0, metadata-3.1.1, html-4.1.1, cov-6.1.0 asyncio: mode=strict, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function collected 2 items tests/unit/providers/nvidia/test_datastore.py .. [100%] ============================================================ warnings summary ============================================================ ====================================================== 2 passed, 1 warning in 0.84s ====================================================== ``` cc: @dglogo, @mattf, @yanxi0830	2025-04-28 09:41:59 -07:00
Jash Gulabrai	e64961697a	Rename tmp dir to sample_data; remove print statements	2025-04-28 12:04:36 -04:00
Jash Gulabrai	73275f07b7	Merge branch 'main' into nvidia-e2e-notebook	2025-04-28 12:00:11 -04:00
Derek Higgins	0e4307de0f	docs: Fix missing --gpu all flag in Docker run commands (#2026 ) adding the --gpu all flag to Docker run commands for meta-reference-gpu distributions ensures models are loaded into GPU instead of CPU. Remove docs for meta-reference-quantized-gpu The distribution was removed in #1887 but these files were left behind. Fixes: #1798 # What does this PR do? Fixes doc to add --gpu all command to docker run [//]: # (If resolving an issue, uncomment and update the line below) Closes #1798 ## Test Plan [Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed.] verified in docker documentation but untested --------- Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-04-25 12:17:31 -07:00
Sajikumar JS	1bb1d9b2ba	feat: Add watsonx inference adapter (#1895 ) # What does this PR do? IBM watsonx ai added as the inference [#1741 ](https://github.com/meta-llama/llama-stack/issues/1741) [//]: # (If resolving an issue, uncomment and update the line below) [//]: # (Closes #[issue-number]) --------- Co-authored-by: Sajikumar JS <sajikumar.js@ibm.com>	2025-04-25 11:29:21 -07:00
Kevin Postlethwait	d9e00fca66	fix: specify nbformat version in nb (#2023 ) # What does this PR do? Adding nbformat version fixes this issue. Not sure exactly why this needs to be done, but this version was rewritten to the bottom of a nb file when I changed its name trying to get to the bottom of this. When I opened it on GH the issue was no longer present Closes #1837 ## Test Plan N/A	2025-04-25 10:10:37 +02:00
Rashmi Pawar	ace82836c1	feat: NVIDIA allow non-llama model registration (#1859 ) # What does this PR do? Adds custom model registration functionality to NVIDIAInferenceAdapter which let's the inference happen on: - post-training model - non-llama models in API Catalogue(behind https://integrate.api.nvidia.com and endpoints compatible with AyncOpenAI) ## Example Usage: ```python from llama_stack.apis.models import Model, ModelType from llama_stack.distribution.library_client import LlamaStackAsLibraryClient client = LlamaStackAsLibraryClient("nvidia") _ = client.initialize() client.models.register( model_id=model_name, model_type=ModelType.llm, provider_id="nvidia" ) response = client.inference.chat_completion( model_id=model_name, messages=[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Write a limerick about the wonders of GPU computing."}], ) ``` ## Test Plan ```bash pytest tests/unit/providers/nvidia/test_supervised_fine_tuning.py ========================================================== test session starts =========================================================== platform linux -- Python 3.10.0, pytest-8.3.5, pluggy-1.5.0 rootdir: /home/ubuntu/llama-stack configfile: pyproject.toml plugins: anyio-4.9.0 collected 6 items tests/unit/providers/nvidia/test_supervised_fine_tuning.py ...... [100%] ============================================================ warnings summary ============================================================ ../miniconda/envs/nvidia-1/lib/python3.10/site-packages/pydantic/fields.py:1076 /home/ubuntu/miniconda/envs/nvidia-1/lib/python3.10/site-packages/pydantic/fields.py:1076: PydanticDeprecatedSince20: Using extra keyword arguments on `Field` is deprecated and will be removed. Use `json_schema_extra` instead. (Extra keys: 'contentEncoding'). Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.11/migration/ warn( -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ====================================================== 6 passed, 1 warning in 1.51s ====================================================== ``` [//]: # (## Documentation) Updated Readme.md cc: @dglogo, @sumitb, @mattf	2025-04-24 17:13:33 -07:00
Jash Gulabrai	cc77f79f55	feat: Add NVIDIA Eval integration (#1890 ) # What does this PR do? This PR adds support for NVIDIA's NeMo Evaluator API to the Llama Stack eval module. The integration enables users to evaluate models via the Llama Stack interface. ## Test Plan [Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed.] 1. Added unit tests and successfully ran from root of project: `./scripts/unit-tests.sh tests/unit/providers/nvidia/test_eval.py` ``` tests/unit/providers/nvidia/test_eval.py::TestNVIDIAEvalImpl::test_job_cancel PASSED tests/unit/providers/nvidia/test_eval.py::TestNVIDIAEvalImpl::test_job_result PASSED tests/unit/providers/nvidia/test_eval.py::TestNVIDIAEvalImpl::test_job_status PASSED tests/unit/providers/nvidia/test_eval.py::TestNVIDIAEvalImpl::test_register_benchmark PASSED tests/unit/providers/nvidia/test_eval.py::TestNVIDIAEvalImpl::test_run_eval PASSED ``` 2. Verified I could build the Llama Stack image: `LLAMA_STACK_DIR=$(pwd) llama stack build --template nvidia --image-type venv` Documentation added to `llama_stack/providers/remote/eval/nvidia/README.md` --------- Co-authored-by: Jash Gulabrai <jgulabrai@nvidia.com>	2025-04-24 17:12:42 -07:00
Jash Gulabrai	e24959ea9e	Fix variable name	2025-04-24 10:41:38 -04:00
Charlie Doern	a673697858	chore: rename ramalama provider (#2008 ) # What does this PR do? the ramalama team has decided to rename their external provider `ramalama-stack` (more catchy!). Update docs accordingly Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-04-24 09:34:15 +02:00
Nathan Weinberg	6a44e7ba20	docs: add API to external providers table (#2006 ) Also does a minor reorg of the columns Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-04-23 15:58:10 +02:00
Kevin Postlethwait	e0fa67c81c	docs: add examples for how to define RAG docs (#1981 ) # What does this PR do? Add examples for how to define RAGDocuments. Not sure if this is the best place for these docs. @raghotham Please advise ## Test Plan None, documentation [//]: # (## Documentation) Signed-off-by: Kevin <kpostlet@redhat.com>	2025-04-23 15:39:18 +02:00
Nathan Weinberg	d6e88e0bc6	docs: add RamaLama to list of known external providers (#2004 ) The RamaLama project now has an external provider offering for Llama Stack: https://github.com/containers/llama-stack-provider-ramalama See also: https://github.com/meta-llama/llama-stack/pull/1676 Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-04-23 09:44:18 +02:00
Jash Gulabrai	0d06c654d0	feat: Update NVIDIA to GA docs; remove notebook reference until ready (#1999 ) # What does this PR do? - Update NVIDIA documentation links to GA docs - Remove reference to notebooks until merged [//]: # (If resolving an issue, uncomment and update the line below) [//]: # (Closes #[issue-number]) ## Test Plan [Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed.] [//]: # (## Documentation) Co-authored-by: Jash Gulabrai <jgulabrai@nvidia.com>	2025-04-18 19:13:18 -04:00
Jash Gulabrai	8fd656dcac	Add changes	2025-04-18 16:28:04 -04:00
Jash Gulabrai	4131e8146f	Clean up instructions and implementation; reorganize notebooks	2025-04-18 16:27:19 -04:00
Sébastien Han	94f83382eb	feat: allow building distro with external providers (#1967 ) # What does this PR do? We can now build a distribution that includes external providers. Closes: https://github.com/meta-llama/llama-stack/issues/1948 ## Test Plan Build a distro with an external provider following the doc instructions. [//]: # (## Documentation) Added. Rendered: ![Screenshot 2025-04-18 at 11 26 39](https://github.com/user-attachments/assets/afcf3d50-8d30-48c3-8d24-06a4b3662881) Signed-off-by: Sébastien Han <seb@redhat.com>	2025-04-18 17:18:28 +02:00
Yuan Tang	c4570bcb48	docs: Add tips for debugging remote vLLM provider (#1992 ) # What does this PR do? This is helpful when debugging issues with vLLM + Llama Stack after this PR https://github.com/vllm-project/vllm/pull/15593 --------- Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-04-18 14:47:47 +02:00
Yuan Tang	4c6b7005fa	fix: Fix docs lint issues (#1993 ) # What does this PR do? This was not caught as part of the CI build: `dd62a2388c`. [This PR](https://github.com/meta-llama/llama-stack/pull/1354) was too old and didn't include the additional CI builds yet. Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-04-18 02:33:13 -04:00
AN YU (安宇)	dd62a2388c	docs: add notes to websearch tool and two extra example scripts (#1354 ) # What does this PR do? - Adds a note about unexpected Brave Search output appearing even when Tavily Search is called. This behavior is expected for now and is a work in progress https://github.com/meta-llama/llama-stack/issues/1229. The note aims to clear any confusion for new users. - Adds two example scripts demonstrating how to build an agent using: 1. WebSearch tool 2. WolframAlpha tool These examples provide new users with an instant understanding of how to integrate these tools. [//]: # (If resolving an issue, uncomment and update the line below) [//]: # (Closes #[issue-number]) ## Test Plan Tested these example scripts using following steps: step 1. `ollama run llama3.2:3b-instruct-fp16 --keepalive 60m` step 2. ``` export INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" export LLAMA_STACK_PORT=8321 ``` step 3: `llama stack run --image-type conda ~/llama-stack/llama_stack/templates/ollama/run.yaml` step 4: run the example script with your api keys. expected output: ![image](https://github.com/user-attachments/assets/308ddb17-a087-4cf2-8622-b085174ea0ab) ![image](https://github.com/user-attachments/assets/639f239f-8966-433d-943c-ee6b304c0d71) [//]: # (## Documentation)	2025-04-17 20:20:52 -04:00
Sébastien Han	cb874287a4	fix: resync api spec (#1987 )	2025-04-17 11:36:04 -04:00
Jash Gulabrai	0d9d333a4e	Ensure sampling_params param is included in run_eval calls	2025-04-17 10:23:21 -04:00
Alexey Rybak	326cbba579	feat(agents): add agent naming functionality (#1922 ) # What does this PR do? Allow users to name an agent and use the name in telemetry instead of relying on randomly generated agent_ids. This improves the developer experience by making it easier to find specific agents in telemetry logs. Closes #1832 ## Test Plan - Added tests to verify the agent name is properly stored and retrieved - Ran `uv run -- pytest -v tests/integration/telemetry/test_telemetry.py::test_agent_name_filtering` from the root of the project and made sure the tests pass - Ran `uv run -- pytest -v tests/integration/telemetry/test_telemetry.py::test_agent_query_spans` to verify existing code without agent names still works correctly ## Use Example ``` agent = Agent( llama_stack_client, model=text_model_id, name="CustomerSupportAgent", # New parameter instructions="You are a helpful customer support assistant" ) session_id = agent.create_session(f"test-session-{uuid4()}") ``` ## Implementation Notes - Agent names are optional string parameters with no additional validation - Names are not required to be unique - multiple agents can have the same name - The agent_id remains the unique identifier for an agent --------- Co-authored-by: raghotham <raghotham@gmail.com>	2025-04-17 07:02:47 -07:00
Ben Browning	5b8e75b392	fix: OpenAI spec cleanup for assistant requests (#1963 ) # What does this PR do? Some of our multi-turn verification tests were failing because I had accidentally marked content as a required field in the OpenAI chat completion request assistant messages, but it's actually optional. It is required for messages from other roles, but assistant is explicitly allowed to be optional. Similarly, the assistant message tool_calls field should default to None instead of an empty list. These two changes get the openai-llama-stack verification test back to 100% passing, just like it passes 100% when not behind Llama Stack. They also increase the pass rate of some of the other providers in the verification test, but don't get them to 100%. ## Test Plan I started a Llama Stack server setup to run all the verification tests (requires OPENAI_API_KEY env variable) ``` llama stack run --image-type venv tests/verifications/openai-api-verification-run.yaml ``` Then, I manually ran the verification tests to see which were failing, fix them, and ran them again after these changes to ensure they were all passing. ``` python -m pytest -s -v tests/verifications/openai_api/test_chat_completion.py --provider=openai-llama-stack ``` Signed-off-by: Ben Browning <bbrownin@redhat.com>	2025-04-17 06:56:10 -07:00
Matthew Farrellee	4205376653	chore: add meta/llama-3.3-70b-instruct as supported nvidia inference provider model (#1985 ) see https://build.nvidia.com/meta/llama-3_3-70b-instruct	2025-04-17 06:50:40 -07:00
Jash Gulabrai	2ae1d7f4e6	docs: Add NVIDIA platform distro docs (#1971 ) # What does this PR do? Add NVIDIA platform docs that serve as a starting point for Llama Stack users and explains all supported microservices. [//]: # (If resolving an issue, uncomment and update the line below) [//]: # (Closes #[issue-number]) ## Test Plan [Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed.] [//]: # (## Documentation) --------- Co-authored-by: Jash Gulabrai <jgulabrai@nvidia.com>	2025-04-17 05:54:30 -07:00
Francisco Arceo	00b232c282	chore: Fix to persist the theme preference across page navigation. (#1974 ) # What does this PR do? This PR persists the theme preference across page navigation. Currently, if the default theme is detected, it is used. But if a user flips _the default theme_ and goes to a new page, the theme will switch back to the default. This resolves that issue. ## Test Plan [Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed.] [//]: # (## Documentation) Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-04-16 13:58:25 -07:00
Jash Gulabrai	6927cdf5ce	feat: NVIDIA beginner e2e notebook	2025-04-15 23:26:38 -04:00
Chirag Modi	fb8ff77ff2	docs: 0.2.2 doc updates (#1961 ) Add updates to android site readme for 0.2.2	2025-04-15 13:26:17 -07:00
Dmitry Rogozhkin	71ed47ea76	docs: add example for intel gpu in vllm remote (#1952 ) # What does this PR do? PR adds instructions to setup vLLM remote endpoint for vllm-remote llama stack distribution. ## Test Plan * Verified with manual tests of the configured vllm-remote against vllm endpoint running on the system with Intel GPU * Also verified with ci pytests (see cmdline below). Test passes in the same capacity as it does on the A10 Nvidia setup (some tests do fail which seems to be known issues with vllm remote llama stack distribution) ``` pytest -s -v tests/integration/inference/test_text_inference.py \ --stack-config=http://localhost:5001 \ --text-model=meta-llama/Llama-3.2-3B-Instruct ``` CC: @ashwinb Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>	2025-04-15 07:56:23 -07:00

1 2 3 4 5 ...

593 commits