Commit graph

  • 14ff4c647c include content in the message even if you have parsed out a tool call Ashwin Bharambe 2025-04-12 11:23:25 -07:00
  • 771daa4b91 fix test, fix llama3 generator Ashwin Bharambe 2025-04-12 10:51:43 -07:00
  • a3b921a5a8 update integration test workflow Matthew Farrellee 2025-04-01 16:29:15 -04:00
  • 854c2ad264 fix: misleading help text for 'llama stack build' and 'llama stack run' (#1910) Nathan Weinberg 2025-04-12 04:19:11 -04:00
  • 0751a960a5 feat: make training config fields optional (#1861) Charlie Doern 2025-04-12 04:13:45 -04:00
  • 70a7e4d51e fix: unhide python_start, python_end Ashwin Bharambe 2025-04-11 20:30:44 -07:00
  • 172a918fe3 Merge branch 'main' into feat/litellm_sambanova_usage jhpiedrahitao 2025-04-11 19:28:02 -05:00
  • 07488dbfb6 Update README.md Yuan Tang 2025-04-11 20:19:34 -04:00
  • a3cee70014 kill experimental attr on webmethod Ashwin Bharambe 2025-04-11 17:13:46 -07:00
  • 51492bd9b6 docs: Update docs and fix warning in start-stack.sh (#1937) Aidan Reilly 2025-04-12 00:26:17 +01:00
  • 1d855461d5 kill batch inference registry Ashwin Bharambe 2025-04-11 16:21:21 -07:00
  • 73d927850e updates Ashwin Bharambe 2025-04-11 16:15:59 -07:00
  • 0cfb2e2473 feat: add batch inference API to llama stack inference Ashwin Bharambe 2025-04-08 13:50:52 -07:00
  • 43993cc29c Merge branch 'main' into nvidia-eval-integration Jash Gulabrai 2025-04-11 17:28:26 -04:00
  • ed58a94b30 docs: fixes to quick start (#1943) raghotham 2025-04-11 13:41:23 -07:00
  • 2b2db5fbda feat: OpenAI-Compatible models, completions, chat/completions (#1894) Ben Browning 2025-04-11 16:14:17 -04:00
  • b5438c9c82 Update docs/source/getting_started/index.md raghotham 2025-04-11 13:07:58 -07:00
  • 8253d44c5c docs fixes Raghotham Murthy 2025-04-11 12:48:39 -07:00
  • 24d70cedca docs: Updated docs to show minimal RAG example and some other minor changes (#1935) Francisco Arceo 2025-04-11 12:50:36 -06:00
  • c1cb6aad11 feat: Add unit tests for NVIDIA safety (#1897) Jash Gulabrai 2025-04-11 14:49:55 -04:00
  • 2a74f0db39 fix: remove extra sft args in NvidiaPostTrainingAdapter (#1939) Ben Browning 2025-04-11 13:17:57 -04:00
  • 40f41af2f7 feat: Add a direct (non-agentic) RAG option to the Playground RAG page (#1940) Ilya Kolchinsky 2025-04-11 19:16:10 +02:00
  • c6fa47db6f fix: ensure resource registration arguments are typed (#1941) Matthew Farrellee 2025-04-11 12:25:57 -04:00
  • 3465565df1 fix: ensure resource registration arguments are typed Matthew Farrellee 2025-04-11 11:24:36 -04:00
  • 6cf036f52e Add a direct (non-agentic) RAG option to the Playground UI. ilya-kolchinsky 2025-04-11 16:14:41 +02:00
  • c2d23ddd75 fix: remove extra sft args in NvidiaPostTrainingAdapter Ben Browning 2025-04-11 09:46:16 -04:00
  • dae8fd0a36 update start-stack.sh with missing color and if statment logic Aidan Reilly 2025-04-11 10:30:12 +01:00
  • 1322bb9bf7 Merge branch 'refs/heads/main' into rag-demo ilya-kolchinsky 2025-04-11 13:55:38 +02:00
  • 6aa459b00c docs: fix errors in kubernetes deployment guide (#1914) Mark Campbell 2025-04-11 12:04:13 +01:00
  • d40d3a9b31 docs: Move Llama 4 instructions in a collapsed section Yuan Tang 2025-04-10 22:32:31 -04:00
  • be112fad4f changed copy Francisco Javier Arceo 2025-04-10 22:15:00 -04:00
  • 59861a4ea5 docs: Updated docs to show minimal RAG example and some other minor changes Francisco Javier Arceo 2025-04-10 22:08:05 -04:00
  • 0e5574cf9d feat: allow ollama to use 'latest' if available but not specified Nathan Weinberg 2025-04-08 15:56:19 -04:00
  • d402623f96 fix: misleading help text for 'llama stack build' and 'llama stack run' Nathan Weinberg 2025-04-09 10:17:05 -04:00
  • 2fcb70b789 test(verification): overwrite test result instead of creating new ones (#1934) ehhuang 2025-04-10 16:59:28 -07:00
  • a4cc4b7e31 test(verification): add streaming tool calling test (#1933) ehhuang 2025-04-10 16:58:06 -07:00
  • 2f67f67b43 text(verification): overwrite test result instead of creating new ones Eric Huang 2025-04-10 16:52:52 -07:00
  • 373e392b10 test(verification): add streaming tool calling test Eric Huang 2025-04-10 16:23:11 -07:00
  • 913e9679c2 docs: update tmp directory creation Bobbins228 2025-04-09 17:05:21 +01:00
  • 1a065c7d63 docs: alter hf token substitution Bobbins228 2025-04-09 16:58:35 +01:00
  • 6735344604 docs: fix errors in kubernetes deployment guide Bobbins228 2025-04-09 16:33:36 +01:00
  • 49955a06b1 docs: Update quickstart page to structure things a little more for the novices (#1873) Francisco Arceo 2025-04-10 15:09:00 -06:00
  • edd9aaac3b fix: use torchao 0.8.0 for inference (#1925) Sébastien Han 2025-04-10 22:39:20 +02:00
  • 79fc81f78f fix: Playground RAG page errors (#1928) Ilya Kolchinsky 2025-04-10 22:38:31 +02:00
  • d7c976c6d2 Merge branch 'main' into docs-4 Francisco Arceo 2025-04-10 14:15:54 -06:00
  • 82b485b177 Added unit tests for the query() method. ilya-kolchinsky 2025-04-10 21:40:32 +02:00
  • 31181c070b Fireworks provider support for OpenAI API endpoints Ben Browning 2025-04-10 15:29:32 -04:00
  • 9120e07d9d Add support for RamaLama Daniel J Walsh 2025-02-11 13:47:13 -05:00
  • ffae192540 Bug fixes for together.ai OpenAI endpoints Ben Browning 2025-04-10 14:19:48 -04:00
  • a5827f7cb3 Nvidia provider support for OpenAI API endpoints Ben Browning 2025-04-10 13:43:28 -04:00
  • de6ec5803e fix: Fix linter failures from #1921 (#1932) Francisco Arceo 2025-04-10 11:37:31 -06:00
  • afa1082813 fix: Fix linter failure Francisco Javier Arceo 2025-04-10 13:34:22 -04:00
  • 178a5c3b93 moved the test from test_telemetry to test_agents reluctantfuturist 2025-04-10 10:30:10 -07:00
  • 14146e4b3f feat(verification): various improvements (#1921) ehhuang 2025-04-10 10:26:19 -07:00
  • 09a83b1ec1 docs: Updating background color for code in darkmode (#1930) Francisco Arceo 2025-04-10 10:38:57 -06:00
  • 1f2df59ece docs: fix model name (#1926) Sébastien Han 2025-04-10 18:37:48 +02:00
  • 13c660f5a5 Merge branch 'meta-llama:main' into feat/litellm_sambanova_usage Jorge Piedrahita Ortiz 2025-04-10 11:01:51 -05:00
  • 1a76c55df4 fix: Use NAMESPACE global variable Jash Gulabrai 2025-04-10 11:31:56 -04:00
  • 84e85e824a Add high-level instructions Jash Gulabrai 2025-04-10 11:14:17 -04:00
  • 7faec2380a Clear notebook output Jash Gulabrai 2025-04-10 10:58:11 -04:00
  • a671b33589 Add back Guardrails section Jash Gulabrai 2025-04-10 10:57:25 -04:00
  • 76aa2782a8 fix: Fix URL path in POST request helper Jash Gulabrai 2025-04-10 10:29:03 -04:00
  • ed1b24f59a doc: Updating background color for code in darkmode Francisco Javier Arceo 2025-04-10 09:20:34 -04:00
  • d8ccc32d67 1) Recreate the agent upon a change in the settings. 2) When mid-session, disable the widgets triggering the change in the settings. ilya-kolchinsky 2025-04-10 13:31:32 +02:00
  • ec9e4116d5 docs: fix model name Sébastien Han 2025-04-10 12:10:57 +02:00
  • 609a8d63d9 fix: use torchao 0.8.0 for inference Sébastien Han 2025-04-10 10:35:13 +02:00
  • 1be66d754e docs: Redirect instructions for additional hardware accelerators for remote vLLM provider (#1923) Yuan Tang 2025-04-10 04:04:17 -04:00
  • be527ba711 move models, model display name, case, reorg config Eric Huang 2025-04-09 22:56:01 -07:00
  • 33117e3012 Updated CoreModelId to get from sku_types Sajikumar JS 2025-04-10 10:17:43 +05:30
  • 47d919333a Merge branch 'main' into add-watsonx-inference-adapter Sajikumar JS 2025-04-10 10:17:08 +05:30
  • 5ffe6cee36 small copy change Francisco Javier Arceo 2025-04-09 23:17:29 -04:00
  • 5bdd767e8d added tabs for the tutorial output and rephrased thing based on feedback Francisco Javier Arceo 2025-04-09 23:16:52 -04:00
  • 57813f5606 Updates to notebook; use direct requests to NeMo where needed Jash Gulabrai 2025-04-09 23:03:34 -04:00
  • 098a09cfa3 docs: Redirect instructions for additional hardware accelerators for remote vLLM Yuan Tang 2025-04-09 21:18:28 -04:00
  • 6a5b73ca7c feat(agents): add agent naming functionality reluctantfuturist 2025-04-09 16:22:00 -07:00
  • 712c6758c6 docs: Avoid bash script syntax highlighting for dark mode (#1918) Yuan Tang 2025-04-09 18:43:43 -04:00
  • 36a31fe5dd fix: on-the-fly int4 quantize parameter (#1920) Jiawen Liu 2025-04-09 15:00:12 -07:00
  • 5ecedc12e7 clean up jiawenliu64 2025-04-09 14:55:30 -07:00
  • 8f5cd49159 vllm prompt_logprobs can also be 0 Ben Browning 2025-04-09 17:32:03 -04:00
  • 0961987962 adding server example back in and restructuring steps Francisco Javier Arceo 2025-04-09 17:01:52 -04:00
  • c8a0b110c0 fix: on-the-fly int4 quantize parameter jiawenliu64 2025-04-09 13:35:11 -07:00
  • 8d10556ce3 Add basic tests for OpenAI Chat Completions API Ben Browning 2025-04-09 16:18:13 -04:00
  • 7840a53a12 fix: Fix paths in Eval helper functions; update ModelCandidate to support Evals that use chat datasets Jash Gulabrai 2025-04-09 15:48:46 -04:00
  • ac5dc8fae2 Add prompt_logprobs and guided_choice to OpenAI completions Ben Browning 2025-04-09 15:43:53 -04:00
  • ef684ff178 Fix openai_completion tests for ollama Ben Browning 2025-04-09 15:22:52 -04:00
  • 52b4766949 Start some integration tests with an OpenAI client Ben Browning 2025-04-09 13:55:34 -04:00
  • a1e9cff37c Update spec with latest changes as well Ben Browning 2025-04-09 10:08:10 -04:00
  • fcdeb3d7bf OpenAI completion prompt can also include tokens Ben Browning 2025-04-09 10:05:50 -04:00
  • a6cf8fa12b OpenAI completion prompt can also be an array Ben Browning 2025-04-09 09:28:50 -04:00
  • 24cfa1ef1a Mark inline vllm as OpenAI unsupported inference Ben Browning 2025-04-09 08:36:01 -04:00
  • de01b1455b Passthrough inference support for OpenAI-compatible APIs Ben Browning 2025-04-09 08:35:36 -04:00
  • 15d37fde19 Add unsupported OpenAI mixin to all remaining inference providers Ben Browning 2025-04-08 12:50:23 -04:00
  • 00c4493bda OpenAI-compatible completions and chats for litellm and together Ben Browning 2025-04-08 12:35:16 -04:00
  • 1dbdff1496 ollama OpenAI-compatible completions and chat completions Ben Browning 2025-04-08 09:29:49 -04:00
  • 5bc5fed6df Clean up some more usage of direct OpenAI types Ben Browning 2025-04-08 09:10:52 -04:00
  • 92fdf6d0c9 Use our own pydantic models for OpenAI Server APIs Ben Browning 2025-04-08 09:01:35 -04:00
  • a193c9fc3f Add OpenAI-Compatible models, completions, chat/completions endpoints Ben Browning 2025-04-07 21:27:06 -04:00
  • 662483f360 moved the existing quickstart page to detailed tutorial and made an even shorter quickstart to highlight value in as few lines of code as possible Francisco Javier Arceo 2025-04-09 14:52:28 -04:00
  • e2299291c4 fix: Mirror llama4 rope scaling fixes, small model simplify (#1917) Ashwin Bharambe 2025-04-09 11:28:45 -07:00
  • 770b38f8b5 chore: simplify running the demo UI (#1907) Sébastien Han 2025-04-09 20:22:29 +02:00