Commit graph

  • 985ff4d6ce update distributions/readmes Xi Yan 2024-10-28 15:10:40 -07:00
  • 7b8748c53e [Evals API][6/n] meta-reference llm as judge, registration for ScoringFnDefs (#330) Xi Yan 2024-10-28 14:08:42 -07:00
  • 04a4784287 Update README.md Xi Yan 2024-10-28 13:25:44 -07:00
  • 3fa1eaf37d Update README.md Xi Yan 2024-10-28 13:18:55 -07:00
  • 0d4215e125 Update README.md Xi Yan 2024-10-28 13:18:34 -07:00
  • 8f5a850de9 Update README.md Xi Yan 2024-10-28 13:16:23 -07:00
  • ffb3965ade remove Field for return_type Xi Yan 2024-10-28 13:04:41 -07:00
  • b7d2b83d55 Allow passing provider_registry to resolve_impls() Ashwin Bharambe 2024-10-28 11:58:16 -07:00
  • 8a3b64d1be Bump version to 0.0.47 Ashwin Bharambe 2024-10-27 22:30:38 -07:00
  • 46bb8884a7 distributions readme typos Xi Yan 2024-10-27 11:57:21 -07:00
  • 44c05c6e7d add vision instruct models for fireworks Dalton Flanagan 2024-10-27 17:54:54 +00:00
  • 9b85d9a841 completion() for fireworks (#329) Dinesh Yeduguru 2024-10-25 16:12:10 -07:00
  • 7ec79f3b9d completion() for together (#324) Dinesh Yeduguru 2024-10-25 14:21:12 -07:00
  • 8a74e400d6 Update getting_started.md Xi Yan 2024-10-25 13:30:33 -07:00
  • f168752bba Update getting_started.md Xi Yan 2024-10-25 13:27:43 -07:00
  • abdf7cddf3 [Evals API][4/n] evals with generation meta-reference impl (#303) Xi Yan 2024-10-25 13:12:39 -07:00
  • 426d821e7f Bump version to 0.0.46 Ashwin Bharambe 2024-10-25 13:10:55 -07:00
  • c05fbf14b3 Added hadamard transform for spinquant (#326) Sachin Mehta 2024-10-25 12:58:48 -07:00
  • 07f9bf723f fix broken --list-templates with adding build.yaml files for packaging (#327) Xi Yan 2024-10-25 12:51:22 -07:00
  • afae4e3d8e Update docker build flow a little Ashwin Bharambe 2024-10-25 10:06:21 -07:00
  • 5bed6c276c Move function around Ashwin Bharambe 2024-10-25 09:17:33 -07:00
  • a387ca22e2 Update docker_base for meta-reference-gpu Ashwin Bharambe 2024-10-25 09:13:33 -07:00
  • 70d59b0f5d Make vllm inference better Ashwin Bharambe 2024-10-24 22:30:49 -07:00
  • cb43caa2c3 start_container.sh prefix llamastack->distribution name Xi Yan 2024-10-24 21:29:07 -07:00
  • df141b6ef3 Fix for get_agents_session (#300) Sarthak Deshpande 2024-10-25 07:06:27 +05:30
  • b6d8246b82 added templates and enhanced readme (#307) Justin Lee 2024-10-24 17:07:06 -07:00
  • 3e1c3fdb3f completion() for tgi (#295) Dinesh Yeduguru 2024-10-24 16:02:41 -07:00
  • cb84034567 [Evals API][3/n] scoring_functions / scoring meta-reference implementations (#296) Xi Yan 2024-10-24 14:52:30 -07:00
  • e70420a06e Update getting_started.md Xi Yan 2024-10-24 14:19:35 -07:00
  • 8615bc9e08 update manifest for build templates Xi Yan 2024-10-24 14:04:13 -07:00
  • 94728d6983 Handle both ipv6 and ipv4 interfaces together Ashwin Bharambe 2024-10-24 13:36:41 -07:00
  • 0538cc297e Bump version to 0.0.45 Ashwin Bharambe 2024-10-24 12:14:18 -07:00
  • 205bcfdd4e Fix score threshold in faiss Ashwin Bharambe 2024-10-24 12:11:58 -07:00
  • 161aef0aae Small updates to quantization config Ashwin Bharambe 2024-10-24 12:08:43 -07:00
  • 8eceebec98 Update iOS inference instructions for new quantization Dalton Flanagan 2024-10-24 14:47:27 -04:00
  • 8aa8847b4a Bump version to 0.0.44 Ashwin Bharambe 2024-10-24 08:41:39 -07:00
  • 7afe51c84d New quantized models (#301) Ashwin Bharambe 2024-10-24 08:38:56 -07:00
  • 05a8d47b98 Add a meta-reference-quantized-gpu distribution Ashwin Bharambe 2024-10-23 19:33:14 -07:00
  • f5dcc03742 use pytorch/pytorch as base Xi Yan 2024-10-23 20:22:00 -07:00
  • 0cec86453b Fix issue w/ routing_table api getting added when router api is not specified (#298) Xi Yan 2024-10-23 15:27:22 -07:00
  • 21f2e9adf5 dont set num_predict for all providers (#294) Dinesh Yeduguru 2024-10-23 11:44:04 -07:00
  • ffb561070d Support structured output for Together (#289) Ashwin Bharambe 2024-10-22 22:36:38 -07:00
  • 2e5e46d896 Added tests for persistence (#274) Sarthak Deshpande 2024-10-23 08:11:46 +05:30
  • 821810657f [Evals API][2/n] datasets / datasetio meta-reference implementation (#288) Xi Yan 2024-10-22 16:12:16 -07:00
  • 8a01b9e40c Added implementations for get_agents_session, delete_agents_session and delete_agents (#267) Sarthak Deshpande 2024-10-23 02:20:43 +05:30
  • b81a3bd46a Fix import conflict for SamplingParams (#285) Suraj Subramanian 2024-10-22 15:56:00 -04:00
  • c06718fbd5 Add support for Structured Output / Guided decoding (#281) Ashwin Bharambe 2024-10-22 12:53:34 -07:00
  • 4c3d33e6f4 feat: Qdrant Vector index support (#221) Anush 2024-10-23 01:20:19 +05:30
  • 668a495aba Add REST api example for chat_completion (#286) Suraj Subramanian 2024-10-22 13:35:20 -04:00
  • e45f121c77 [Evals API] [1/n] Initial API (#287) Xi Yan 2024-10-22 09:31:19 -07:00
  • b279d3bc58 Update README.md Xi Yan 2024-10-22 08:01:33 -07:00
  • 1d241bf3fe add completion() for ollama (#280) Dinesh Yeduguru 2024-10-21 22:26:33 -07:00
  • e2a5a2e10d first version of readthedocs (#278) raghotham 2024-10-22 10:15:58 +05:30
  • dbb5ce43fc Bump version to 0.0.43 Xi Yan 2024-10-21 19:10:01 -07:00
  • a2ff74a686 telemetry WARNING->WARN fix Xi Yan 2024-10-21 18:52:48 -07:00
  • b1451afbc8 Update README.md Xi Yan 2024-10-21 18:21:30 -07:00
  • 4d2bd2d39e add more distro templates (#279) Xi Yan 2024-10-21 18:15:08 -07:00
  • cf27d19dd5 fix sse_generator async Xi Yan 2024-10-21 14:03:32 -07:00
  • 1944405dca Update new_api_provider.md Ashwin Bharambe 2024-10-21 14:02:51 -07:00
  • 606c48309e Small updates to encourage integration testing Ashwin Bharambe 2024-10-21 13:52:10 -07:00
  • cb203b14b4 update README.md Xi Yan 2024-10-21 13:51:39 -07:00
  • 3a7884345a Update new_api_provider.md Xi Yan 2024-10-21 13:41:56 -07:00
  • 25b37c9ff7 Update new_api_provider.md Xi Yan 2024-10-21 13:41:46 -07:00
  • af75618348 remove distribution/templates Xi Yan 2024-10-21 13:23:58 -07:00
  • 23210e8679 llama stack distributions / templates / docker refactor (#266) Xi Yan 2024-10-21 11:17:53 -07:00
  • c995219731 Update event_logger.py (#275) nehal-a2z 2024-10-21 23:16:53 +05:30
  • cae5b0708b Create .readthedocs.yaml raghotham 2024-10-21 11:48:19 +05:30
  • a27a2cd2af Add vLLM inference provider for OpenAI compatible vLLM server (#178) Yuan Tang 2024-10-20 21:43:25 -04:00
  • 59c43736e8 update ollama for llama-guard3 Ashwin Bharambe 2024-10-19 17:26:18 -07:00
  • 8cfbb9d38b Improve an important error message Ashwin Bharambe 2024-10-19 17:19:54 -07:00
  • 2089427d60 Make all methods async def again; add completion() for meta-reference (#270) Ashwin Bharambe 2024-10-18 20:50:59 -07:00
  • 95a96afe34 Small rename Ashwin Bharambe 2024-10-18 14:41:38 -07:00
  • 71a905e93f Allow overridding checkpoint_dir via config Ashwin Bharambe 2024-10-18 14:28:06 -07:00
  • 33afd34e6f Add an option to not use elastic agents for meta-reference inference (#269) Ashwin Bharambe 2024-10-18 12:51:10 -07:00
  • be3c5c034d [bugfix] fix case for agent when memory bank registered without specifying provider_id (#264) Xi Yan 2024-10-17 17:28:17 -07:00
  • 9fcf5d58e0 Allow overriding MODEL_IDS for inference test Ashwin Bharambe 2024-10-17 10:03:27 -07:00
  • 02be26098a getting started Xi Yan 2024-10-16 23:56:21 -07:00
  • cf9e5b76b2 Update getting_started.md Xi Yan 2024-10-16 23:52:29 -07:00
  • 7cc47da8f2 Update getting_started.md Xi Yan 2024-10-16 23:50:31 -07:00
  • d787d1e84f config templates restructure, docs (#262) Xi Yan 2024-10-16 23:25:10 -07:00
  • a07dfffbbf initial changes (#261) Tam 2024-10-16 23:15:59 -07:00
  • 319a6b5f83 Update getting_started.md (#260) ATH 2024-10-16 21:05:36 -04:00
  • c4d5d6bb91 Docker compose scripts for remote adapters (#241) Xi Yan 2024-10-15 16:32:53 -07:00
  • 770647dede Fix broken rendering in Google Colab (#247) Matthieu FRONTON 2024-10-16 00:41:49 +02:00
  • 09b793c4d6 Fix fp8 implementation which had bit-rotten a bit Ashwin Bharambe 2024-10-15 13:57:01 -07:00
  • 80ada04f76 Remove request arg from chat completion response processing (#240) Yuan Tang 2024-10-15 16:03:17 -04:00
  • 209cd3d35e Bump version to 0.0.42 Xi Yan 2024-10-14 11:13:04 -07:00
  • a2b87ed0cb Switch to pre-commit/action (#239) Yuan Tang 2024-10-11 14:09:11 -04:00
  • 05282d1234 Enable pre-commit on main branch (#237) Yuan Tang 2024-10-11 13:03:59 -04:00
  • 2128e61da2 Fix incorrect completion() signature for Databricks provider (#236) Yuan Tang 2024-10-11 11:47:57 -04:00
  • 9fbe8852aa Add Swift Package Index badge Dalton Flanagan 2024-10-10 23:39:25 -04:00
  • ca29980c6b fix agents context retriever Xi Yan 2024-10-10 20:17:29 -07:00
  • 1ff0476002 Split off meta-reference-quantized provider Ashwin Bharambe 2024-10-10 15:54:08 -07:00
  • 7ff5800dea generate openapi Xi Yan 2024-10-10 15:30:34 -07:00
  • a3e65d58a9 Add logo Dalton Flanagan 2024-10-10 15:04:21 -04:00
  • eba9d1ea14 ci: Run pre-commit checks in CI (#176) Russell Bryant 2024-10-10 14:21:59 -04:00
  • 89d24a07f0 Bump version to 0.0.41 Ashwin Bharambe 2024-10-10 10:27:03 -07:00
  • 6bb57e72a7 Remove "routing_table" and "routing_key" concepts for the user (#201) Ashwin Bharambe 2024-10-10 10:24:13 -07:00
  • 8c3010553f Fix agents path in generate.py Dalton Flanagan 2024-10-10 11:41:03 -04:00
  • 7a8aa775e5 JSON serialization for parallel processing queue (#232) Dalton Flanagan 2024-10-09 17:24:12 -04:00