llama-stack-mirror/llama_stack/providers/inline
Latest commit: ade075152e by Ashwin Bharambe
chore: kill inline::vllm (#2824)
Inline _inference_ providers haven't proved to be very useful -- they are
rarely used. And for good reason: it is almost never a good idea to bundle a
complex (distributed) inference engine into a distributed, stateful front-end
server that already serves many other things. Responsibility should be split
properly.
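
As an illustration of that split (a minimal sketch, not part of this commit): the inference engine runs as its own server exposing an OpenAI-compatible API, and the front-end connects to it over HTTP -- the remote inference providers under `providers/remote` follow this pattern. The model name, port, and use of the `openai` client below are assumptions for illustration only.

```python
# Sketch of the "split responsibility" setup: vLLM runs as a separate server
# (e.g. `vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000`) and exposes
# an OpenAI-compatible API; the stateful front-end simply calls it remotely.
from openai import OpenAI

# Point a standard OpenAI-compatible client at the standalone vLLM server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # assumed model name for illustration
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)
```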

See Discord discussion: 1395849853
2025-07-18 15:52:18 -07:00
| Name | Last commit | Date |
|---|---|---|
| agents | chore(api): add mypy coverage to meta_reference_safety (#2661) | 2025-07-09 10:22:34 +02:00 |
| datasetio | chore(refact): move paginate_records fn outside of datasetio (#2137) | 2025-05-12 10:56:14 -07:00 |
| eval | chore: remove nested imports (#2515) | 2025-06-26 08:01:05 +05:30 |
| files/localfs | fix: add shutdown function for localfs provider (#2781) | 2025-07-16 08:24:57 -07:00 |
| inference | chore: kill inline::vllm (#2824) | 2025-07-18 15:52:18 -07:00 |
| ios/inference | chore: removed executorch submodule (#1265) | 2025-02-25 21:57:21 -08:00 |
| post_training | chore: add mypy post training (#2675) | 2025-07-09 15:44:39 +02:00 |
| safety | ci: test safety with starter (#2628) | 2025-07-09 16:53:50 +02:00 |
| scoring | fix: allow default empty vars for conditionals (#2570) | 2025-07-01 14:42:05 +02:00 |
| telemetry | feat: improve telemetry (#2590) | 2025-07-04 17:29:09 +02:00 |
| tool_runtime | feat: Add ChunkMetadata to Chunk (#2497) | 2025-06-25 15:55:23 -04:00 |
| vector_io | fix: SQLiteVecIndex.create(..., bank_id="test_bank.123") - bank_id with a dot - leads to sqlite3.OperationalError (#2770) (#2771) | 2025-07-16 08:25:44 -07:00 |
| __init__.py | impls -> inline, adapters -> remote (#381) | 2024-11-06 14:54:05 -08:00 |