llama-stack

Ashwin Bharambe 09b793c4d6 Fix fp8 implementation which had bit-rotten a bit		2024-10-15 13:57:01 -07:00

I only tested with "on-the-fly" bf16 -> fp8 conversion, not the "load from fp8" codepath.

YAML I tested with:

```
providers:
  - provider_id: quantized
    provider_type: meta-reference-quantized
    config:
      model: Llama3.1-8B-Instruct
      quantization:
        type: fp8
```

..
agents	fix agents context retriever	2024-10-10 20:17:29 -07:00
codeshield	Remove "routing_table" and "routing_key" concepts for the user (#201)	2024-10-10 10:24:13 -07:00
inference	Fix fp8 implementation which had bit-rotten a bit	2024-10-15 13:57:01 -07:00
memory	Remove "routing_table" and "routing_key" concepts for the user (#201)	2024-10-10 10:24:13 -07:00
safety	Fix incorrect completion() signature for Databricks provider (#236)	2024-10-11 08:47:57 -07:00
telemetry	API Updates (#73)	2024-09-17 19:51:35 -07:00
__init__.py	API Updates (#73)	2024-09-17 19:51:35 -07:00