llama-stack

phoenix-oss/llama-stack

forked from phoenix-oss/llama-stack-mirror

Commit graph

Author

SHA1

Message

Date

Xi Yan

5712566061

Remove request wrapper migration (#64)

* [1/n] migrate inference/chat_completion

* migrate inference/completion

* inference/completion

* inference regenerate openapi spec

* safety api

* migrate agentic system

* migrate apis without implementations

* re-generate openapi spec

* remove hack from openapi generator

* fix inference

* fix inference

* openapi generator rerun

* Simplified Telemetry API and tying it to logger (#57)

* Simplified Telemetry API and tying it to logger

* small update which adds a METRIC type

* move span events one level down into structured log events

---------

Co-authored-by: Ashwin Bharambe <ashwin@meta.com>

* fix api to work with openapi generator

* fix agentic calling inference

* together adapter inference

* update inference adapters

---------

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
Co-authored-by: Ashwin Bharambe <ashwin@meta.com>

2024-09-12 15:03:49 -07:00

Celina Hanouti

736092f6bc

[Inference] Use huggingface_hub inference client for TGI adapter (#53)

* Use huggingface_hub inference client for TGI inference

* Update the default value for TGI URL

* Use InferenceClient.text_generation for TGI inference

* Fixes post-review and split TGI adapter into local and Inference Endpoints ones

* Update CLI reference and add typing

* Rename TGI Adapter class

* Use HfApi to get the namespace when not provided in the hf endpoint name

* Remove unnecessary method argument

* Improve TGI adapter initialization condition

* Move helper into impl file + fix merging conflicts

2024-09-12 09:11:35 -07:00

Ashwin Bharambe

21bedc1596

[inference] Add a TGI adapter (#52)

* TGI adapter and some refactoring of other inference adapters

* Use the lower-level `generate_stream()` method for correct tool calling

---------

Co-authored-by: Ashwin Bharambe <ashwin@meta.com>

2024-09-04 22:49:33 -07:00

3 commits