llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-06-28 19:04:19 +00:00


Celina Hanouti 736092f6bc [Inference] Use huggingface_hub inference client for TGI adapter (#53)	2024-09-12 09:11:35 -07:00

* Use huggingface_hub inference client for TGI inference
* Update the default value for TGI URL
* Use InferenceClient.text_generation for TGI inference
* Fixes post-review and split TGI adapter into local and Inference Endpoints ones
* Update CLI reference and add typing
* Rename TGI Adapter class
* Use HfApi to get the namespace when not provided in the hf endpoint name
* Remove unnecessary method argument
* Improve TGI adapter initialization condition
* Move helper into impl file + fix merging conflicts

cli_reference.md	[Inference] Use huggingface_hub inference client for TGI adapter (#53)	2024-09-12 09:11:35 -07:00
license_header.txt	Initial commit	2024-07-23 08:32:33 -07:00
