llama-stack/llama_toolchain/inference/adapters/tgi
Celina Hanouti 736092f6bc
[Inference] Use huggingface_hub inference client for TGI adapter (#53)
* Use huggingface_hub inference client for TGI inference

* Update the default value for TGI URL

* Use InferenceClient.text_generation for TGI inference (see the sketch after this list)

* Apply post-review fixes and split the TGI adapter into separate local and Inference Endpoints adapters

* Update CLI reference and add typing

* Rename TGI Adapter class

* Use HfApi to get the namespace when it is not provided in the HF endpoint name (see the second sketch below)

* Remove unnecessary method argument

* Improve TGI adapter initialization condition

* Move helper into the impl file and fix merge conflicts
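
A minimal sketch of the InferenceClient pattern these changes describe, calling a TGI server through huggingface_hub instead of raw HTTP. The URL and generation parameters are illustrative placeholders, not the adapter's actual defaults:

    # Sketch: querying a TGI server via huggingface_hub's InferenceClient.
    from huggingface_hub import InferenceClient

    # InferenceClient accepts a URL to a deployed TGI server in place of
    # a Hub model id; "http://localhost:8080" is a placeholder.
    client = InferenceClient(model="http://localhost:8080")

    # text_generation() maps onto TGI's generate endpoint; with
    # stream=False and no details requested, it returns the generated text.
    response = client.text_generation(
        "Tell me a joke.",
        max_new_tokens=256,
        temperature=0.7,
        stream=False,
    )
    print(response)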
2024-09-12 09:11:35 -07:00
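
For the namespace fallback mentioned in the bullet list, a sketch assuming endpoint names of the form "namespace/endpoint". The helper name _get_namespace is hypothetical; the whoami() fallback follows the behavior the commit message describes:

    # Sketch: resolving the namespace for a Hugging Face Inference Endpoint.
    from huggingface_hub import HfApi

    def _get_namespace(endpoint_name: str, token: str) -> str:
        # An endpoint name may be qualified ("namespace/endpoint") or bare.
        if "/" in endpoint_name:
            return endpoint_name.split("/")[0]
        # When no namespace is given, fall back to the authenticated
        # user's name via HfApi.whoami().
        return HfApi(token=token).whoami()["name"]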
__init__.py [Inference] Use huggingface_hub inference client for TGI adapter (#53) 2024-09-12 09:11:35 -07:00
config.py [Inference] Use huggingface_hub inference client for TGI adapter (#53) 2024-09-12 09:11:35 -07:00
tgi.py [Inference] Use huggingface_hub inference client for TGI adapter (#53) 2024-09-12 09:11:35 -07:00