[Inference] Use huggingface_hub inference client for TGI adapter (#53)

* Use huggingface_hub inference client for TGI inference

* Update the default value for TGI URL

* Use InferenceClient.text_generation for TGI inference (see the sketch after this list)

* Post-review fixes; split the TGI adapter into local and Inference Endpoints variants

* Update CLI reference and add typing

* Rename TGI Adapter class

* Use HfApi to get the namespace when not provided in the HF endpoint name (sketched after the diff below)

* Remove unnecessary method argument

* Improve TGI adapter initialization condition

* Move helper into impl file + fix merge conflicts
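
The switch described above can be sketched with huggingface_hub's InferenceClient, which accepts a TGI server URL directly in place of a model id. This is a minimal illustration, not the adapter's actual code; the URL and generation parameters are placeholders:

# Minimal sketch: querying a local TGI server through huggingface_hub
# rather than the text-generation client. The URL and generation
# parameters below are illustrative placeholders.
from huggingface_hub import InferenceClient

client = InferenceClient(model="http://localhost:8080")

# InferenceClient.text_generation maps onto TGI's generate API.
completion = client.text_generation(
    "Write a haiku about llamas.",
    max_new_tokens=64,
    temperature=0.7,
)
print(completion)
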
Celina Hanouti 2024-09-12 18:11:35 +02:00 committed by GitHub
parent 191cd28831
commit 736092f6bc
6 changed files with 171 additions and 72 deletions


@@ -39,8 +39,9 @@ def available_providers() -> List[ProviderSpec]:
         api=Api.inference,
         adapter=AdapterSpec(
             adapter_id="tgi",
-            pip_packages=["text-generation"],
+            pip_packages=["huggingface_hub"],
             module="llama_toolchain.inference.adapters.tgi",
+            config_class="llama_toolchain.inference.adapters.tgi.TGIImplConfig",
         ),
     ),
     remote_provider_spec(
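
A hypothetical sketch of the Inference Endpoints path described in the commit message: resolve the endpoint URL, falling back to the token owner's namespace via HfApi.whoami() when the endpoint name has no "namespace/" prefix. resolve_endpoint_url is an illustrative helper name, and the endpoint-name format is an assumption, not the commit's exact code:

from huggingface_hub import HfApi

def resolve_endpoint_url(endpoint_name: str, token: str) -> str:
    # Hypothetical helper: accept either "namespace/endpoint" or a
    # bare endpoint name; for the latter, use the namespace of the
    # authenticated user (HfApi.whoami).
    api = HfApi(token=token)
    if "/" in endpoint_name:
        namespace, name = endpoint_name.split("/", maxsplit=1)
    else:
        namespace, name = api.whoami()["name"], endpoint_name
    # get_inference_endpoint returns an InferenceEndpoint whose .url
    # is populated once the endpoint is deployed and running.
    return api.get_inference_endpoint(name, namespace=namespace).url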