* Use the huggingface_hub inference client for TGI inference
* Update the default value for the TGI URL
* Use InferenceClient.text_generation for TGI inference (see the sketch after this list)
* Fixes post-review; split the TGI adapter into local and Inference Endpoints variants
* Update the CLI reference and add typing
* Rename the TGI adapter class
* Use HfApi to get the namespace when it is not provided in the Hugging Face endpoint name
* Remove an unnecessary method argument
* Improve the TGI adapter initialization condition
* Move a helper into the impl file and fix merge conflicts
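The two TGI-related changes follow standard huggingface_hub usage. Below is a minimal sketch, assuming a TGI server running locally at http://localhost:8080 and a logged-in Hugging Face user; the URL, prompt, parameter values, and endpoint name are illustrative assumptions, not values taken from this repository.

```python
# Minimal sketch of the two patterns named in the commit message; the
# server URL, prompt, and endpoint name are illustrative assumptions.
from huggingface_hub import HfApi, InferenceClient

# Point InferenceClient at a running TGI server instead of a hosted model.
client = InferenceClient(model="http://localhost:8080")

# text_generation posts the prompt to the TGI server and returns the
# generated text (pass stream=True to iterate over tokens as they arrive).
completion = client.text_generation(
    "The capital of France is",
    max_new_tokens=32,
)
print(completion)

# When an Inference Endpoints name omits its namespace, the authenticated
# user's name (via HfApi.whoami) can serve as the default namespace.
namespace = HfApi().whoami()["name"]
endpoint_name = f"{namespace}/my-endpoint"  # "my-endpoint" is hypothetical
```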
agentic_system
batch_inference
cli
common
core
dataset/api
evaluations/api
inference
memory
models/api
post_training/api
reward_scoring/api
safety
synthetic_data_generation/api
telemetry
tools
__init__.py
stack.py