Ishaan Jaff | 1b2ed0c344
[Bug fix]: Triton /infer handler incompatible with batch responses (#7337)
* migrate triton to base llm http handler
* clean up triton handler.py
* use transform functions for triton
* add TritonConfig
* get openai params for triton
* use triton embedding config
* test_completion_triton_generate_api
* test_completion_triton_infer_api
* fix TritonConfig doc string
* use TritonResponseIterator
* fix triton embeddings
* docs triton chat usage (see the usage sketch after this entry)
2024-12-20 20:59:40 -08:00
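The "docs triton chat usage" and embedding items in this commit refer to LiteLLM's Triton Inference Server integration. Below is a minimal, hedged sketch of what such calls typically look like; the model names, ports, and endpoint paths are illustrative assumptions, not values taken from the commit.

```python
# Sketch of LiteLLM Triton usage. Model IDs, host, port, and endpoint paths
# below are assumptions for illustration, not values from PR #7337.
import litellm

# Chat completion routed through a Triton /generate endpoint.
response = litellm.completion(
    model="triton/llama-3-8b-instruct",          # assumed model name
    messages=[{"role": "user", "content": "Hello, Triton!"}],
    api_base="http://localhost:8000/generate",   # assumed Triton endpoint
    max_tokens=32,
)
print(response.choices[0].message.content)

# Embedding request against an assumed Triton embeddings endpoint.
emb = litellm.embedding(
    model="triton/my-embedding-model",
    input=["good morning from litellm"],
    api_base="http://localhost:8000/triton/embeddings",
)
print(emb)
```

The sketch assumes a locally running Triton Inference Server; in practice the `api_base` must point at whatever generate/infer/embeddings route the server actually exposes.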