Ashwin Bharambe
|
ed899a5dec
|
Convert TGI to work with openai_compat
|
2024-10-08 17:23:42 -07:00 |
|
Ashwin Bharambe
|
05e73d12b3
|
introduce openai_compat with the completions (not chat-completions) API
This keeps the prompt encoding layer in our control (see
`chat_completion_request_to_prompt()` method)
|
2024-10-08 17:23:42 -07:00 |
|
Ashwin Bharambe
|
f8752ab8dc
|
weaviate fixes, test now passes
|
2024-10-08 17:23:02 -07:00 |
|
Ashwin Bharambe
|
4ab6e1b81a
|
Add really basic testing for memory API
weaviate does not work; the cluster URL seems malformed
|
2024-10-08 17:23:02 -07:00 |
|
Ashwin Bharambe
|
dba7caf1d0
|
Fix fireworks and update the test
Don't look for eom_id / eot_id sadly since providers don't return the
last token
|
2024-10-08 17:23:02 -07:00 |
|
Ashwin Bharambe
|
bbd3a02615
|
Make Together inference work using the raw completions API
|
2024-10-08 17:23:02 -07:00 |
|
Ashwin Bharambe
|
3ae2b712e8
|
Add inference test
Run it as:
```
PROVIDER_ID=test-remote \
PROVIDER_CONFIG=$PWD/llama_stack/providers/tests/inference/provider_config_example.yaml \
pytest -s llama_stack/providers/tests/inference/test_inference.py \
--tb=auto \
--disable-warnings
```
|
2024-10-08 17:23:02 -07:00 |
|