Commit graph

10 commits

Author SHA1 Message Date
Ashwin Bharambe
b55034c0de Another round of simplification and clarity for models/shields/memory_banks stuff 2024-10-09 19:19:26 -07:00
Ashwin Bharambe
f40cd62306 Test fixes 2024-10-08 17:23:42 -07:00
Ashwin Bharambe
640c5c54f7 rename augment_messages 2024-10-08 17:23:42 -07:00
Ashwin Bharambe
ed899a5dec Convert TGI to work with openai_compat 2024-10-08 17:23:42 -07:00
Ashwin Bharambe
05e73d12b3 introduce openai_compat with the completions (not chat-completions) API
This keeps the prompt encoding layer in our control (see `chat_completion_request_to_prompt()` method)
2024-10-08 17:23:42 -07:00
Ashwin Bharambe
f8752ab8dc weaviate fixes, test now passes 2024-10-08 17:23:02 -07:00
Ashwin Bharambe
4ab6e1b81a Add really basic testing for memory API
weaviate does not work; the cluster URL seems malformed
2024-10-08 17:23:02 -07:00
Ashwin Bharambe
dba7caf1d0 Fix fireworks and update the test
Don't look for eom_id / eot_id, sadly, since providers don't return the last token
2024-10-08 17:23:02 -07:00
Ashwin Bharambe
bbd3a02615 Make Together inference work using the raw completions API 2024-10-08 17:23:02 -07:00
Ashwin Bharambe
3ae2b712e8 Add inference test
Run it as:

```
PROVIDER_ID=test-remote \
 PROVIDER_CONFIG=$PWD/llama_stack/providers/tests/inference/provider_config_example.yaml \
 pytest -s llama_stack/providers/tests/inference/test_inference.py \
 --tb=auto \
 --disable-warnings
```
2024-10-08 17:23:02 -07:00