* Fix the non-streaming API in the inference server
* Add a unit test for inline inference
* Add a non-streaming Ollama inference implementation
* Add streaming support for Ollama inference, with tests
* Address review comments

Co-authored-by: Hardik Shah <hjshah@fb.com>
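The streaming work here builds on the ollama Python client pinned in the requirements below. As a minimal sketch of what streaming inference against a local Ollama server looks like with that client (the model name and prompt are illustrative assumptions, not details taken from this commit):

    # Minimal sketch: streaming chat via the `ollama` client package.
    # Assumes a local Ollama server is running and the model "llama3"
    # has already been pulled; both are illustrative assumptions.
    import ollama

    stream = ollama.chat(
        model="llama3",
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
        stream=True,  # yield partial responses as they are generated
    )
    for chunk in stream:
        # each chunk carries an incremental slice of the assistant message
        print(chunk["message"]["content"], end="", flush=True)
    print()

This mirrors the non-streaming/streaming split the commit describes: with the default stream=False, the same call returns one complete response instead of a generator of chunks.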
accelerate
black==24.4.2
blobfile
codeshield
fairscale
fastapi
fire
flake8
httpx
huggingface-hub
hydra-core
hydra-zen
json-strong-typing
llama-models
matplotlib
ollama
omegaconf
pandas
Pillow
pre-commit
pydantic==1.10.13
pydantic_core==2.18.2
python-dotenv
python-openapi
requests
tiktoken
torch
transformers
ufmt==2.7.0
usort==1.0.8
uvicorn
zmq
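To install this dependency set, assuming the list above is saved as requirements.txt at the repository root, the usual pip invocation applies:

    pip install -r requirements.txt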