Xi Yan
3c72c034e6
[remove import *] clean up import *'s ( #689 )
...
# What does this PR do?
- as title, cleaning up `import *`'s
- upgrade tests to make them more robust to bad model outputs
- remove import *'s in llama_stack/apis/* (skip __init__ modules)
<img width="465" alt="image"
src="https://github.com/user-attachments/assets/d8339c13-3b40-4ba5-9c53-0d2329726ee2 "
/>
- run `sh run_openapi_generator.sh`, no types gets affected
## Test Plan
### Providers Tests
**agents**
```
pytest -v -s llama_stack/providers/tests/agents/test_agents.py -m "together" --safety-shield meta-llama/Llama-Guard-3-8B --inference-model meta-llama/Llama-3.1-405B-Instruct-FP8
```
**inference**
```bash
# meta-reference
torchrun $CONDA_PREFIX/bin/pytest -v -s -k "meta_reference" --inference-model="meta-llama/Llama-3.1-8B-Instruct" ./llama_stack/providers/tests/inference/test_text_inference.py
torchrun $CONDA_PREFIX/bin/pytest -v -s -k "meta_reference" --inference-model="meta-llama/Llama-3.2-11B-Vision-Instruct" ./llama_stack/providers/tests/inference/test_vision_inference.py
# together
pytest -v -s -k "together" --inference-model="meta-llama/Llama-3.1-8B-Instruct" ./llama_stack/providers/tests/inference/test_text_inference.py
pytest -v -s -k "together" --inference-model="meta-llama/Llama-3.2-11B-Vision-Instruct" ./llama_stack/providers/tests/inference/test_vision_inference.py
pytest ./llama_stack/providers/tests/inference/test_prompt_adapter.py
```
**safety**
```
pytest -v -s llama_stack/providers/tests/safety/test_safety.py -m together --safety-shield meta-llama/Llama-Guard-3-8B
```
**memory**
```
pytest -v -s llama_stack/providers/tests/memory/test_memory.py -m "sentence_transformers" --env EMBEDDING_DIMENSION=384
```
**scoring**
```
pytest -v -s -m llm_as_judge_scoring_together_inference llama_stack/providers/tests/scoring/test_scoring.py --judge-model meta-llama/Llama-3.2-3B-Instruct
pytest -v -s -m basic_scoring_together_inference llama_stack/providers/tests/scoring/test_scoring.py
pytest -v -s -m braintrust_scoring_together_inference llama_stack/providers/tests/scoring/test_scoring.py
```
**datasetio**
```
pytest -v -s -m localfs llama_stack/providers/tests/datasetio/test_datasetio.py
pytest -v -s -m huggingface llama_stack/providers/tests/datasetio/test_datasetio.py
```
**eval**
```
pytest -v -s -m meta_reference_eval_together_inference llama_stack/providers/tests/eval/test_eval.py
pytest -v -s -m meta_reference_eval_together_inference_huggingface_datasetio llama_stack/providers/tests/eval/test_eval.py
```
### Client-SDK Tests
```
LLAMA_STACK_BASE_URL=http://localhost:5000 pytest -v ./tests/client-sdk
```
### llama-stack-apps
```
PORT=5000
LOCALHOST=localhost
python -m examples.agents.hello $LOCALHOST $PORT
python -m examples.agents.inflation $LOCALHOST $PORT
python -m examples.agents.podcast_transcript $LOCALHOST $PORT
python -m examples.agents.rag_as_attachments $LOCALHOST $PORT
python -m examples.agents.rag_with_memory_bank $LOCALHOST $PORT
python -m examples.safety.llama_guard_demo_mm $LOCALHOST $PORT
python -m examples.agents.e2e_loop_with_custom_tools $LOCALHOST $PORT
# Vision model
python -m examples.interior_design_assistant.app
python -m examples.agent_store.app $LOCALHOST $PORT
```
### CLI
```
which llama
llama model prompt-format -m Llama3.2-11B-Vision-Instruct
llama model list
llama stack list-apis
llama stack list-providers inference
llama stack build --template ollama --image-type conda
```
### Distributions Tests
**ollama**
```
llama stack build --template ollama --image-type conda
ollama run llama3.2:1b-instruct-fp16
llama stack run ./llama_stack/templates/ollama/run.yaml --env INFERENCE_MODEL=meta-llama/Llama-3.2-1B-Instruct
```
**fireworks**
```
llama stack build --template fireworks --image-type conda
llama stack run ./llama_stack/templates/fireworks/run.yaml
```
**together**
```
llama stack build --template together --image-type conda
llama stack run ./llama_stack/templates/together/run.yaml
```
**tgi**
```
llama stack run ./llama_stack/templates/tgi/run.yaml --env TGI_URL=http://0.0.0.0:5009 --env INFERENCE_MODEL=meta-llama/Llama-3.1-8B-Instruct
```
## Sources
Please link relevant resources if necessary.
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md ),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-12-27 15:45:44 -08:00
Xi Yan
e8112b31ab
move hf addapter->remote ( #459 )
...
# What does this PR do?
- move folder
## Test Plan
**Unit Test**
```
pytest -v -s -m "huggingface" datasetio/test_datasetio.py
```
**E2E**
```
llama stack run
```
```
llama-stack-client eval run_benchmark meta-reference-mmlu --num-examples 5 --output-dir ./ --eval-task-config ~/eval_task_config.json --visualize
```
<img width="657" alt="image"
src="https://github.com/user-attachments/assets/63d53f9d-6c7e-4667-af8c-9d16c91ae6e3 ">
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md ),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-11-14 22:41:19 -05:00
Xi Yan
c29fa56dde
add inline:: prefix for localfs provider ( #441 )
...
# What does this PR do?
- add inline:: prefix for localfs provider
## Test Plan
```
llama stack run
datasetio:
- provider_id: localfs-0
provider_type: inline::localfs
config: {}
```
```
pytest -v -s -m meta_reference_eval_fireworks_inference eval/test_eval.py
pytest -v -s -m localfs datasetio/test_datasetio.py
```
## Sources
Please link relevant resources if necessary.
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md ),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2024-11-13 10:44:39 -05:00
Xi Yan
b4416b72fd
Folder restructure for evals/datasets/scoring ( #419 )
...
* rename evals related stuff
* fix datasetio
* fix scoring test
* localfs -> LocalFS
* refactor scoring
* refactor scoring
* remove 8b_correctness scoring_fn from tests
* tests w/ eval params
* scoring fn braintrust fixture
* import
2024-11-11 17:35:40 -05:00
Xi Yan
2b7d70ba86
[Evals API][11/n] huggingface dataset provider + mmlu scoring fn ( #392 )
...
* wip
* scoring fn api
* eval api
* eval task
* evaluate api update
* pre commit
* unwrap context -> config
* config field doc
* typo
* naming fix
* separate benchmark / app eval
* api name
* rename
* wip tests
* wip
* datasetio test
* delete unused
* fixture
* scoring resolve
* fix scoring register
* scoring test pass
* score batch
* scoring fix
* fix eval
* test eval works
* huggingface provider
* datasetdef files
* mmlu scoring fn
* test wip
* remove type ignore
* api refactor
* add default task_eval_id for routing
* add eval_id for jobs
* remove type ignore
* huggingface provider
* wip huggingface register
* only keep 1 run_eval
* fix optional
* register task required
* register task required
* delete old tests
* fix
* mmlu loose
* refactor
* msg
* fix tests
* move benchmark task def to file
* msg
* gen openapi
* openapi gen
* move dataset to hf llamastack repo
* remove todo
* refactor
* add register model to unit test
* rename
* register to client
* delete preregistered dataset/eval task
* comments
* huggingface -> remote adapter
* openapi gen
2024-11-11 14:49:50 -05:00
Ashwin Bharambe
994732e2e0
impls
-> inline
, adapters
-> remote
(#381 )
2024-11-06 14:54:05 -08:00
Xi Yan
821810657f
[Evals API][2/n] datasets / datasetio meta-reference implementation ( #288 )
...
* skeleton dataset / datasetio
* dataset datasetio
* config
* address comments
* delete dataset_utils
* address comments
* naming fix
2024-10-22 16:12:16 -07:00