# What does this PR do?
- The `llama-stack build --config` help text references `example_configs`, which
no longer exists
- Update it to refer to the new directory format to avoid confusion
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
# What does this PR do?
- Addresses issue (#586)
## Test Plan
```
python llama_stack/scripts/distro_codegen.py
```
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
The library client used _server-side_ types, which was no bueno. The fix here
is not the completely correct fix, but it is good enough for now and for the
demo notebook.
This brings up an interesting aspect -- we need to maintain session-level
tempdir state (!) since the model was told there was some resource at a
given location that it needs to maintain.
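As a rough sketch of what session-level tempdir state could look like (hypothetical names, not the actual implementation):
```python
import tempfile

# Hypothetical sketch: keep one tempdir alive per session so that paths
# the model has already been told about stay valid for the whole session.
_session_tmpdirs: dict[str, tempfile.TemporaryDirectory] = {}

def get_session_tmpdir(session_id: str) -> str:
    """Return a stable temp directory path for the given session."""
    if session_id not in _session_tmpdirs:
        _session_tmpdirs[session_id] = tempfile.TemporaryDirectory(
            prefix=f"session-{session_id}-"
        )
    return _session_tmpdirs[session_id].name
```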
# What does this PR do?
- A follow-up to #572
- The command in the original PR did not run
- Remove the `--set` flag, which is unnecessary since Jaeger 2.1.0
## Test Plan
```
$ docker run --rm --name jaeger \
-p 16686:16686 -p 4318:4318 \
jaegertracing/jaeger:2.1.0
2024/12/07 19:07:13 application version: git-commit=65cff3c30823ea20d3dc48bae39d5685ae307da5, git-version=v2.1.0, build-date=2024-12-06T21:17:15Z
...
```
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [x] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
Signed-off-by: Yuri Shkuro <github@ysh.us>
# What does this PR do?
Addresses issue (#391)
- Adds JSON structured output for vLLM
- Enables structured output tests for vLLM
> Give me a recipe for Spaghetti Bolognaise:
```json
{
"recipe_name": "Spaghetti Bolognaise",
"preamble": "Ah, spaghetti bolognaise - the quintessential Italian dish that fills my kitchen with the aromas of childhood nostalgia. As a child, I would watch my nonna cook up a big pot of spaghetti bolognaise every Sunday, filling our small Italian household with the savory scent of simmering meat and tomatoes. The way the sauce would thicken and the spaghetti would al dente - it was love at first bite. And now, as a chef, I want to share that same love with you, so you can recreate these warm, comforting memories at home.",
"ingredients": [
"500g minced beef",
"1 medium onion, finely chopped",
"2 cloves garlic, minced",
"1 carrot, finely chopped",
" celery, finely chopped",
"1 (28 oz) can whole peeled tomatoes",
"1 tbsp tomato paste",
"1 tsp dried basil",
"1 tsp dried oregano",
"1 tsp salt",
"1/2 tsp black pepper",
"1/2 tsp sugar",
"1 lb spaghetti",
"Grated Parmesan cheese, for serving",
"Extra virgin olive oil, for serving"
],
"steps": [
"Heat a large pot over medium heat and add a generous drizzle of extra virgin olive oil.",
"Add the chopped onion, garlic, carrot, and celery and cook until the vegetables are soft and translucent, about 5-7 minutes.",
"Add the minced beef and cook until browned, breaking it up with a spoon as it cooks.",
"Add the tomato paste and cook for 1-2 minutes, stirring constantly.",
"Add the canned tomatoes, dried basil, dried oregano, salt, black pepper, and sugar. Stir well to combine.",
"Bring the sauce to a simmer and let it cook for 20-30 minutes, stirring occasionally, until the sauce has thickened and the flavors have melded together.",
"While the sauce cooks, bring a large pot of salted water to a boil and cook the spaghetti according to the package instructions until al dente. Reserve 1 cup of pasta water before draining the spaghetti.",
"Add the reserved pasta water to the sauce and stir to combine.",
"Combine the cooked spaghetti and sauce, tossing to coat the pasta evenly.",
"Serve hot, topped with grated Parmesan cheese and a drizzle of extra virgin olive oil.",
"Enjoy!"
]
}
```
Generated with Llama-3.2-3B-Instruct model - pretty good for a 3B
parameter model 👍
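For context, here is a minimal sketch of how a client could request guided JSON from vLLM's OpenAI-compatible endpoint; the schema and endpoint below are illustrative, and `guided_json` is vLLM's extra-body parameter for schema-constrained decoding:
```python
# Minimal sketch, assuming a vLLM OpenAI-compatible server on port 8000;
# the schema here is illustrative, not the one used by the tests.
from openai import OpenAI

schema = {
    "type": "object",
    "properties": {
        "recipe_name": {"type": "string"},
        "ingredients": {"type": "array", "items": {"type": "string"}},
        "steps": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["recipe_name", "ingredients", "steps"],
}

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")
response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "Give me a recipe for Spaghetti Bolognaise"}],
    extra_body={"guided_json": schema},  # constrain decoding to the schema
)
print(response.choices[0].message.content)
```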
## Test Plan
`pytest -v -s
llama_stack/providers/tests/inference/test_text_inference.py -k
llama_3b-vllm_remote`
With the following setup:
```bash
# Environment
export INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct
export INFERENCE_PORT=8000
export VLLM_URL=http://localhost:8000/v1
# vLLM server
sudo docker run --gpus all \
-v $STORAGE_DIR/.cache/huggingface:/root/.cache/huggingface \
--env "HUGGING_FACE_HUB_TOKEN=$(cat ~/.cache/huggingface/token)" \
-p 8000:$INFERENCE_PORT \
--ipc=host \
--net=host \
vllm/vllm-openai:v0.6.3.post1 \
--model $INFERENCE_MODEL
# llama-stack server
llama stack build --template remote-vllm --image-type conda && llama stack run distributions/remote-vllm/run.yaml \
--port 5001 \
--env INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct
```
Results:
```
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_model_list[llama_3b-vllm_remote] PASSED
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_completion[llama_3b-vllm_remote] SKIPPED
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_completions_structured_output[llama_3b-vllm_remote] SKIPPED
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_non_streaming[llama_3b-vllm_remote] PASSED
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_structured_output[llama_3b-vllm_remote] PASSED
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_streaming[llama_3b-vllm_remote] PASSED
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_with_tool_calling[llama_3b-vllm_remote] PASSED
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_with_tool_calling_streaming[llama_3b-vllm_remote] PASSED
================================ 6 passed, 2 skipped, 120 deselected, 2 warnings in 13.26s ================================
```
## Sources
- https://github.com/vllm-project/vllm/discussions/8300
- By default, vLLM uses https://github.com/dottxt-ai/outlines for
structured outputs
[[1](32e7db2536/vllm/engine/arg_utils.py (L279-L280))]
## Before submitting
- [N/A] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [N/A?] Updated relevant documentation. Couldn't find any relevant
documentation. Let me know if I've missed anything.
- [x] Wrote necessary unit or integration tests.
# What does this PR do?
The current examples would cause a lot of unnecessary, painful duplication
when a bunch of custom tools are expected in a real use case.
Also added `pip install -U httpx==0.27.2` to avoid an [httpx proxies
error](https://github.com/meta-llama/llama-stack-apps/issues/131) when
running in an environment where httpx 0.28 or higher is installed by default.
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
I think I misunderstood the meaning of "single node" when describing the
type of the Cerebras integration. It should be "hosted" instead of
"single node", since the inference is done via an API call.
cc: @ashwinb @raghotham
- [X] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
This PR does a few things:
- it moves the "direct client" to the llama-stack repo instead of the
llama-stack-client-python repo
- renames it to `LlamaStackLibraryClient`
- actually makes synchronous generators work
- makes streaming and non-streaming work properly
In many ways, this PR makes things finally "work"
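For the synchronous-generator piece, a rough sketch of the general technique (not necessarily the exact implementation in this PR) looks like:
```python
# Hypothetical sketch: expose an async iterator (e.g. a streaming
# response) as a plain synchronous generator.
import asyncio
from typing import AsyncIterator, Iterator, TypeVar

T = TypeVar("T")

def as_sync_generator(async_iter: AsyncIterator[T]) -> Iterator[T]:
    """Pull items from an async iterator using a private event loop."""
    loop = asyncio.new_event_loop()
    try:
        while True:
            try:
                yield loop.run_until_complete(async_iter.__anext__())
            except StopAsyncIteration:
                break
    finally:
        loop.close()
```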
## Test Plan
See the `library_client_test.py` I added. This isn't really quite a test
yet, but it demonstrates that this mode now works. Here's the invocation
and the response:
```
INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct python llama_stack/distribution/tests/library_client_test.py ollama
```

# What does this PR do?
Update the Kotlin docs to refer to version 0.0.54.1 of the SDK instead
of 0.0.54, because we discovered a bug in 0.0.54 where local modules as
dependencies are not included automatically. See
593ed21d5f
## Test Plan
Docs changes only. The changes are tested separately on the
llama-stack-apps side and verified to be working.
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [x] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
Fix image URLs and replace a TODO that the previous commit missed.
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
Docs update for Kotlin SDK release
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
# What does this PR do?
Addresses a few errors I got when running the quick start guide:
https://llama-stack.readthedocs.io/en/latest/getting_started/index.html.
We should keep this up to date to maintain engagement with the
community.
I've annotated the PR below. Could you PTAL? 🙏
## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
# What does this PR do?
#525 introduced a telemetry configuration named jaeger, but what it
really points to is an OTLP HTTP endpoint, which is supported by most
servers in the ecosystem, including raw OpenTelemetry collectors,
several APMs, and even https://github.com/ymtdzzz/otel-tui
I chose to rename this to "otel" as it will bring more people into the
ecosystem vs. feeling it only works with Jaeger. Later, we can use the
[standard
ENV](https://opentelemetry.io/docs/specs/otel/protocol/exporter/) to
configure this if we like, so that you can override things with the
variables people might expect.
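For example (a sketch; the variable name comes from the spec linked above, and the fallback value is illustrative):
```python
# Sketch: honoring the spec-defined OTLP exporter variable, with a
# fallback endpoint for local development (the value is illustrative).
import os

otlp_endpoint = os.environ.get(
    "OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318"
)
```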
Note: I also added to the README that you have to install conda.
Depending on the experience level of the user, and especially with
miniforge vs. other installation methods, I felt this helps.
## Test Plan
I would like to test this, but actually got a little lost. The previous
PRs referenced YAML which doesn't seem to be published anywhere. It
would be nice to have a pre-canned setup that uses ollama and turns on
otel, but in the meantime I would also appreciate a hand with
instructions.
## Sources
https://github.com/meta-llama/llama-stack/pull/525
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
---------
Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
This PR adds two new methods to the telemetry API:
1) The ability to query spans directly, instead of first querying traces
and then using those to get spans
2) Another method, save_spans_to_dataset, which builds on span querying
to save spans to a dataset.
This gives the ability to save the spans that are part of an agent
session to a dataset.
The unique aspect of this API is that we don't require each telemetry
provider to implement this method. Hence, it's implemented in the
protocol class itself. This required the protocol check to be slightly
modified.
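Conceptually, the shape is something like this (a hypothetical sketch with illustrative names and signatures):
```python
# Hypothetical sketch: a protocol method with a shared default
# implementation, so providers only need to implement query_spans.
from typing import Any, Protocol

class Telemetry(Protocol):
    async def query_spans(self, attribute_filters: dict[str, Any]) -> list[dict]:
        """Each telemetry provider implements this."""
        ...

    async def save_spans_to_dataset(
        self, attribute_filters: dict[str, Any], dataset_id: str
    ) -> None:
        """Shared default built on query_spans; providers inherit it."""
        spans = await self.query_spans(attribute_filters)
        print(f"saving {len(spans)} spans to dataset {dataset_id}")
```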
When running:
```
python -m llama_stack.apis.safety.client localhost 5000
```
the API server was logging:
```
INFO: ::1:57176 - "POST /safety/run_shield HTTP/1.1" 404 Not Found
```
This patch uses the versioned API, uses the updated safety endpoint, and
updates the model name to match what's being served. The above Python
command now demonstrates both a passing and a failing example.