mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-07-21 03:59:42 +00:00
Add high-level instructions
This commit is contained in:
parent
7faec2380a
commit
84e85e824a
1 changed file with 29 additions and 5 deletions
@@ -4,7 +4,23 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
-  "This notebook contains Llama Stack implementation of a common end-to-end workflow for customizing and evaluating LLMs using the NVIDIA provider."
+  "## NVIDIA E2E Flow"
  ]
 },
+{
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+  "This notebook contains a Llama Stack implementation of an end-to-end workflow for running inference, customizing, and evaluating LLMs using the NVIDIA provider.\n",
+  "\n",
+  "The NVIDIA provider leverages the NeMo Microservices platform, a collection of microservices that you can use to build AI workflows on your Kubernetes cluster on-prem or in the cloud.\n",
+  "\n",
+  "This notebook covers the following workflows:\n",
+  "- Creating a dataset and uploading files\n",
+  "- Customizing models\n",
+  "- Evaluating base and customized models, with and without guardrails\n",
+  "- Running inference on base and customized models, with and without guardrails\n",
+  "\n"
+ ]
+},
 {
@ -12,7 +28,7 @@
|
|||
"metadata": {},
|
||||
"source": [
|
||||
"## Prerequisites\n",
|
||||
"First, ensure the NeMo Microservices platform is up and running, including the model downloading step for `meta/llama-3.2-1b-instruct`. See installation instructions: https://aire.gitlab-master-pages.nvidia.com/microservices/documentation/latest/nemo-microservices/latest-internal/set-up/deploy-as-platform/index.html (TODO: Update to public docs)"
|
||||
"First, ensure the NeMo Microservices platform is up and running, including the model downloading step for `meta/llama-3.1-8b-instruct`. See installation instructions: https://aire.gitlab-master-pages.nvidia.com/microservices/documentation/latest/nemo-microservices/latest-internal/set-up/deploy-as-platform/index.html (TODO: Update to public docs)"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
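The prerequisite above assumes `meta/llama-3.1-8b-instruct` has already been downloaded into NIM. A minimal sketch of a readiness check, assuming NIM exposes an OpenAI-compatible `/v1/models` listing; the payload shape and helper name here are assumptions, not part of the diff:

```python
# Sketch: check whether a model id appears in an OpenAI-style /v1/models
# response. The {"data": [{"id": ...}]} shape is an assumption based on
# the OpenAI-compatible API convention that NIM follows.
def model_available(models_payload: dict, model_id: str) -> bool:
    """Return True if model_id is listed in a parsed /v1/models payload."""
    return any(m.get("id") == model_id for m in models_payload.get("data", []))

# Illustration with a fabricated payload (no network call):
payload = {"data": [{"id": "meta/llama-3.1-8b-instruct"}]}
print(model_available(payload, "meta/llama-3.1-8b-instruct"))  # True
```

In practice you would fetch the payload from `NIM_URL` with an HTTP GET and pass the parsed JSON to this helper.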
@ -70,7 +86,7 @@
|
|||
"source": [
|
||||
"Configure the environment variables for each service.\n",
|
||||
"\n",
|
||||
"If needed, update the URLs for each service to point to your deployment.\n",
|
||||
"Ensure the URLs for each service point to your deployment.\n",
|
||||
"- NDS_URL: NeMo Data Store URL\n",
|
||||
"- NEMO_URL: NeMo Microservices Platform URL\n",
|
||||
"- NIM_URL: NIM URL\n",
|
||||
|
@ -94,7 +110,7 @@
|
|||
"USER_ID = \"llama-stack-user\"\n",
|
||||
"NAMESPACE = \"default\"\n",
|
||||
"PROJECT_ID = \"\"\n",
|
||||
"CUSTOMIZED_MODEL_DIR = \"jg-test-llama-stack@v2\"\n",
|
||||
"CUSTOMIZED_MODEL_DIR = \"test-llama-stack@v1\"\n",
|
||||
"\n",
|
||||
"# Inference env vars\n",
|
||||
"os.environ[\"NVIDIA_BASE_URL\"] = NIM_URL\n",
|
||||
|
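The hunk above renames the customized model directory and feeds the NIM endpoint to the provider through an environment variable. A minimal sketch of that configuration step; the endpoint URLs and ports are placeholders (only `NVIDIA_BASE_URL = NIM_URL` and the variable names appear in the diff):

```python
import os

# Placeholder endpoints; point these at your own deployment.
NDS_URL = "http://localhost:3000"   # NeMo Data Store (assumed port)
NEMO_URL = "http://localhost:8000"  # NeMo Microservices Platform (assumed port)
NIM_URL = "http://localhost:8001"   # NIM inference endpoint (assumed port)

CUSTOMIZED_MODEL_DIR = "test-llama-stack@v1"

# As in the notebook, the NVIDIA provider reads the NIM endpoint from this var.
os.environ["NVIDIA_BASE_URL"] = NIM_URL
print(os.environ["NVIDIA_BASE_URL"])
```

Setting the variable before the Llama Stack client is constructed ensures the provider picks up the deployment-specific endpoint.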
@ -156,13 +172,19 @@
|
|||
"client.initialize()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Here, we define helper functions that wait for async jobs to complete."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 25,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Helper functions for waiting on jobs\n",
|
||||
"from llama_stack.apis.common.job_types import JobStatus\n",
|
||||
"\n",
|
||||
"def wait_customization_job(job_id: str, polling_interval: int = 10, timeout: int = 6000):\n",
|
||||
|
@ -204,6 +226,8 @@
|
|||
"\n",
|
||||
" return job_status\n",
|
||||
"\n",
|
||||
"# When creating a customized model, NIM asynchronously loads the model in its model registry.\n",
|
||||
"# After this, we can run inference with the new model. This helper function waits for NIM to pick up the new model.\n",
|
||||
"def wait_nim_loads_customized_model(model_id: str, namespace: str, polling_interval: int = 10, timeout: int = 300):\n",
|
||||
" found = False\n",
|
||||
" start_time = time()\n",
|
||||
|
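The helpers added in this hunk poll an async job until it reaches a terminal state or a timeout elapses. A generic sketch of the same polling pattern; the function name and status strings are illustrative, not the notebook's exact `JobStatus` values:

```python
import time

def wait_for(get_status, terminal=("completed", "failed"),
             polling_interval: float = 10, timeout: float = 300):
    """Poll get_status() until it returns a terminal status or the timeout elapses."""
    start = time.time()
    while True:
        status = get_status()
        if status in terminal:
            return status
        if time.time() - start > timeout:
            raise TimeoutError(f"job still '{status}' after {timeout}s")
        time.sleep(polling_interval)

# Usage with a stubbed status source that completes on the third poll:
calls = iter(["pending", "running", "completed"])
print(wait_for(lambda: next(calls), polling_interval=0))  # completed
```

The timeout check guards against jobs that never reach a terminal state, which is why both notebook helpers take `polling_interval` and `timeout` parameters.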