mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-16 21:42:38 +00:00
feat: added oci-s3 compatibility (#4374)
Some checks failed
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 5s
API Conformance Tests / check-schema-compatibility (push) Successful in 14s
Python Package Build Test / build (3.12) (push) Successful in 16s
Python Package Build Test / build (3.13) (push) Successful in 17s
Test External API and Providers / test-external (venv) (push) Failing after 30s
Vector IO Integration Tests / test-matrix (push) Failing after 50s
UI Tests / ui-tests (22) (push) Successful in 1m1s
Unit Tests / unit-tests (3.12) (push) Failing after 1m39s
Unit Tests / unit-tests (3.13) (push) Failing after 1m43s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 2m47s
Pre-commit / pre-commit (22) (push) Successful in 3m42s
# What does this PR do?

This PR validates and allows access to OCI object storage through the S3 compatibility API. Additional documentation for OCI is supplied, in notebook form, as well.

## Test Plan

---------

Co-authored-by: raghotham <rsm@meta.com>
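The S3 compatibility path amounts to pointing a standard S3 client at OCI's per-namespace compatibility endpoint. A minimal sketch of that idea (the endpoint format comes from OCI's Object Storage documentation; the helper name, namespace, and region below are illustrative and not from this PR):

```python
# Illustrative sketch, not code from this PR: OCI Object Storage exposes an
# S3-compatible endpoint per tenancy namespace and region, so any standard
# S3 client can talk to it using an OCI "Customer Secret Key" credential pair.
def oci_s3_endpoint(namespace: str, region: str) -> str:
    """Build the S3 compatibility endpoint for an OCI Object Storage namespace."""
    return f"https://{namespace}.compat.objectstorage.{region}.oraclecloud.com"

# A client such as boto3 would then be configured with
#   boto3.client("s3", endpoint_url=oci_s3_endpoint(...),
#                aws_access_key_id=..., aws_secret_access_key=...)
print(oci_s3_endpoint("mytenancy", "us-ashburn-1"))
```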
This commit is contained in:
parent
805abf573f
commit
10c878d782
3 changed files with 1107 additions and 0 deletions
956
docs/notebooks/oci/OCI_LlamaStack_Demo.ipynb
Normal file
@ -0,0 +1,956 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "dae5cac3",
"metadata": {},
"source": [
"# Oracle Cloud Infrastructure (OCI) with Llama Stack\n",
"\n",
"This notebook demonstrates how to get started using OCI Generative AI models through Llama Stack.\n",
"\n",
"## Prerequisites\n",
"\n",
"1. **Install required packages:**\n",
" ```bash\n",
" pip install llama-stack-client oci\n",
" ```\n",
"\n",
"2. **Configure OCI credentials:**\n",
" - Set up `~/.oci/config` with your OCI credentials\n",
" - Set the `OCI_COMPARTMENT_OCID` environment variable\n",
" - Set the `OCI_REGION` environment variable\n",
"\n",
"3. **Start Llama Stack server:**\n",
" ```bash\n",
" llama stack run /oci/[your_oci_config].yaml\n",
" ```\n",
" Make sure to set OCI as your inference provider in your configuration file, as shown here:\n",
"```bash\n",
"providers:\n",
" inference:\n",
" - provider_id: oci\n",
" provider_type: remote::oci\n",
" config:\n",
" oci_auth_type: ${env.OCI_AUTH_TYPE:=instance_principal}\n",
" oci_config_file_path: ${env.OCI_CONFIG_FILE_PATH:=~/.oci/config}\n",
" oci_config_profile: ${env.OCI_CLI_PROFILE:=DEFAULT}\n",
" oci_region: ${env.OCI_REGION:=us-ashburn-1}\n",
" oci_compartment_id: ${env.OCI_COMPARTMENT_OCID:=}\n",
"```\n",
"4. **Verify server is running:**\n",
" - Server should be accessible at `http://localhost:8321`"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "9c9c27a4",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Python path updated to use venv\n"
]
}
],
"source": [
"# OPTION: Use venv environment with 0.4.0 client\n",
"# Optional, in case you need to select a specific venv environment.\n",
"import sys\n",
"sys.path.insert(0, 'oci/venv/lib/python3.12/site-packages')\n",
"print(\"Python path updated to use venv\")"
]
},
|
||||
{
"cell_type": "code",
"execution_count": 2,
"id": "65cfd094",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"✅ OCI_COMPARTMENT_OCID is set\n"
]
}
],
"source": [
"# Import required libraries\n",
"from llama_stack_client import LlamaStackClient\n",
"import os\n",
"\n",
"# Check if environment variable is set\n",
"if not os.getenv(\"OCI_COMPARTMENT_OCID\"):\n",
" print(\"⚠️ WARNING: OCI_COMPARTMENT_OCID environment variable not set\")\n",
" print(\"Please set it with: export OCI_COMPARTMENT_OCID='ocid1.compartment.oc1..xxx'\")\n",
"else:\n",
" print(\"✅ OCI_COMPARTMENT_OCID is set\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "dff45663",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"✅ Connected to Llama Stack server\n"
]
}
],
"source": [
"# Initialize the Llama Stack client\n",
"# Make sure the server is running at http://localhost:8321\n",
"\n",
"client = LlamaStackClient(base_url=\"http://localhost:8321\")\n",
"print(\"✅ Connected to Llama Stack server\")"
]
},
|
||||
{
"cell_type": "markdown",
"id": "07490c4e",
"metadata": {},
"source": [
"## 1. List Available Models\n",
"\n",
"First, let's see what OCI models are available through Llama Stack."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "2aa0e436",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: GET http://localhost:8321/v1/models \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Found 11 models:\n",
"\n",
" oci/google.gemini-2.5-flash\n",
" Provider: llama_stack\n",
" Metadata: {'model_type': 'llm', 'provider_id': 'oci', 'provider_resource_id': 'google.gemini-2.5-flash'}\n",
"\n",
" oci/google.gemini-2.5-pro\n",
" Provider: llama_stack\n",
" Metadata: {'model_type': 'llm', 'provider_id': 'oci', 'provider_resource_id': 'google.gemini-2.5-pro'}\n",
"\n",
" oci/google.gemini-2.5-flash-lite\n",
" Provider: llama_stack\n",
" Metadata: {'model_type': 'llm', 'provider_id': 'oci', 'provider_resource_id': 'google.gemini-2.5-flash-lite'}\n",
"\n",
" oci/xai.grok-4-fast-non-reasoning\n",
" Provider: llama_stack\n",
" Metadata: {'model_type': 'llm', 'provider_id': 'oci', 'provider_resource_id': 'xai.grok-4-fast-non-reasoning'}\n",
"\n",
" oci/xai.grok-4-fast-reasoning\n",
" Provider: llama_stack\n",
" Metadata: {'model_type': 'llm', 'provider_id': 'oci', 'provider_resource_id': 'xai.grok-4-fast-reasoning'}\n",
"\n",
" oci/xai.grok-code-fast-1\n",
" Provider: llama_stack\n",
" Metadata: {'model_type': 'llm', 'provider_id': 'oci', 'provider_resource_id': 'xai.grok-code-fast-1'}\n",
"\n",
" oci/xai.grok-4\n",
" Provider: llama_stack\n",
" Metadata: {'model_type': 'llm', 'provider_id': 'oci', 'provider_resource_id': 'xai.grok-4'}\n",
"\n",
" oci/xai.grok-3-mini-fast\n",
" Provider: llama_stack\n",
" Metadata: {'model_type': 'llm', 'provider_id': 'oci', 'provider_resource_id': 'xai.grok-3-mini-fast'}\n",
"\n",
" oci/xai.grok-3-fast\n",
" Provider: llama_stack\n",
" Metadata: {'model_type': 'llm', 'provider_id': 'oci', 'provider_resource_id': 'xai.grok-3-fast'}\n",
"\n",
" oci/xai.grok-3\n",
" Provider: llama_stack\n",
" Metadata: {'model_type': 'llm', 'provider_id': 'oci', 'provider_resource_id': 'xai.grok-3'}\n",
"\n",
" oci/xai.grok-3-mini\n",
" Provider: llama_stack\n",
" Metadata: {'model_type': 'llm', 'provider_id': 'oci', 'provider_resource_id': 'xai.grok-3-mini'}\n",
"\n"
]
}
],
"source": [
"# List all available models\n",
"models = client.models.list()\n",
"\n",
"print(f\"Found {len(models)} models:\\n\")\n",
"for model in models:\n",
" print(f\" {model.id}\")\n",
" print(f\" Provider: {model.owned_by}\")\n",
" if hasattr(model, \"custom_metadata\") and model.custom_metadata:\n",
" print(f\" Metadata: {model.custom_metadata}\")\n",
" print()"
]
},
|
||||
{
"cell_type": "markdown",
"id": "6c4c27c5",
"metadata": {},
"source": [
"## 2. Non-Streaming Chat Completion\n",
"\n",
"Let's run a simple chat completion request (non-streaming)."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "41b1b7cd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Using model: oci/google.gemini-2.5-flash\n"
]
}
],
"source": [
"# Select the first available model\n",
"if len(models) == 0:\n",
" print(\"No models available!\")\n",
"else:\n",
" model_id = models[0].id\n",
" print(f\"Using model: {model_id}\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "013fa2e7",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
" Response:\n",
"================================================================================\n",
"**Oracle Cloud Infrastructure (OCI)** is a suite of cloud computing services that runs on a global network of Oracle-managed data centers. It provides a complete range of highly automated, high-performance, and cost-effective services, including compute, storage, networking, databases, analytics, machine learning, IoT, and more.\n",
"\n",
"Essentially, OCI is Oracle's public cloud offering, designed to compete with industry giants like Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).\n",
"\n",
"Here's a breakdown of what OCI is and what makes it stand out:\n",
"\n",
"1. **Cloud Computing Model:**\n",
" * **Infrastructure as a Service (IaaS):** Provides fundamental computing resources (virtual machines, bare metal servers, storage, networking) over the internet. Users manage operating systems, applications, and data.\n",
" * **Platform as a Service (PaaS):** Offers a platform for customers to develop, run, and manage applications without the complexity of building and maintaining the underlying infrastructure. This includes services like Oracle Autonomous Database, Kubernetes Engine, and Functions.\n",
" * **Software as a Service (SaaS):** While OCI is primarily IaaS/PaaS, Oracle also offers many SaaS applications (like Fusion ERP, HCM, CRM) that run *on* OCI.\n",
"\n",
"2. **\"Generation 2 Cloud\" Architecture:**\n",
" OCI often refers to itself as a \"Generation 2 Cloud.\" This implies a fundamental architectural difference from some older cloud platforms, focusing on:\n",
" * **Performance:** Designed with non-oversubscribed resources (especially for bare metal compute), faster networking, and a focus on enterprise-grade workloads.\n",
" * **Security-First:** A highly isolated network virtualization and a \"zero-trust\" security model from the ground up, aiming to prevent hypervisor attacks and provide strong isolation between customer workloads.\n",
" * **Cost-Effectiveness:** Often boasts competitive pricing, especially for predictable, high-performance workloads, and strong support for \"bring your own license\" (BYOL) for Oracle software.\n",
"\n",
"3. **Key Services Offered:**\n",
" * **Compute:** Virtual Machines (VMs), Bare Metal Servers (physical servers dedicated to a single customer), Container Engine for Kubernetes (OKE), Functions (serverless computing).\n",
" * **Storage:** Block Storage, Object Storage (standard, infrequent access, archive tiers), File Storage, Database Storage.\n",
" * **Networking:** Virtual Cloud Networks (VCNs), Load Balancers, VPN Connect, FastConnect (dedicated network connectivity).\n",
" * **Databases:**\n",
" * **Autonomous Database:** A flagship service that automates patching, tuning, security, and backups for data warehousing (ADW) and transaction processing (ATP).\n",
" * Exadata Cloud Service, MySQL HeatWave, NoSQL Database, PostgreSQL.\n",
" * **Analytics & AI/ML:** Data Lake, Data Catalog, AI Services, Machine Learning Platform.\n",
" * **Application Development:** API Gateway, DevOps, Container Registry.\n",
" * **Security:** Identity and Access Management (IAM), Cloud Guard, Web Application Firewall (WAF), Security Zones.\n",
" * **Management & Governance:** Monitoring, Logging, Cost Management, Resource Manager (Terraform integration).\n",
" * **Integration:** Oracle Integration Cloud (OIC).\n",
"\n",
"4. **Key Differentiators & Advantages:**\n",
" * **Superior for Oracle Workloads:** Unmatched performance and features for running Oracle databases (especially Exadata and Autonomous Database) and Oracle applications (EBS, JD Edwards, PeopleSoft, Siebel).\n",
" * **Enterprise Focus:** Built from the ground up for mission-critical, high-performance enterprise workloads.\n",
" * **Performance & Price-Performance:** Often cited for better performance on certain benchmarks due to its architecture, leading to a strong price-performance ratio.\n",
" * **Security:** Emphasizes strong isolation and a security-first design.\n",
" * **Hybrid Cloud:** Offers solutions like Cloud@Customer and Dedicated Region Cloud@Customer for organizations that need cloud benefits within their own data centers.\n",
"\n",
"5. **Target Audience:**\n",
" * Existing Oracle customers looking to migrate their on-premises databases and applications to the cloud.\n",
" * Enterprises with demanding, performance-sensitive, or mission-critical workloads.\n",
" * Organizations seeking strong database capabilities and automation.\n",
" * Companies looking for competitive pricing and a robust security posture.\n",
"\n",
"In summary, OCI is Oracle's modern, high-performance, and secure public cloud platform, designed to cater to enterprise needs, with a particular strength in running Oracle's own software and databases.\n",
"================================================================================\n"
]
}
],
"source": [
"# Run a simple chat completion\n",
"response = client.chat.completions.create(\n",
" model=model_id,\n",
" messages=[\n",
" {\"role\": \"user\", \"content\": \"What is Oracle Cloud Infrastructure?\"}\n",
" ],\n",
" temperature=0.7,\n",
" max_tokens=4096,\n",
")\n",
"\n",
"print(\"\\n Response:\")\n",
"print(\"=\" * 80)\n",
"print(response.choices[0].message.content)\n",
"print(\"=\" * 80)"
]
},
|
||||
{
"cell_type": "markdown",
"id": "07d416b4",
"metadata": {},
"source": [
"## 3. Streaming Chat Completion\n",
"\n",
"Now let's try streaming - the response will be printed token by token as it arrives."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "7a7d4aa0",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" Streaming Response:\n",
"================================================================================\n",
"Here are 3 key benefits of using Oracle Cloud Infrastructure (OCI) for AI workloads:\n",
"\n",
"1. **High-Performance Compute with Leading GPUs:** OCI offers powerful NVIDIA GPUs (such as A100s and H100s) on bare metal and high-core count virtual machines. This provides the raw, uncompromised compute power essential for rapidly training complex deep learning models, running large-scale simulations, and performing high-throughput inference, significantly reducing model development and deployment times.\n",
"\n",
"2. **Cost-Effectiveness and Flexible Pricing:** OCI is often recognized for its competitive pricing compared to other major cloud providers, especially for high-performance resources like GPUs. It also typically features lower data egress fees, which can lead to substantial cost savings for data-intensive AI workloads that frequently move large datasets in and out of the cloud. Flexible consumption models further help optimize spending.\n",
"\n",
"3. **Integrated AI/ML Services and MLOps Platform:** OCI provides a growing suite of managed AI services (e.g., OCI Data Science, OCI AI Vision, OCI AI Language, OCI AI Speech, OCI AI Anomaly Detection) that simplify the entire machine learning lifecycle. These services offer pre-trained models, easy-to-use APIs, and a robust MLOps platform for managing data, developing, deploying, and monitoring models, accelerating time-to-value for AI initiatives.\n",
"================================================================================\n"
]
}
],
"source": [
"# Run a streaming chat completion\n",
"print(\" Streaming Response:\")\n",
"print(\"=\" * 80)\n",
"\n",
"stream = client.chat.completions.create(\n",
" model=model_id,\n",
" messages=[\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": \"List 3 benefits of using OCI for AI workloads.\"\n",
" }\n",
" ],\n",
" temperature=0.7,\n",
" max_tokens=4096,\n",
" stream=True,\n",
")\n",
"\n",
"# Print tokens as they arrive\n",
"for chunk in stream:\n",
" if hasattr(chunk, \"choices\") and len(chunk.choices) > 0:\n",
" delta = chunk.choices[0].delta\n",
" if hasattr(delta, \"content\") and delta.content:\n",
" print(delta.content, end=\"\", flush=True)\n",
"\n",
"print(\"\\n\" + \"=\" * 80)"
]
},
|
||||
{
"cell_type": "markdown",
"id": "a3d18db2",
"metadata": {},
"source": [
"## 4. Try Different Models\n",
"\n",
"You can experiment with different OCI models. Here are some examples:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "f5d07ef2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Available models:\n",
"1. oci/google.gemini-2.5-flash\n",
"2. oci/google.gemini-2.5-pro\n",
"3. oci/google.gemini-2.5-flash-lite\n",
"4. oci/xai.grok-4-fast-non-reasoning\n",
"5. oci/xai.grok-4-fast-reasoning\n",
"6. oci/xai.grok-code-fast-1\n",
"7. oci/xai.grok-4\n",
"8. oci/xai.grok-3-mini-fast\n",
"9. oci/xai.grok-3-fast\n",
"10. oci/xai.grok-3\n",
"11. oci/xai.grok-3-mini\n"
]
}
],
"source": [
"# List all model IDs for easy reference\n",
"print(\"Available models:\")\n",
"for i, model in enumerate(models, 1):\n",
" print(f\"{i}. {model.id}\")"
]
},
|
||||
{
"cell_type": "code",
"execution_count": 9,
"id": "5732bcc0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Switching to: oci/google.gemini-2.5-pro\n",
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" Response:\n",
"================================================================================\n",
"No floppy disk, no silver sphere,\n",
"No heavy drive you hold so dear.\n",
"Your data’s gone, it flew away\n",
"To live and breathe a brighter day.\n",
"\n",
"It rests within a nebulous haze,\n",
"Through sunlit and through moonlit days.\n",
"A wisp of thought, a digital stream,\n",
"The substance of a modern dream.\n",
"\n",
"You pull it down on phone or screen,\n",
"A distant file, a long-lost scene.\n",
"A document, a shared design,\n",
"No longer solely yours or mine.\n",
"\n",
"But this soft cloud is not of rain,\n",
"It's built on a terrestrial plane.\n",
"Of humming racks in cooled, vast halls,\n",
"Behind secure and fireproof walls.\n",
"\n",
"A million lights that blink and gleam,\n",
"A flowing, cool, electric stream.\n",
"A silent army, code and wire,\n",
"That serves the world's immense desire.\n",
"\n",
"It’s more than storage, safe and deep,\n",
"While all our local systems sleep.\n",
"It’s rented power, brain, and brawn,\n",
"To calculate from dusk till dawn.\n",
"\n",
"So tap your key and make the call,\n",
"The cloud provides and serves us all.\n",
"A weightless vault, beyond the blue,\n",
"That holds the work, the world, and you.\n",
"================================================================================\n"
]
}
],
"source": [
"# Try a different model (change the index to try different models)\n",
"if len(models) > 1:\n",
" model_id = models[1].id # Try the second model\n",
" print(f\"Switching to: {model_id}\\n\")\n",
"\n",
" response = client.chat.completions.create(\n",
" model=model_id,\n",
" messages=[\n",
" {\"role\": \"user\", \"content\": \"Write a poem about cloud computing.\"}\n",
" ],\n",
" temperature=0.9,\n",
" max_tokens=4096,\n",
" )\n",
"\n",
" print(\" Response:\")\n",
" print(\"=\" * 80)\n",
" print(response.choices[0].message.content)\n",
" print(\"=\" * 80)\n",
"else:\n",
" print(\"Only one model available\")"
]
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "59af223c",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 5. Multi-turn Conversation\n",
|
||||
"\n",
|
||||
"You can maintain conversation context by including previous messages."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"id": "d26bfe9c",
|
||||
"metadata": {
|
||||
"scrolled": true
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
" Turn 1:\n",
|
||||
"Oracle Cloud Infrastructure (OCI) is a suite of cloud computing services that runs on the Oracle Cloud Infrastructure platform. It aims to provide enterprise-grade performance, security, and cost-effectiveness for a wide range of workloads.\n",
|
||||
"\n",
|
||||
"Here are the main features of OCI, categorized for clarity:\n",
|
||||
"\n",
|
||||
"1. **Core Infrastructure Services:**\n",
|
||||
" * **Compute:**\n",
|
||||
" * **Virtual Machines (VMs):** Flexible and scalable virtual servers.\n",
|
||||
" * **Bare Metal Instances:** Dedicated physical servers for high-performance workloads, offering direct access to hardware resources without virtualization overhead.\n",
|
||||
" * **Container Engine for Kubernetes (OKE):** A fully managed Kubernetes service for deploying, managing, and scaling containerized applications.\n",
|
||||
" * **Functions:** Serverless computing platform that allows you to run code without provisioning or managing servers.\n",
|
||||
" * **Storage:**\n",
|
||||
" * **Block Volume:** High-performance, persistent block storage for compute instances.\n",
|
||||
" * **Object Storage:** Highly scalable, S3-compatible object storage for unstructured data, available in standard and archival tiers.\n",
|
||||
" * **File Storage (NFS):** Managed file storage service, accessible via NFS protocol.\n",
|
||||
" * **Archive Storage:** Extremely low-cost, long-term storage for infrequently accessed data.\n",
|
||||
" * **Networking:**\n",
|
||||
" * **Virtual Cloud Network (VCN):** A customizable, software-defined network that provides an isolated and secure network environment for your OCI resources.\n",
|
||||
" * **Load Balancing:** Distributes incoming traffic across multiple instances to ensure high availability and performance.\n",
|
||||
" * **DNS:** Managed Domain Name System service.\n",
|
||||
" * **VPN Connect & FastConnect:** Secure connectivity options for hybrid cloud scenarios, connecting on-premises data centers to OCI.\n",
|
||||
"\n",
|
||||
"2. **Database Services:**\n",
|
||||
" * **Oracle Autonomous Database (ADB):** This is a cornerstone feature of OCI. It's a fully automated, self-driving, self-securing, and self-repairing database service for data warehousing (ADW) and transaction processing (ATP). It handles patching, backups, tuning, and scaling automatically.\n",
|
||||
" * **Database as a Service (DBaaS):** Managed Oracle Databases (including Exadata Cloud Service) and support for open-source databases like MySQL HeatWave.\n",
|
||||
"\n",
|
||||
"3. **Security and Identity:**\n",
|
||||
" * **Identity and Access Management (IAM):** Comprehensive service for managing users, groups, policies, and compartments to control access to OCI resources.\n",
|
||||
" * **Cloud Guard:** A security posture management service that monitors OCI resources for security vulnerabilities and misconfigurations.\n",
|
||||
" * **Security Zones:** Enforce strict security policies from the start, preventing users from performing actions that violate best practices.\n",
|
||||
" * **Web Application Firewall (WAF):** Protects web applications from common web exploits.\n",
|
||||
" * **Key Management (Vault):** Managed service for securely storing and managing encryption keys and secrets.\n",
|
||||
" * **DDoS Protection:** Built-in protection against Distributed Denial of Service attacks.\n",
|
||||
" * **Isolated Network Virtualization:** OCI's network architecture isolates customer networks from Oracle's network control plane, enhancing security and performance.\n",
|
||||
"\n",
|
||||
"4. **Management, Governance, and Observability:**\n",
|
||||
" * **Monitoring:** Collects metrics on resource performance and health.\n",
|
||||
" * **Logging:** Centralized service for ingesting, storing, and analyzing logs from various OCI services.\n",
|
||||
" * **Resource Manager (Terraform):** Infrastructure-as-code service for provisioning and managing OCI resources using Terraform configurations.\n",
|
||||
" * **Cost Management:** Tools for tracking, analyzing, and optimizing OCI spending.\n",
|
||||
" * **Compartments:** Logical containers for organizing and isolating OCI resources, crucial for governance and access control.\n",
|
||||
" * **Audit:** Provides a chronological log of all API calls made to your OCI resources.\n",
|
||||
"\n",
|
||||
"5. **Developer Tools and Application Development:**\n",
|
||||
" * **DevOps Service:** End-to-end CI/CD platform for automating software delivery.\n",
|
||||
" * **API Gateway:** Manages, publishes, monitors, and secures APIs.\n",
|
||||
"	*   **Queue and Streaming:** Messaging services for real-time data ingestion and processing.\n",
"	*   **Service Mesh:** Managed service for connecting, monitoring, and securing microservices.\n",
"\n",
"6.  **AI & Machine Learning:**\n",
"	*   **OCI Data Science:** Platform for building, training, and deploying machine learning models.\n",
"	*   **AI Services:** Pre-built AI services for common tasks like Vision, Language, Speech, and Anomaly Detection.\n",
"\n",
"7.  **Hybrid and Edge Cloud:**\n",
"	*   **Dedicated Region Cloud@Customer:** Allows customers to run a full OCI region within their own data center, offering identical services, APIs, and performance of a public OCI region.\n",
"	*   **Roving Edge Infrastructure:** Portable, ruggedized compute and storage nodes for running OCI services at the edge, ideal for disconnected or remote environments.\n",
"\n",
"**Key Differentiators of OCI:**\n",
"\n",
"*   **High Performance & Enterprise Focus:** Designed from the ground up for high-performance, critical enterprise workloads, offering bare metal instances and a flat, low-latency network.\n",
"*   **Cost-Effectiveness & Predictable Pricing:** Often boasts lower pricing for comparable services, especially for data egress, with transparent and predictable billing.\n",
"*   **Autonomous Services:** The \"self-driving, self-securing, self-repairing\" philosophy, particularly embodied by the Autonomous Database, significantly reduces operational overhead.\n",
"*   **Security-First Architecture:** Built with security as a core tenet, including isolated network virtualization and strong default security postures.\n",
"*   **Comprehensive Hybrid Cloud Strategy:** Unique offerings like Cloud@Customer provide unparalleled flexibility for hybrid deployments.\n",
"*   **Oracle Database Expertise:** Best-in-class support and optimization for Oracle Database workloads, including Exadata.\n",
"\n",
"In essence, OCI positions itself as a high-performance, secure, and cost-effective cloud platform, particularly strong for complex enterprise applications and Oracle workloads, while also offering a broad array of modern cloud services.\n",
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" Turn 2:\n",
"OCI's compute features are designed to provide a flexible, high-performance, and cost-effective foundation for running a wide variety of workloads, from traditional enterprise applications to modern cloud-native and high-performance computing (HPC) tasks.\n",
"\n",
"Here's an elaboration on the main compute features:\n",
"\n",
"1.  **Virtual Machines (VMs):**\n",
"	*   **Description:** Standard virtual servers that run on shared underlying physical hardware. They offer a balance of flexibility, scalability, and cost-effectiveness.\n",
"	*   **Key Characteristics:**\n",
"	*   **Instance Shapes:** OCI offers a wide array of VM shapes, including:\n",
"	*   **Standard Shapes (e.g., VM.Standard.E4.Flex, VM.Standard.A1.Flex):** General-purpose shapes with different CPU architectures (Intel, AMD, Ampere ARM A1) and the ability to customize CPU and memory resources independently (Flex shapes), allowing for precise resource allocation and cost optimization.\n",
"	*   **Optimized Shapes:** Shapes specifically designed for high-performance computing (HPC) or memory-intensive workloads.\n",
"	*   **GPU Shapes:** VMs equipped with powerful GPUs for AI/ML training, graphics rendering, and scientific simulations.\n",
"	*   **Operating Systems:** Support for various Linux distributions (Oracle Linux, Ubuntu, CentOS, RHEL, etc.) and Windows Server.\n",
"	*   **Scalability:** VMs can be resized (vertically scaled) to different shapes or horizontally scaled using instance pools and autoscaling configurations based on metrics like CPU utilization.\n",
"	*   **Networking:** Integrated with Virtual Cloud Networks (VCNs), allowing for private and public IP addresses, security lists, and network security groups.\n",
"	*   **Storage:** Boot volumes are persistent block storage, and additional block volumes can be attached for application data.\n",
"	*   **Use Cases:** Web servers, application servers, development and testing environments, small to medium databases, general-purpose enterprise applications.\n",
"\n",
"2.  **Bare Metal Instances:**\n",
"	*   **Description:** Dedicated physical servers where you have direct access to the underlying hardware, with no hypervisor layer between your operating system and the physical server.\n",
"	*   **Key Characteristics:**\n",
"	*   **Maximum Performance:** Offers the highest possible performance, I/O throughput, and lowest latency because there's no virtualization overhead.\n",
"	*   **Complete Isolation:** Provides single-tenant isolation, enhancing security and compliance for sensitive workloads.\n",
"	*   **Specialized Hardware:** Available in shapes optimized for specific tasks, including:\n",
"	*   **High-Performance Computing (HPC):** Often featuring high core counts, large amounts of RAM, and low-latency RDMA (Remote Direct Memory Access) networking (e.g., InfiniBand) for parallel processing.\n",
"	*   **GPU Instances:** Physical servers with multiple high-end GPUs for demanding AI/ML, scientific, and rendering workloads.\n",
"	*   **Dense I/O Instances:** Equipped with large amounts of local NVMe SSD storage for I/O-intensive applications.\n",
"	*   **Operating Systems:** Similar to VMs, supports various Linux and Windows OS.\n",
"	*   **Use Cases:** High-performance databases (e.g., Oracle Exadata on OCI, large commercial databases), HPC workloads, Big Data analytics, gaming servers, CAD/CAM, applications with strict licensing requirements tied to physical cores, or workloads requiring absolute performance predictability.\n",
"\n",
"3.  **Container Engine for Kubernetes (OKE):**\n",
"	*   **Description:** A fully managed Kubernetes service that simplifies the deployment, scaling, and management of containerized applications. Oracle handles the Kubernetes control plane, while you manage the worker nodes.\n",
"	*   **Key Characteristics:**\n",
"	*   **Managed Control Plane:** Oracle provisions, upgrades, and maintains the Kubernetes master nodes, ensuring high availability and patching.\n",
"	*   **Worker Node Flexibility:** You can choose VM or Bare Metal instances for your worker nodes, providing flexibility in performance and cost.\n",
"	*   **Deep OCI Integration:** Seamlessly integrates with other OCI services like Load Balancer (for ingress), Block Storage (for persistent volumes), IAM (for access control), and VCN (for networking).\n",
"	*   **Open Source Compatibility:** Adheres to open-source Kubernetes standards, allowing for portability of applications.\n",
"	*   **Autoscaling:** Supports both horizontal pod autoscaling (HPA) and cluster autoscaling (adding/removing worker nodes).\n",
"	*   **Use Cases:** Microservices architectures, CI/CD pipelines, cloud-native application development, stateless and stateful containerized applications, event-driven processing.\n",
"\n",
"4.  **Functions (Serverless Compute):**\n",
"	*   **Description:** A serverless platform based on the open-source Fn Project, allowing you to deploy and run code without provisioning, managing, or scaling servers. You only pay for the compute resources consumed during execution.\n",
"	*   **Key Characteristics:**\n",
"	*   **Event-Driven:** Functions are typically triggered by events from other OCI services (e.g., file upload to Object Storage, messages in Streaming, API Gateway requests) or custom events.\n",
"	*   **Automatic Scaling:** Automatically scales up and down based on demand, from zero to thousands of concurrent executions.\n",
"	*   **Pay-per-Execution:** You are billed only for the compute time your code runs, making it extremely cost-effective for intermittent or unpredictable workloads.\n",
"	*   **Language Support:** Supports various popular programming languages like Python, Node.js, Java, Go, Ruby, and C#.\n",
"	*   **Integration:** Integrates with OCI API Gateway, Streaming, Object Storage, and other services.\n",
"	*   **Use Cases:** Data processing (e.g., image resizing, log analysis), chatbots, IoT backends, webhooks, API backends, scheduled tasks, serverless ETL.\n",
"\n",
"**Common Features Across OCI Compute Services:**\n",
"\n",
"*   **Networking:** All compute instances are provisioned within a Virtual Cloud Network (VCN), allowing for secure and isolated network environments.\n",
"*   **Storage Integration:** Seamless integration with OCI's various storage services (Block Volumes, Object Storage, File Storage).\n",
"*   **Identity and Access Management (IAM):** Granular control over who can access and manage your compute resources.\n",
"*   **Monitoring and Logging:** Built-in services to collect metrics (CPU, memory, network I/O) and logs for performance analysis and troubleshooting.\n",
"*   **Image Management:** Ability to use platform-provided images or create and manage custom images for consistent deployments.\n",
"*   **Resource Manager (Terraform Integration):** Automate the provisioning and management of compute resources using Infrastructure-as-Code.\n",
"\n",
"OCI's compute offerings emphasize **performance, flexibility, and cost-efficiency**, particularly for enterprise workloads and those requiring dedicated hardware or specific performance characteristics. The inclusion of Bare Metal and the highly customizable Flex shapes are key differentiators.\n"
]
}
],
"source": [
"# Multi-turn conversation example\n",
"conversation = [\n",
"    {\"role\": \"user\", \"content\": \"What are the main features of OCI?\"},\n",
"]\n",
"\n",
"# First turn\n",
"response1 = client.chat.completions.create(\n",
"    model=models[0].id,\n",
"    messages=conversation,\n",
"    temperature=0.7,\n",
"    max_tokens=4096,\n",
")\n",
"\n",
"first_response = response1.choices[0].message.content\n",
"print(\" Turn 1:\")\n",
"print(first_response)\n",
"print()\n",
"\n",
"# Add assistant response to conversation\n",
"conversation.append({\"role\": \"assistant\", \"content\": first_response})\n",
"conversation.append({\"role\": \"user\", \"content\": \"Can you elaborate on the compute features?\"})\n",
"\n",
"# Second turn\n",
"response2 = client.chat.completions.create(\n",
"    model=models[0].id,\n",
"    messages=conversation,\n",
"    temperature=0.7,\n",
"    max_tokens=4096,\n",
")\n",
"\n",
"print(\" Turn 2:\")\n",
"print(response2.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"id": "02b9601f",
"metadata": {},
"source": [
"## 6. Adjusting Parameters\n",
"\n",
"You can control the model's behavior with various parameters:\n",
"\n",
"- **temperature** (0.0-2.0): Controls randomness. Lower = more focused, Higher = more creative\n",
"- **max_tokens**: Maximum length of the response\n",
"- **stream**: Enable/disable streaming"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "447ea15e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" Creative response (temperature=1.5):\n",
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Data ascended into the luminous, silent sky – billions of bits housed within ethereal data banks: the Cloud. Resources scaled, programs ran anywhere, like magic. Want to save a dream, stream a cosmos, or build worlds? With an invisible whisper, your processing wishes flowed freely from the vast collective.\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\n",
" Focused response (temperature=0.3):\n",
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Cloud computing delivers on-demand computing services—including servers, storage, databases, networking, software, analytics, and intelligence—over the Internet (\"the cloud\") on a pay-as-you-go basis.\n"
]
}
],
"source": [
"# Example: Creative response with high temperature\n",
"print(\" Creative response (temperature=1.5):\\n\")\n",
"response = client.chat.completions.create(\n",
"    model=models[0].id,\n",
"    messages=[\n",
"        {\"role\": \"user\", \"content\": \"Tell me a creative story about cloud computing in 50 words.\"}\n",
"    ],\n",
"    temperature=1.5,\n",
"    max_tokens=4096,\n",
")\n",
"print(response.choices[0].message.content)\n",
"\n",
"print(\"\\n\" + \"-\" * 80 + \"\\n\")\n",
"\n",
"# Example: Focused response with low temperature\n",
"print(\" Focused response (temperature=0.3):\\n\")\n",
"response = client.chat.completions.create(\n",
"    model=models[0].id,\n",
"    messages=[\n",
"        {\"role\": \"user\", \"content\": \"What is cloud computing? Be concise.\"}\n",
"    ],\n",
"    temperature=0.3,\n",
"    max_tokens=4096,\n",
")\n",
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "5f1fb0ba-330c-4253-8199-3635bdfe0abf",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"✅ Created agent successfully\n"
]
}
],
"source": [
"from llama_stack_client import LlamaStackClient, Agent\n",
"# Create a basic agent using the Agent class\n",
"agent = Agent(\n",
"    client=client,\n",
"    model=models[0].id,\n",
"    instructions=\"You are a helpful AI assistant that can answer questions and help with tasks.\",\n",
")\n",
"\n",
"print(\"✅ Created agent successfully\")"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "4c414c8a-0e65-449b-be79-88f7e11e8de4",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/conversations \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"✅ Created session: conv_3661f0b6f4a504617c6e47e6b3273687383173bc456b1826\n"
]
}
],
"source": [
"# Create agent session\n",
"basic_session_id = agent.create_session(session_name=\"basic_example_session\")\n",
"\n",
"print(f\"✅ Created session: {basic_session_id}\")"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "4d1e95d0-3ce5-447c-8e8b-bfffb1200ccc",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/responses \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"User: What is the capital of England?\n",
"\n",
"Assistant: The capital of England is **London**.\n",
"✅ Response captured: 37 characters\n"
]
}
],
"source": [
"# Send a message to the agent with streaming\n",
"query = \"What is the capital of England?\"\n",
"\n",
"print(f\"User: {query}\\n\")\n",
"print(\"Assistant: \", end='')\n",
"\n",
"# Create a turn with streaming\n",
"response = agent.create_turn(\n",
"    session_id=basic_session_id,\n",
"    messages=[\n",
"        {\"role\": \"user\", \"content\": query}\n",
"    ],\n",
"    stream=True,\n",
")\n",
"\n",
"# Stream the response\n",
"output_text = \"\"\n",
"for chunk in response:\n",
"    if chunk.event.event_type == \"turn_completed\":\n",
"        output_text = chunk.event.final_text\n",
"        #print(output_text)\n",
"        break\n",
"    elif chunk.event.event_type == \"step_progress\":\n",
"        # Print text deltas as they arrive\n",
"        if hasattr(chunk.event.delta, 'text'):\n",
"            print(chunk.event.delta.text, end='', flush=True)\n",
"\n",
"print(f\"\\n✅ Response captured: {len(output_text)} characters\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
122
docs/notebooks/oci/OCI_ObjectStore_Demo.ipynb
Normal file
@@ -0,0 +1,122 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# OCI Object Store Demo with Llama Stack\n",
"This notebook demonstrates how to set up OCI Object Storage with Llama Stack."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 1: Update config.yaml\n",
"Update your `config.yaml` to include the S3 configuration for OCI Object Storage as described in the [README](src/llama_stack/providers/remote/files/s3/README.md).\n",
"### Example config.yaml configuration\n",
"```yaml\n",
"provider_type: remote::s3\n",
"config:\n",
"  bucket_name: \"${env.S3_BUCKET_NAME}\"\n",
"  region: \"${env.AWS_REGION:=us-east-1}\"\n",
"  aws_access_key_id: \"${env.AWS_ACCESS_KEY_ID:=}\"\n",
"  aws_secret_access_key: \"${env.AWS_SECRET_ACCESS_KEY:=}\"\n",
"  endpoint_url: \"${env.S3_ENDPOINT_URL:=}\"\n",
"  metadata_store:\n",
"    table_name: files_metadata\n",
"    backend: sql_default\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 2: Set Environment Variables\n",
"Create a `.env` file with your OCI credentials and bucket details. \n",
"For more information on generating the access/secret keys, see this [document](https://docs.oracle.com/en-us/iaas/Content/Object/Tasks/s3compatibleapi.htm). \n",
"For information on the `checksum` variables, see this [document](https://www.ateam-oracle.com/post/using-oci-os-s3-interface).\n",
"\n",
"### Example .env file content\n",
"```\n",
"AWS_ACCESS_KEY_ID=OCI_ACCESS_KEY \n",
"AWS_SECRET_ACCESS_KEY=OCI_SECRET_KEY \n",
"S3_BUCKET_NAME=OCI_BUCKET_NAME \n",
"S3_ENDPOINT_URL=https://<namespace>.compat.objectstorage.<region>.oci.customer-oci.com \n",
"AWS_REQUEST_CHECKSUM_CALCULATION=when_required \n",
"AWS_RESPONSE_CHECKSUM_VALIDATION=when_required \n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 3: Run Llama Stack Locally\n",
"Run the following command to start the Llama Stack server locally:\n",
"\n",
"To set up your environment and run llama-stack for the first time, see the [CONTRIBUTING](https://github.com/llamastack/llama-stack/blob/main/CONTRIBUTING.md) document in the repo."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!uv run --env-file=.env llama stack run oci"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 4: Upload and List Files using Files API"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"\n",
"# Upload a file\n",
"source = \"https://www.paulgraham.com/greatwork.html\"\n",
"response = requests.get(source)\n",
"files = {'file': ('greatwork.html', response.content, 'text/html')}\n",
"data = {'purpose': 'assistants'}\n",
"response = requests.post('http://0.0.0.0:8321/v1/files', files=files, data=data)\n",
"print(response.text)\n",
"\n",
"# List uploaded files\n",
"items = requests.get('http://0.0.0.0:8321/v1/files')\n",
"for item in items.json()['data']:\n",
"    print(item['id'])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "llama-stack",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
@@ -210,6 +210,35 @@ config:
  aws_secret_access_key: minioadmin
```

### Using OCI Object Storage with S3 Compatibility
[Official Object Storage Amazon S3 Compatibility API Documentation](https://docs.oracle.com/en-us/iaas/Content/Object/Tasks/s3compatibleapi.htm)

OCI Object Storage can be used through the OCI S3 Compatibility API. Simply update `config.yaml` and set the environment variables as shown below.
#### config.yaml
```yaml
provider_type: remote::s3
config:
  bucket_name: "${env.S3_BUCKET_NAME}"
  region: "${env.AWS_REGION:=us-east-1}"
  aws_access_key_id: "${env.AWS_ACCESS_KEY_ID:=}"
  aws_secret_access_key: "${env.AWS_SECRET_ACCESS_KEY:=}"
  endpoint_url: "${env.S3_ENDPOINT_URL:=}"
  metadata_store:
    table_name: files_metadata
    backend: sql_default
```
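The `${env.VAR:=default}` placeholders above are resolved from the environment when the stack loads its config. As an illustrative sketch only (this is not llama-stack's actual parser, just the substitution rule the syntax implies), the resolution can be modeled as:

```python
import os
import re

# Matches ${env.NAME} and ${env.NAME:=default}; group 1 is the variable
# name, group 2 the optional default after ":=".
_ENV_PATTERN = re.compile(r"\$\{env\.([A-Za-z0-9_]+)(?::=([^}]*))?\}")


def resolve_env(value: str, env=None) -> str:
    """Substitute ${env.NAME:=default} placeholders from the environment."""
    env = os.environ if env is None else env
    return _ENV_PATTERN.sub(lambda m: env.get(m.group(1), m.group(2) or ""), value)


print(resolve_env("${env.AWS_REGION:=us-east-1}", env={}))        # falls back to the default
print(resolve_env("${env.S3_BUCKET_NAME}", env={"S3_BUCKET_NAME": "my-bucket"}))
```

So `region` falls back to `us-east-1` when `AWS_REGION` is unset, while the credential fields default to empty strings.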
#### .env
```
AWS_ACCESS_KEY_ID=OCI_ACCESS_KEY
AWS_SECRET_ACCESS_KEY=OCI_SECRET_KEY
S3_BUCKET_NAME=OCI_BUCKET_NAME
S3_ENDPOINT_URL=https://<namespace>.compat.objectstorage.<region>.oci.customer-oci.com
AWS_REQUEST_CHECKSUM_CALCULATION=when_required
AWS_RESPONSE_CHECKSUM_VALIDATION=when_required
```
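The `S3_ENDPOINT_URL` follows OCI's fixed S3-compatibility pattern, built from your tenancy's Object Storage namespace and the region. As a quick sanity check before filling in `.env`, a small helper (hypothetical, not part of the provider) can construct it; the namespace and region below are placeholder values:

```python
def oci_s3_endpoint(namespace: str, region: str) -> str:
    """Build the OCI S3 Compatibility API endpoint for a tenancy namespace and region."""
    return f"https://{namespace}.compat.objectstorage.{region}.oci.customer-oci.com"


# Placeholder namespace/region -- substitute your tenancy's actual values.
print(oci_s3_endpoint("mytenancy", "us-ashburn-1"))
```

Your tenancy's Object Storage namespace is shown on the OCI console's tenancy details page.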

## Monitoring and Logging

The provider logs important operations and errors. For production deployments, consider: