Scaling for the multicloud AI strategy: Inside Oracle’s approach

Nathan Thomas explains how Oracle is scaling its multicloud AI strategy through GPU superclusters, Acceleron networking, unified data, and engineering-led innovation.

Shubhendu Parth

Nathan Thomas, Vice President of Multicloud and AI Strategic Initiatives at Oracle, brings deep cross-industry expertise to the company’s cloud and artificial intelligence (AI) strategy. Before Oracle, he led Unreal Engine as Vice President at Epic Games and headed Product Management for Storage at Google Cloud. Today, he drives Oracle’s multicloud roadmap and AI initiatives, enabling customers to use Oracle databases and AI services on the cloud provider of their choice.

Speaking to Voice&Data at Oracle AI World in Las Vegas, he breaks down Oracle’s latest engineering-led advances—from Zettascale GPU superclusters, Acceleron networking, and the Helios rack architecture to the Oracle AI Data Platform and a unified agentic-AI framework. He also discusses open standards, the balance between NVIDIA and AMD GPUs, sustainability, and Oracle’s India strategy. For technology leaders tracking high-density AI infrastructure and multicloud ecosystems, this conversation offers clear insights into Oracle’s next phase of innovation.

How is Oracle Cloud Infrastructure (OCI) integrating the recently announced agentic artificial intelligence (AI) capabilities? What does that look like when you translate it into real customer use cases?
The foundation for agentic AI begins with infrastructure. We require a very strong foundation for our own workloads as well as for the workloads that customers build on OCI. Many of our announcements at Oracle AI World are rooted directly in infrastructure innovation.

Acceleron for network fabrics is a major highlight. We are reducing packet latency, strengthening node-to-node communication, and embedding Zero-Trust Packet Routing (ZPR) directly at the physical layer. When Network Interface Cards (NICs) are combined with Graphics Processing Units (GPUs), businesses achieve very high performance with end-to-end security boundaries. These innovations feed into capabilities such as OCI Zettascale 10, supporting the next generation of GPU cluster scale.

We also announced 131,000-GPU superclusters. Scaling at that pace requires proper infrastructure. Innovations such as rapid-deployment cabling and closed-loop, non-evaporative cooling systems are essential. Cooling may appear mundane, but it enables high-density rack placement, faster provisioning, and flexibility in late hardware-placement decisions.

Cooling may appear mundane, but it fundamentally determines how far you can push performance in large-scale AI systems.

This foundational layer supports the agentic AI capabilities built upon it. OCI Generative AI (GenAI) services run on the same infrastructure, enabling token-based, pay-as-you-go consumption of models such as Gemini 2.5 within OCI’s multicloud ecosystem.

From there, the data and applications layer becomes critical. Oracle AI Database 26ai ensures that enterprise data is prepared for AI workloads through hybrid vector search, Model Context Protocol (MCP) server integration, and robust agentic-development frameworks. Enterprises want openness for generative AI, but with strong control and genuine data-level security. That principle guides the architecture.

OCI agentic frameworks and OCI GenAI services integrate directly into Oracle Fusion Applications, completing a full stack—from infrastructure upward—designed to deliver enterprise AI value.

On the infrastructure side, Oracle has announced large deployments with both NVIDIA and AMD GPUs. How do you balance these two architectures? Is the architectural direction shifting in any way?
We see significant demand for both technologies. Our growth is guided directly by clear customer demand across specific AI use cases. That drives all of our regional build-outs and capacity expansion.

Large NVIDIA GB200 and GB300 clusters remain central and will continue to be so. At the same time, we are very enthusiastic about the AMD MI355X and the MI450 developments. The Oracle Insight Platform plays a major role in enabling this strategy, and there is plenty of opportunity for all the players.

Do you see particular sectors leaning more toward AMD or more toward NVIDIA? Any trends emerging there?
We are seeing broad demand across all industries.

We follow customer demand. The scale of enterprise workloads today creates room for both NVIDIA and AMD to grow.

The new Helios rack architecture supports 72 GPUs per rack with liquid cooling and Ultra Accelerator Link (UALink) networking and memory sharing. From a systems-engineering point of view, how does this design actually translate into lower latency, higher energy efficiency, and faster model orchestration?
Closed-loop, non-evaporative cooling provides a major density improvement. Higher density allows racks to be positioned physically closer, directly reducing network-tier latency. Cooling may appear mundane, but it has a profound impact on performance.

Helios is closely tied to Acceleron. By unifying the NIC and the Data Processing Unit (DPU), we create a single low-latency path for packet movement across nodes. When you combine low-latency networking, high-density placement, and unified communication paths, you obtain substantial improvements in workload performance—especially for tightly coupled compute tasks like distributed training, multi-node inference, and large language model processing.

Unifying the NIC and the DPU transforms how fast tightly coupled AI workloads can communicate across nodes.

How is Oracle approaching open standards and heterogeneous computing? How does this reduce lock-in and improve flexibility for enterprise developers on OCI?
Oracle has a long-standing commitment to open standards and open-source ecosystems. Customers want to run workloads wherever their AI pipelines, data strategies, or regulatory constraints lead them.

This is clearly evident in our multicloud strategy. Oracle Database, along with the dedicated interconnects between OCI and Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP), enables customers to run workloads wherever their AI pipelines or enterprise architectures require. This principle continues to shape our strategy.

With the launch of the Oracle AI Data Platform, how does OCI bring together GPU-rich infrastructure, unified data, and agentic AI workloads?
The central idea is to enable the unified representation of structured and unstructured data within a real semantic context. Embeddings, vector search, and metadata-driven processing operate across the Autonomous AI Database and the Autonomous AI Lakehouse.

This unification unlocks a range of downstream AI use cases. When data is properly embedded, governed, and prepared, GPUs receive meaningful context. Historically, data silos and compliance barriers have limited this potential. The AI Data Platform combines structured and unstructured data. Hardware matters, but without well-structured and well-managed data, no AI system can deliver business value.

Enterprises want freedom. Our responsibility is to let them run workloads wherever their architecture requires.

As enterprises distribute AI workloads across multiple clouds, how does Oracle ensure consistent performance, cost efficiency, and data governance parity across OCI and partner clouds?
This is where our multicloud database architecture becomes critical. We deploy Oracle Exadata racks directly inside partner-cloud facilities, operating them as Oracle-managed child sites. These racks run the same Exadata hardware that customers use on-premises.

This ensures identical performance, management, and tooling across environments. It creates a predictable, low-risk migration path. When enterprises migrate workloads between these environments, the system's behaviour remains identical.

Once the data plane is consistent, enterprises gain the freedom to use any cloud’s AI pipeline—whether that is Amazon Bedrock, Microsoft Copilot, Google Vertex AI, or Google Gemini—while keeping their Oracle database architecture unchanged. This combination of consistency and choice is extremely important for enterprises scaling AI.

Sustainability is becoming critical for AI-scale workloads. What is Oracle doing on this front?
Sustainability is deeply embedded in our systems design and operational planning. We expand capacity based on very specific demand forecasts to ensure efficiency across the entire infrastructure lifecycle.

We collaborate with local energy utilities and, in certain cases, develop on-site power generation facilities. Cooling innovations like closed-loop systems improve density and reduce water and energy usage. Across the board, our goal is to make sure high-density compute infrastructure grows responsibly, without compromising environmental considerations or operational reliability.

Are you working with AMD or NVIDIA at the processor or architecture level to reduce GPU power consumption and heat generation?
Yes. There is very clear alignment between our financial interests and environmental interests. Reducing power consumption reduces operational cost and environmental load, so this alignment drives innovation.
Demand for GPUs is enormous—both inside and outside Oracle—so efficient utilisation, high performance per watt, and improved thermal characteristics are essential. We work continuously with our partners across architecture, firmware, and systems engineering to advance efficiency.

India is one of the fastest-growing markets for cloud and AI. How is Oracle adapting its AI-infrastructure and multicloud strategy for the country?
India is a large and extremely important market for Oracle. We have operated in India for over thirty years, serving more than 5,000 customers and partnering with over 500 companies.

We already operate two OCI regions in India, and we are expanding further, including for AI workloads. We are working with Google Cloud, Microsoft Azure, and AWS to bring multicloud and AI capabilities deeper into India. Over the next twelve months, we plan a series of India-specific expansions, including localised regions that will support AI training, inference and enterprise workloads at scale.

In terms of development, what is your perspective on India? What role does India play in the Oracle Cloud Infrastructure roadmap?
Oracle operates nine product-development centres in India, located in Bengaluru, Hyderabad, Chennai, Gandhinagar, Noida, Mumbai, Pune, Kolkata, and Thiruvananthapuram. These centres play a significant role in global product development.