How AWS custom silicon is reshaping AI economics in India

Satinder Pal Singh explains how AWS custom silicon helps cut AI cost and latency, enabling Indian enterprises to scale image, agentic and multimodal workloads.

Thomas George

When Zomato processes 10,000 restaurant images a day using AI chips with better price performance and lower latency, it is not just about prettier photos—it is about democratising technology for small businesses across India. A street-side eatery armed with only a mobile phone camera can now compete with upmarket restaurants that rely on professional photography equipment, thanks to image upscaling powered by custom silicon.

This is the new reality of India’s AI revolution: custom chips are no longer the exclusive domain of tech giants, but tools that enable businesses of every scale to compete on innovation rather than capital. Satinder Pal Singh, Director, Solution Architecture, AWS India and South Asia, explains how AWS’s custom silicon strategy is reshaping the competitive landscape—delivering not just incremental gains, but order-of-magnitude improvements in price-performance that are unlocking entirely new categories of AI applications across Indian enterprises. Excerpts from this exclusive interaction with Voice&Data.

The Evolution of Custom Silicon

AWS’s journey into custom silicon began more than a decade ago, driven by a clear objective: enabling more workloads to run efficiently on cloud infrastructure. “We started in 2013 with Nitro,” explains Singh. “Nitro offloaded hypervisor capabilities and storage and networking functions, allowing customers to run workloads with consistent performance. That really opened up the kinds of workloads that could be supported on the cloud.”

This foundation paved the way for Graviton, AWS’s Arm-based, general-purpose compute processor, launched in 2018. Now in its fourth generation, with the fifth recently announced, Graviton has achieved significant market penetration. “For the last three years, more than 50% of the compute that we have launched in our regions worldwide has used Graviton,” Singh notes. The latest Graviton4 processor delivers 30% better performance, 50% more compute cores, and 75% more memory than its predecessor.

AI Accelerators: Trainium and Inferentia

The progression from general-purpose computing to AI-specific accelerators reflects AWS’s response to the exponential growth in machine learning workloads. The Trainium family, purpose-built for training AI models, has advanced rapidly. Trainium2 instances deliver approximately 20 petaflops of performance each, while the Ultra Server configuration provides over 80 petaflops in a single unit.

The scale is staggering. “AWS and Anthropic’s Project Rainier uses over 500,000 Trainium2 chips,” Singh reveals. “It is one of the world’s largest AI supercomputers, built entirely on Trainium.” The recently announced Trainium3 Ultra Server reaches 360 petaflops, delivering 4.4 times the performance and four times the memory while using a quarter of the power and improving latency.

For inference workloads, Inferentia2 delivers 10x lower latency than its predecessor, addressing the critical need for real-time AI applications.

Choice Architecture: Matching Chip to Workload

When asked about comparisons with Nvidia’s H100, Singh emphasises AWS’s philosophy of providing choice rather than promoting a single solution. “Our mental model is to provide choice to our customers. We want to provide the right tool for the job. This broad selection allows customers to choose the right workload and chipset for their needs. In certain cases, it can be Nvidia. In other cases, it can be Trainium or Inferentia.”

This approach reflects a maturing cloud market, where one-size-fits-all solutions are giving way to more nuanced architectural decisions.

[Table: AWS custom silicon]

India Outcomes: Performance and Cost at Scale

The adoption of custom silicon by Indian enterprises delivers tangible business outcomes. Paytm runs Graviton on databases powering its payment gateway for real-time transactions, reporting a 30% performance improvement and a 35% cost reduction. Zomato has delivered similar gains through its data lake platform, running Trino and Druid on Graviton2 to improve performance by 25% while reducing costs by 30%.
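
The two percentages reported for each workload can be read together as a single price-performance figure. The sketch below is illustrative arithmetic only: it assumes the performance gain and the cost reduction were measured on the same workload against the same baseline, which the interview does not state, and the function name `price_performance_gain` is hypothetical.

```python
# Illustrative arithmetic only: combines a reported performance gain and a
# reported cost reduction into one price-performance ratio.
# Assumption (not stated in the interview): both percentages refer to the
# same workload and the same baseline.

def price_performance_gain(perf_improvement_pct: float, cost_reduction_pct: float) -> float:
    """Return work done per unit of spend, relative to a baseline of 1.0."""
    relative_performance = 1 + perf_improvement_pct / 100
    relative_cost = 1 - cost_reduction_pct / 100
    if relative_cost <= 0:
        raise ValueError("cost reduction must be below 100%")
    return relative_performance / relative_cost

# Figures as quoted in the article: (performance improvement %, cost reduction %)
reported = {
    "Paytm (Graviton databases)": (30, 35),
    "Zomato (Trino and Druid on Graviton2)": (25, 30),
}

for workload, (perf_pct, cost_pct) in reported.items():
    gain = price_performance_gain(perf_pct, cost_pct)
    print(f"{workload}: {gain:.2f}x performance per rupee")
```

Under that assumption, the Paytm figures work out to about 2.00x the work per rupee and the Zomato data lake figures to about 1.79x, since a faster workload that also costs less compounds rather than adds.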

Perhaps the most compelling example is Zomato’s use of Inferentia for image upscaling. The company maintains a policy against AI-generated images, aiming to present authentic restaurant photos. However, smaller restaurants often lack professional photography equipment. “How do they democratise the technology?” Singh asks. “They used Inferentia for image upscaling, so that the real image looks more appealing to the user.” Processing 10,000 images daily, Zomato reports 50% better price performance and 25% lower latency.

In conversational AI, Gupshup uses Amazon SageMaker and Trainium to cut model training time from days to hours.

Multimodal and Reasoning-Intensive Workloads

The announcement of Nova Omni marks a significant advance in multimodal AI capabilities. “First in the industry, multimodal reasoning capabilities,” Singh emphasises. “You can have video, audio, text, and images as inputs—all four modes—with reasoning on top, outputting either images or text.”

Combined with Trainium3’s performance improvements, these capabilities enable previously impractical use cases. Singh points to a demonstration of real-time video generation using Trainium3, illustrating how improved latency and performance unlock new possibilities.

Agentic AI Moves Into Traditional Enterprises

Whilst large digital natives and fintech companies are obvious beneficiaries, Singh also highlights that traditional enterprises are leveraging these technologies. Apollo Tyres offers a striking example. During tyre manufacturing, the “dry cycle time” between curing presses represents underutilised capacity. When problems occur, traditional root cause analysis requires examining millions of data points and coordinating across teams—often a seven-hour process.

Using Amazon Bedrock’s agentic capabilities, Apollo Tyres developed a “Manufacturing Reasoner” that reduced the time taken for this root cause analysis to 10 minutes. “Think of the impact on dry cycle time and overall equipment effectiveness,” Singh notes. Similarly, DTDC launched DIVA 2.0, an agentic chatbot that helps customers track shipments and obtain pricing information, reporting a 50% efficiency improvement.

Prime Day Proof: Inference at Massive Scale

Looking ahead, Singh sees generative AI adoption accelerating across sectors in the next two years. “Whether it is fintech, digital natives, or traditional enterprises, they are seeing the benefits of using generative and agentic AI technologies. We are at very early stages, and with agentic AI, more such use cases will emerge.”

The retail sector offers a glimpse of this future. Amazon’s Rufus, a generative AI shopping assistant, runs on Inferentia. During US Prime Day sales, over 80,000 instances of Inferentia and Trainium powered the experience, handling queries on product choices, reviews, and specifications.

Silicon Strategy: Cloud’s Next Platform Shift

The custom silicon strategy marks a fundamental shift in how cloud infrastructure is conceived and delivered. By developing purpose-built chips optimised for specific workload types, AWS is not merely competing on price; it is enabling entirely new categories of applications.

For Indian enterprises, this translates into practical advantages: reduced costs, improved performance, and access to capabilities previously available only to the largest technology companies. As Singh concludes, “We are thinking holistically about how to remove undifferentiated heavy lifting that our customers are doing and innovating on their behalf, giving them all the toolsets with the right choice so they make the right decisions.”

In an era where computational efficiency and AI capabilities increasingly determine competitive advantage, AWS’s silicon strategy offers Indian businesses a credible path to innovation at scale.

The interview was edited with limited use of an AI-based tool.