Qualcomm Technologies has announced its entry into the high-performance data center AI market with the introduction of the Qualcomm AI200 and Qualcomm AI250 accelerator cards and racks. These new solutions focus on generative AI inference, specifically targeting Large Language Models (LLMs) and Large Multimodal Models (LMMs) with an emphasis on achieving low total cost of ownership (TCO) for enterprise customers.
The Qualcomm AI200 is a purpose-built rack-level solution designed to balance performance and cost. Each card carries a substantial 768 GB of LPDDR memory, a capacity well suited to hosting large AI models and one that gives data centers greater flexibility in scheduling inference workloads.
A major technical highlight is the Qualcomm AI250. This solution debuts a new memory architecture based on near-memory computing, which Qualcomm says delivers a generational leap in performance and power efficiency for AI inference, including over ten times higher effective memory bandwidth. This advance also supports disaggregated AI inferencing, allowing hardware resources to be utilized more efficiently to meet both cost and performance demands.
Both rack systems include standard features for modern data centers. They use direct liquid cooling for thermal management and offer PCIe for scale-up within a rack alongside Ethernet for scale-out across multiple racks. For security, the systems support confidential computing for protected AI workloads, all within a 160 kW rack-level power envelope.
Qualcomm is also providing a comprehensive AI software stack optimized for inference. This hyperscaler-grade stack supports major machine learning frameworks and includes features for generative AI models, such as disaggregated serving. Developers benefit from easy model onboarding and one-click deployment of Hugging Face models via Qualcomm's Efficient Transformers Library and AI Inference Suite, which the company says enables frictionless adoption for developers and enterprises.
The introduction of the AI200 and AI250 marks the start of Qualcomm's committed multi-generation roadmap for data center AI inference. The AI200 is slated for commercial availability in 2026, with the AI250 following in 2027. Qualcomm plans an annual cadence for its future products in this sector.