Even as Apple rolls out Apple Intelligence updates for its devices, you might be surprised to learn that much of the massive computing power required for training and inference of its large language models comes from AWS (Amazon Web Services). At the AWS re:Invent 2024 event in Las Vegas last week, AWS announced updates to its in-house silicon and technology for processing AI workloads, notably the AWS Trainium2 (Trn2) processors. This also makes AWS a formidable competitor to NVIDIA, even though AWS is an NVIDIA partner.
Apple Intelligence is a personal intelligence technology now being integrated into Apple’s products through OS updates. According to Apple, the AI-based technology draws on the user’s personal context without allowing anyone else to access their data. Built on top of Apple’s own LLMs, it offers a generative AI experience for Apple users on their devices.
During a keynote address at the 13th edition of AWS re:Invent in Las Vegas last week, Benoit Dupin, senior director of machine learning and artificial intelligence at Apple, said, “We work with AWS services across virtually all phases of our AI and ML lifecycle.”
To date, Apple has used AWS services powered by Inferentia2 and Graviton3 processors. Apple Intelligence runs on Apple’s LLMs, diffusion models, and adapters, which operate on both devices and servers, Dupin said.
However, to scale the infrastructure behind Apple Intelligence, Apple is exploring AWS’s Trainium2 (Trn2) chips, which will be strung together in an EC2 UltraCluster of hundreds of thousands of chips to form a massive supercomputer.
Dupin said Apple depends on AWS accelerators and is in the early stages of evaluating Trn2 chips. After deployment, it expects up to a 50% improvement in training efficiency to support its scaling.
At re:Invent 2024, AWS announced the general availability (GA) of Trn2 chips and said Trainium3 (Trn3) will be coming next year. An AWS spokesperson said Trn3 chips will be manufactured on a 3 nm process and offer twice the compute of Trn2.
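With GA, Trn2 capacity can be requested like any other EC2 instance. Below is a minimal sketch using the boto3 EC2 client; the AMI ID is a placeholder, and the trn2.48xlarge instance size should be verified against AWS’s current documentation for your region.

```python
# Minimal sketch: launching a Trn2 instance through the standard EC2 API.
# The AMI ID below is a placeholder; confirm a Neuron-enabled deep learning
# AMI and the exact Trn2 instance sizes in the AWS console for your region.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="trn2.48xlarge",     # Trn2 size announced at re:Invent 2024
    MinCount=1,
    MaxCount=1,
)
print("Launched:", response["Instances"][0]["InstanceId"])
```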
AWS Trainium chips are a family of accelerators purpose-built by AWS for AI training and inference, designed to deliver high performance while reducing costs.
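For developers, the usual path onto Trainium and Inferentia hardware is the AWS Neuron SDK. Here is a minimal sketch assuming the torch-neuronx package’s documented trace API; the toy model and input shapes are illustrative stand-ins, not any production workload.

```python
# Minimal sketch: compiling a PyTorch model for NeuronCores with
# torch-neuronx (the AWS Neuron SDK's PyTorch integration).
import torch
import torch_neuronx

# Any traceable torch.nn.Module works; this toy MLP is a placeholder.
model = torch.nn.Sequential(
    torch.nn.Linear(512, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 128),
).eval()

example = torch.rand(1, 512)                        # fixes the traced input shape
neuron_model = torch_neuronx.trace(model, example)  # ahead-of-time compile for Neuron
neuron_model.save("model_neuron.pt")                # reload later with torch.jit.load
print(neuron_model(example).shape)                  # torch.Size([1, 128])
```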
According to an AWS statement, AWS is building, together with Anthropic, an EC2 UltraCluster of Trn2 UltraServers, named Project Rainier, containing hundreds of thousands of Trainium2 chips and delivering more than 5x the exaflops used to train Anthropic’s current generation of leading AI models.
“Trainium2 is purpose-built to support the largest, most cutting-edge generative AI workloads, for both training and inference and to deliver the best price-performance on AWS,” said David Brown, vice president of Compute and Networking at AWS. “With models approaching trillions of parameters, we understand customers also need a novel approach to train and run these massive workloads. New Trn2 UltraServers offer the fastest training and inference performance on AWS and help organizations of all sizes to train and deploy the world’s largest models faster and at a lower cost.”