
AWS launches Trainium, its new custom ML training chip

Frederic Lardinois
Editor
December 1, 2020

At its annual re:Invent developer conference today, Amazon Web Services (AWS) unveiled AWS Trainium, the company's next generation of custom chips designed specifically for training machine learning models. AWS says the chip will offer better performance than competing cloud providers and will support TensorFlow, PyTorch and MXNet.

Trainium will be available through Amazon Elastic Compute Cloud (EC2) instances and inside Amazon SageMaker, AWS's machine learning platform.

The new instances built on these chips are slated to launch next year.
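For a concrete picture of how such instances would be used, here is a minimal sketch of submitting a training job with the SageMaker Python SDK. The Trainium instance type name, IAM role and S3 path below are placeholders of our own, since the actual instance names had not been announced at the time of writing.

```python
# Sketch: submitting a SageMaker training job with the SageMaker Python SDK.
# The instance type is hypothetical -- Trainium-backed instance names had not
# been published when the chip was announced.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",                                 # your training script
    role="arn:aws:iam::123456789012:role/SageMakerRole",    # placeholder IAM role
    instance_count=1,
    instance_type="ml.trn1.2xlarge",                        # hypothetical Trainium instance type
    framework_version="1.6.0",
    py_version="py3",
)

# Kick off training against data staged in S3 (placeholder bucket/prefix).
estimator.fit({"training": "s3://my-bucket/training-data/"})
```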

The main arguments for these custom chips are speed and cost. AWS promises 30% higher throughput and 45% lower cost-per-inference compared with standard AWS GPU instances.

In addition, AWS is partnering with Intel to launch EC2 instances based on Habana Gaudi chips for machine learning training. These instances, also expected next year, promise up to 40% better price/performance than the current generation of GPU-based EC2 instances used for machine learning. The chips will support both TensorFlow and PyTorch.

These chips are expected to arrive on AWS in the first half of 2021.

Today's announcements build on AWS Inferentia, which the company introduced at last year's re:Invent. Inferentia is essentially the inference-side counterpart to these training chips, and is likewise built on custom silicon.

Notably, Trainium will use the same software development kit as Inferentia, the AWS Neuron SDK.
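To give a sense of what that shared toolchain looks like, here is a minimal sketch of compiling a PyTorch model for Inferentia with the Neuron SDK's tracing API. This illustrates only the existing Inferentia inference path; the Trainium-specific training workflow had not been published at the time of writing.

```python
# Sketch: compiling a PyTorch model for Inferentia with the AWS Neuron SDK.
# Shows the existing Inferentia path only; Trainium-specific training APIs
# had not been published when this article ran.
import torch
import torch_neuron  # Neuron SDK plugin; registers the torch.neuron namespace
from torchvision import models

model = models.resnet50(pretrained=True).eval()
example = torch.rand(1, 3, 224, 224)

# Trace/compile the model for the Inferentia NeuronCore runtime.
model_neuron = torch.neuron.trace(model, example_inputs=[example])

# The compiled artifact can be saved and later loaded with torch.jit.load.
model_neuron.save("resnet50_neuron.pt")
```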

“Inferentia tackled the expense associated with inference, which can account for as much as 90% of machine learning infrastructure costs, but many development groups also face constraints due to limited machine learning training budgets,” explains the AWS team. “This restricts the extent and frequency of training required to refine their models and applications. AWS Trainium resolves this issue by offering the highest performance and lowest cost for machine learning training within the cloud. With both Trainium and Inferentia, customers will have a complete workflow for machine learning computation, from scaling training workloads to deploying accelerated inference.”

Tags: AWS, Trainium, machine learning, ML, AI, chip

Frederic Lardinois

Frederic wrote for TechCrunch from 2012 to 2025. Before that, he founded SiliconFilter and wrote for ReadWriteWeb (now ReadWrite). His reporting covers enterprise technology, cloud computing, developer tools, Google, Microsoft, consumer gadgets, transportation and anything else that catches his attention.