Designing Energy-Efficient AI Accelerators for Data Centers and the Intelligent Edge

William Ruby

Jul 26, 2023 / 4 min read

Table of Contents
Optimizing Power for Billion-Plus Gate Designs
Exceeding Energy-Efficiency and Performance Targets
Designing Differentiated Silicon for Edge AI Inference

Artificial intelligence (AI) accelerators are deployed in data centers and at the edge to overcome conventional von Neumann bottlenecks by rapidly processing petabytes of information. Even as Moore’s law slows, AI accelerators continue to efficiently enable key applications that many of us increasingly rely on, from ChatGPT and advanced driver assistance systems (ADAS) to smart edge devices such as cameras and sensors.

Although AI accelerators are typically 100x to 1,000x more efficient than general-purpose systems, the computational resources needed to generate best-in-class AI models double every 3.4 months. Moreover, training a single deep-learning model such as ChatGPT’s GPT-3 creates approximately 500 metric tons of CO₂, the equivalent of over a million miles driven by an average gasoline-powered vehicle! To help reduce global carbon emissions, the U.S. Department of Energy (DoE) recently recommended a 1,000x improvement in semiconductor energy efficiency.
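To put the 3.4-month doubling period in perspective, the short sketch below converts it into a growth factor over a longer span. It assumes smooth exponential growth, which is a simplification; the function name and the chosen time spans are illustrative and not taken from the article.

```python
# Growth implied by a fixed doubling period, assuming smooth exponential growth.
DOUBLING_PERIOD_MONTHS = 3.4  # doubling period quoted in the article


def growth_factor(months: float, doubling_period: float = DOUBLING_PERIOD_MONTHS) -> float:
    """Return the multiplicative growth in compute demand over `months`."""
    return 2.0 ** (months / doubling_period)


if __name__ == "__main__":
    for span in (3.4, 12, 24):
        print(f"{span:>5} months: {growth_factor(span):.1f}x")
```

Under this assumption, demand grows by roughly an order of magnitude per year, which is why efficiency gains from hardware alone struggle to keep pace.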

Achieving optimal performance-per-watt—whether for AI training in the data center or inference at the edge—is understandably a top priority for the semiconductor industry. In addition to minimizing environmental impact, reducing energy consumption lowers operating costs, maximizes performance within limited power budgets, and helps mitigate thermal challenges. Read on to learn how chip designers—including edge AI chip developer SiMa.ai—are leveraging end-to-end power analysis solutions to build a new generation of more energy-efficient AI accelerators.

Optimizing Power for Billion-Plus Gate Designs

An end-to-end approach to energy efficiency for AI accelerators must start at the architectural and micro-architectural levels during the earliest stages of the design flow and conclude at signoff. That’s why AI chip designers rely on architectural exploration platforms to map and evaluate power, performance, and area (PPA) tradeoffs for specific training or inference applications while proactively identifying critical vectors for downstream analysis.
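One common way to reason about PPA tradeoffs during exploration is to keep only the candidates that no other candidate beats on every metric (a Pareto front). The sketch below illustrates that idea; the `DesignPoint` fields and all numbers are hypothetical, and it does not represent how any particular architectural exploration platform works.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DesignPoint:
    """A hypothetical architecture candidate; lower is better for every metric."""
    name: str
    power_w: float
    latency_ms: float
    area_mm2: float


def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """True if `a` is no worse than `b` on all metrics and strictly better on one."""
    a_m = (a.power_w, a.latency_ms, a.area_mm2)
    b_m = (b.power_w, b.latency_ms, b.area_mm2)
    return all(x <= y for x, y in zip(a_m, b_m)) and any(x < y for x, y in zip(a_m, b_m))


def pareto_front(points: List[DesignPoint]) -> List[DesignPoint]:
    """Keep the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up candidates for illustration only.
candidates = [
    DesignPoint("A", power_w=5.0, latency_ms=2.0, area_mm2=40.0),
    DesignPoint("B", power_w=4.0, latency_ms=3.0, area_mm2=35.0),
    DesignPoint("C", power_w=6.0, latency_ms=2.5, area_mm2=45.0),  # worse than A everywhere
    DesignPoint("D", power_w=3.0, latency_ms=5.0, area_mm2=30.0),
]

if __name__ == "__main__":
    print([p.name for p in pareto_front(candidates)])
```

The surviving candidates are the ones worth carrying into more detailed downstream analysis; choosing among them depends on the target workload.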

As AI hardware typically consists of large arrays with thousands of tiles (processing elements), billion-plus-gate designs require multi-domain hardware and software power verification to minimize energy consumption and leakage. However, analyzing crucial power blocks and time windows requires advanced emulation systems to run billions of cycles and rapidly deliver multiple—and accurate—iterations. Only after completing this step can register transfer level (RTL) power analysis and physical implementation tools effectively optimize dynamic (gate switching) and static (leakage) power dissipation.
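The two components named above can be approximated with the textbook first-order CMOS relations: dynamic power scales with switching activity, switched capacitance, the square of supply voltage, and clock frequency, while static power is supply voltage times leakage current. The sketch below uses those relations; it ignores short-circuit and glitch power, and every numeric value is an assumption chosen for illustration.

```python
def dynamic_power_w(activity: float, cap_f: float, vdd_v: float, freq_hz: float) -> float:
    """First-order switching power: alpha * C * Vdd^2 * f."""
    return activity * cap_f * vdd_v ** 2 * freq_hz


def static_power_w(vdd_v: float, leakage_a: float) -> float:
    """First-order leakage power: Vdd * I_leak."""
    return vdd_v * leakage_a


if __name__ == "__main__":
    # Hypothetical block: 10% activity, 1 nF switched capacitance, 0.8 V, 1 GHz, 10 mA leakage.
    p_dyn = dynamic_power_w(0.1, 1e-9, 0.8, 1e9)
    p_stat = static_power_w(0.8, 10e-3)
    print(f"dynamic: {p_dyn * 1e3:.1f} mW, static: {p_stat * 1e3:.1f} mW")
```

Because supply voltage enters the dynamic term quadratically, halving it cuts switching power to a quarter in this model, which is one reason voltage scaling is such a common lever.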

To consistently deliver accurate results, RTL power analysis tools for AI chip design should include the following capabilities:

Timing-driven fast synthesis: Internal power calculation errors are often caused by fanout-based fast synthesis tools that fail to properly size cells following timing constraints. Like their downstream place-and-route counterparts, fast synthesis embedded in RTL power analysis tools must be timing driven.
Physically aware fast synthesis: RTL power analysis tools should be “physically aware” and capable of obtaining precise net capacitance values by executing first-pass placement of the cells in the design, as well as global routing. Unlike a fanout-based approach, physically aware capacitance estimation results in a unique and accurate value for each net.
Signoff-quality power computation engine: Traditional RTL power analysis tools that use word-level logic inferencing for fast synthesis can only apply heuristic, and therefore inaccurate, methods for glitch power computation. To accurately calculate glitch power (which can consume up to 40% of a chip’s total power) and reduce it in highly replicated tiles, RTL power analysis tools must have a signoff-quality power analysis engine, a netlist-level design representation, and an integrated timing engine.
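The difference between fanout-based and physically aware capacitance estimation can be illustrated with a toy comparison. The sketch below is a deliberate simplification: it uses half-perimeter wirelength (HPWL) of the pin bounding box as a stand-in for placement and global routing, and the coefficients (`cap_per_fanout_ff`, `cap_per_um_ff`, `pin_cap_ff`) and pin coordinates are hypothetical values, not data from any tool or library.

```python
from typing import List, Tuple

Pin = Tuple[float, float]  # (x, y) location in micrometers


def fanout_cap_ff(fanout: int, cap_per_fanout_ff: float = 1.5) -> float:
    """Fanout-based estimate: every net with the same fanout gets the same value."""
    return fanout * cap_per_fanout_ff


def hpwl_um(pins: List[Pin]) -> float:
    """Half-perimeter wirelength of the bounding box around all pins."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def placed_cap_ff(pins: List[Pin], cap_per_um_ff: float = 0.2, pin_cap_ff: float = 1.0) -> float:
    """Placement-based estimate: wire capacitance from HPWL plus sink pin capacitance.

    pins[0] is treated as the driver; the remaining pins are sinks.
    """
    return hpwl_um(pins) * cap_per_um_ff + (len(pins) - 1) * pin_cap_ff


# Two nets with identical fanout (2) but very different physical spans.
short_net = [(0.0, 0.0), (2.0, 1.0), (1.0, 3.0)]
long_net = [(0.0, 0.0), (50.0, 10.0), (20.0, 40.0)]

if __name__ == "__main__":
    for name, net in (("short", short_net), ("long", long_net)):
        print(name, fanout_cap_ff(len(net) - 1), placed_cap_ff(net))
```

Both nets receive the same fanout-based value, while the placement-based estimate separates them. Since switching power is proportional to capacitance, that gap carries straight through to the power number.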

After completing RTL power analysis and reduction, physical implementation (synthesis and place and route) tools can be used to further optimize PPA. To ensure reliability, scalability, and a frictionless user experience, these implementation tools should include a single, integrated data model architecture, interleaved engines, and a unified shell. Just as importantly, implementation tools should be capable of accurately modeling advanced node effects and glitch power to accelerate engineering change orders (ECOs) and final design closure.

Exceeding Energy-Efficiency and Performance Targets

Synopsys offers a comprehensive end-to-end power solution to help AI chip designers cost-effectively meet or exceed ambitious performance and energy-efficiency goals while accelerating time to market. Used at the very beginning of the design flow, Synopsys Platform Architect™ provides AI chip designers with SystemC™ transaction-level modeling (TLM) tools and efficient methods to rapidly model, analyze, and optimize complex silicon architectures. Synopsys ZeBu® Empower, a fast power profiler, is used for the next stage of the AI chip design process: analyzing and debugging energy consumption—based on hundreds of millions of cycles—for real software workloads.

Leading semiconductor companies have significantly reduced power draw with Synopsys ZeBu Empower, including SiMa.ai, a Silicon Valley-based AI chip startup that designs high-performance, low-energy AI chips for the intelligent edge. Specifically, the company realized a 2.5x improvement in frames per second (FPS) per watt for its SiMa.ai™ Low Power MLSoC™. During a presentation at the SNUG Silicon Valley 2023 conference this spring, Sounil Biswas, director of silicon engineering at SiMa.ai, noted that subsequent silicon validation demonstrated excellent correlation between Synopsys ZeBu Empower data and board measurements.
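For readers less familiar with the metric, FPS per watt is simply throughput divided by power, so it improves when throughput rises, power falls, or both. The sketch below shows the arithmetic; the baseline and improved figures are invented for illustration and are not SiMa.ai measurements.

```python
def fps_per_watt(fps: float, power_w: float) -> float:
    """Throughput efficiency: frames per second delivered per watt consumed."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return fps / power_w


def improvement(baseline: float, improved: float) -> float:
    """Ratio of the improved efficiency to the baseline efficiency."""
    return improved / baseline


if __name__ == "__main__":
    # Hypothetical numbers: slightly higher throughput at half the power.
    before = fps_per_watt(fps=100.0, power_w=10.0)
    after = fps_per_watt(fps=125.0, power_w=5.0)
    print(f"{improvement(before, after):.1f}x FPS per watt")
```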

To complement ZeBu Empower and enable RTL design for low power, we offer Synopsys PrimePower RTL, an RTL power analysis and reduction tool that consistently achieves accurate results (within ±15% of post-route implementation) by pairing timing-driven, physically aware synthesis capabilities with an integrated computation engine. Synopsys PrimePower RTL also provides step-by-step guidance to help AI chip designers further minimize glitching and reduce overall power consumption.
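A tolerance such as 15% of post-route results is straightforward to check once both numbers are available. The sketch below shows one way a team might script that comparison; the function names and the sample power values are hypothetical and are not part of any Synopsys tool interface.

```python
def relative_error(estimate_w: float, reference_w: float) -> float:
    """Signed error of an early estimate relative to the later reference value."""
    if reference_w == 0:
        raise ValueError("reference_w must be non-zero")
    return (estimate_w - reference_w) / reference_w


def within_tolerance(estimate_w: float, reference_w: float, tol: float = 0.15) -> bool:
    """True if the estimate falls within +/- `tol` of the reference."""
    return abs(relative_error(estimate_w, reference_w)) <= tol


if __name__ == "__main__":
    # Hypothetical per-block power numbers in watts: (RTL estimate, post-route value).
    blocks = {"tile_array": (1.10, 1.00), "noc": (0.40, 0.50)}
    for name, (rtl, post_route) in blocks.items():
        err = relative_error(rtl, post_route)
        print(f"{name}: {err:+.0%} -> {'ok' if within_tolerance(rtl, post_route) else 'review'}")
```

Tracking the signed error per block, rather than a single chip-level number, makes it easier to see where early estimates drift from implementation.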

Additional PPA optimization is achieved with Synopsys Fusion Compiler™, a comprehensive and integrated RTL-to-GDSII implementation system. After passing this milestone, the AI chip design is analyzed with Synopsys PrimePower, the golden power signoff solution. Certified by leading foundries worldwide down to 3nm processes, Synopsys PrimePower delivers fast runtime performance with distributed processing, achieving high accuracy at signoff within a few percent of SPICE and silicon measurements.

Designing Differentiated Silicon for Edge AI Inference

AI accelerators enable many popular applications to quickly analyze massive amounts of information and accurately infer results in milliseconds. At the same time, achieving optimal performance-per-watt remains a top priority for chip designers. This is especially true at the edge, where performance is often limited by minimal power envelopes and smaller die sizes.

However, these constraints create new opportunities for semiconductor companies to design differentiated silicon by precisely calibrating PPA to match the specific requirements of low-latency, high-bandwidth applications. For example, autonomous navigation demands a computational response latency limit of 20μs, while voice and video assistants must understand spoken keywords in less than 10μs and hand gestures in a few hundred milliseconds. To successfully implement PPA tradeoffs, chip designers should adopt a holistic approach to power optimization by leveraging an end-to-end solution that spans early architectural exploration to golden power signoff.
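Calibrating PPA against an application's requirements can be thought of as a constrained selection: discard candidates that miss the latency or power limits, then pick the most efficient of what remains. The sketch below illustrates that reasoning; the `Candidate` fields, the limits, and all numbers are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    """A hypothetical edge-inference configuration."""
    name: str
    latency_us: float
    power_w: float
    inferences_per_s: float


def pick_most_efficient(
    candidates: List[Candidate], max_latency_us: float, max_power_w: float
) -> Optional[Candidate]:
    """Return the feasible candidate with the best inferences per second per watt."""
    feasible = [
        c for c in candidates
        if c.latency_us <= max_latency_us and c.power_w <= max_power_w
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.inferences_per_s / c.power_w)


# Made-up configurations for illustration only.
options = [
    Candidate("small", latency_us=30.0, power_w=1.0, inferences_per_s=2000.0),
    Candidate("medium", latency_us=18.0, power_w=2.0, inferences_per_s=5000.0),
    Candidate("large", latency_us=10.0, power_w=6.0, inferences_per_s=12000.0),
]

if __name__ == "__main__":
    choice = pick_most_efficient(options, max_latency_us=20.0, max_power_w=8.0)
    print(choice.name if choice else "no feasible configuration")
```

Tightening either limit changes the answer, or leaves no feasible option at all, which signals that the architecture itself needs to be revisited.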

You can learn more about Synopsys energy-efficient SoC solutions on the Synopsys website.
