Enhancing Computer Vision with Deep-Learning Models

Gordon Cooper

Feb 28, 2023 / 4 min read

Table of Contents

Attention-Based Networks Deliver Benefit of Contextual Awareness
Optimizing Performance of Transformers and CNNs with NPU IP
Conclusion

Is that a dog in the middle of the street? Or an empty box? If you’re riding in a self-driving car, you’ll want the object detection and collision avoidance systems to correctly identify what might be on the road ahead and direct the vehicle accordingly. Inside modern vehicles, deep learning models play an integral role in the cars’ computer vision processing applications.

With cameras becoming pervasive in so many systems, cars aren’t the only products taking advantage of AI-driven computer vision technology. Mobile phones, security systems, and camera-based digital personal assistants are just a few examples of devices that already use neural networks to enhance image quality and accuracy.

While the computer vision application landscape has traditionally been dominated by convolutional neural networks (CNNs), a new algorithm type, initially developed for natural language processing tasks such as translation and question answering, is starting to make inroads: transformers. Transformers are deep-learning models that process all input data simultaneously. They likely won’t completely replace CNNs but will be used alongside them to enhance the accuracy of vision processing applications.

Transformers have been in the news lately thanks to ChatGPT, a transformer-based chatbot launched in November 2022 by OpenAI. While ChatGPT is a server-based transformer requiring 175 billion parameters, this blog post explains why transformers are also ideal for embedded computer vision. Read on for insights into how transformers are changing the direction of deep-learning architectures and for techniques to optimize the implementation of these models.

Attention-Based Networks Deliver Benefit of Contextual Awareness

For more than 10 years now, CNNs have been the go-to deep-learning model for vision processing. As they’ve evolved, CNNs have been applied with high accuracy to image classification, object detection, semantic segmentation (grouping or labeling every pixel in an image), and panoptic segmentation (identifying object locations as well as grouping or labeling every pixel in every object). However, with no modification other than replacing language tokens with image patches, the transformer has shown it can beat CNNs in accuracy.

It was 2017 when the Google Research team shared an article introducing the transformer, defining it as “a novel neural network architecture based on a self-attention mechanism that we believe to be particularly well suited for language understanding.” Fast-forward to 2020, when Google Research scientists published an article on the vision transformer (ViT), a model based on the original transformer architecture. According to the article, the ViT “demonstrates excellent performance when trained on sufficient data, outperforming a comparable state-of-the-art CNN with four times fewer computational resources.” Indeed, these transformers, which need to be trained with very large data sets, showed how adept they are at vision tasks such as image classification and object detection.
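To make the idea of feeding image patches to a transformer more concrete, here is a minimal sketch of how an image might be cut into a sequence of flattened patches. The function name `image_to_patches`, the toy 4x4 image, and the patch size are all illustrative assumptions, not taken from the ViT article. A real vision transformer would also apply a learned linear projection and add position information to each patch, and both steps are omitted here.

```python
def image_to_patches(image, patch_size):
    """Split a 2-D grayscale image (list of rows) into flattened square patches.

    Patches are returned in row-major order, each as a flat list of pixel
    values, so the image becomes a sequence a transformer could consume.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# A toy 4x4 "image" whose pixel values are simply 0..15 (hypothetical data).
toy_image = [[row * 4 + col for col in range(4)] for row in range(4)]
patch_sequence = image_to_patches(toy_image, patch_size=2)
print(len(patch_sequence))  # 4
print(patch_sequence[0])    # [0, 1, 4, 5]
```

Each entry in `patch_sequence` plays the role that a word token plays in a language model.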

Since transformers can understand context, they’re good at learning complex patterns for accurate object detection.

A key aspect of transformers that aids their proficiency in vision applications is their attention mechanism, which enables the models to understand context. Like a CNN, a transformer can detect that the object on the road ahead is an injured dog, not a cardboard box. It does so by devoting more focus to the small but important parts of the data (the item in the road) and less to the pixels that represent the rest of the road. In other words, not all pixels are treated equally, which makes transformers better than CNNs at learning complex patterns. (CNNs typically address a frame of data without knowing what came before or after.) As research and development continue, transformer model sizes have become similar to CNN model sizes.
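The notion that not all pixels are treated equally can be illustrated with a small sketch of scaled dot-product attention for a single query. This is the standard textbook formulation, not code from any particular product. The helper names `softmax` and `attend`, and the two-dimensional "road" and "object" feature vectors, are hypothetical values chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the attention weights and the weighted sum of the value vectors.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Hypothetical features for three image patches: two "road" patches and one
# "object" patch. The query is chosen to resemble the object patch.
keys = [[1.0, 0.0], [1.0, 0.1], [0.0, 3.0]]
values = [[0.0], [0.0], [1.0]]
query = [0.0, 3.0]

weights, output = attend(query, keys, values)
print([round(w, 3) for w in weights])
```

With these made-up numbers, nearly all of the weight lands on the third patch, so the output is dominated by the object rather than by the surrounding road.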

While frames-per-second performance depends on the hardware on which the models run, CNNs tend to run faster than transformers, which require many more computations. However, transformers are poised to catch up. GPUs can support both, but for real-world applications that need the highest performance in the smallest area with the least power, dedicated AI accelerators such as neural processing units (NPUs) are a better option.
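One rough way to see where the extra computation comes from is to count multiply-accumulate operations (MACs) per layer. The sketch below uses common back-of-the-envelope formulas and ignores softmax, normalization, and memory traffic, so treat it as an estimate rather than a benchmark. The function names and the layer sizes are assumptions chosen only to show how the counts scale.

```python
def conv_layer_macs(height, width, in_channels, out_channels, kernel_size):
    """MACs for one stride-1, same-padded 2-D convolution layer."""
    return height * width * in_channels * out_channels * kernel_size ** 2


def self_attention_macs(num_tokens, dim):
    """MACs for one self-attention layer (projections plus attention itself)."""
    projections = 4 * num_tokens * dim * dim  # Q, K, V and output projections
    scores = num_tokens * num_tokens * dim    # Q times K-transpose
    mixing = num_tokens * num_tokens * dim    # attention weights times V
    return projections + scores + mixing


# Hypothetical sizes: each step quadruples the number of tokens, as happens
# when the side length of the patch grid doubles.
for tokens in (196, 784, 3136):
    print(tokens, self_attention_macs(tokens, dim=256))

# A hypothetical convolution layer for comparison.
print(conv_layer_macs(56, 56, 64, 64, 3))
```

The convolution count grows linearly with the number of pixels, while the two attention terms grow with the square of the number of tokens, which is one reason higher-resolution inputs are costly for transformers.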

For greater inference efficiency, a vision processing application can utilize both CNNs and transformers. Full visual perception calls for knowledge that may not be easily acquired by a vision-only model. Multi-modal learning provides a deeper understanding of visual information. Also, attention-based networks like transformers are well suited to applications that integrate multiple sensors, such as automotive.

Optimizing Performance of Transformers and CNNs with NPU IP

Transformers consist of a handful of operations:

Matrix multiplication
Element-wise addition
Softmax mathematical function
L2 normalization
Activation functions
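As a rough illustration of how these five operations fit together, the sketch below composes them into one simplified attention and feed-forward step in plain Python. It is a toy under stated assumptions, not the implementation used by any particular model or accelerator. The function names, the use of ReLU as the activation, the placement of the normalization, and the identity weight matrices are all choices made here for brevity.

```python
import math


def matmul(a, b):
    """Matrix multiplication of two row-major nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise addition of two same-shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def softmax_rows(m):
    """Softmax applied independently to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def l2_normalize_rows(m, eps=1e-12):
    """Scale each row to (approximately) unit Euclidean length."""
    out = []
    for row in m:
        norm = math.sqrt(sum(v * v for v in row)) + eps
        out.append([v / norm for v in row])
    return out


def relu(m):
    """A simple activation function (ReLU, chosen here for brevity)."""
    return [[max(0.0, v) for v in row] for row in m]


def toy_block(x, w_q, w_k, w_v, w_ff):
    """One simplified attention + feed-forward step built from the five ops."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    k_t = [list(col) for col in zip(*k)]
    scale = math.sqrt(len(q[0]))
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    attended = matmul(softmax_rows(scores), v)
    x = l2_normalize_rows(add(x, attended))  # residual add, then normalize
    return l2_normalize_rows(add(x, relu(matmul(x, w_ff))))


# Hypothetical inputs: three 2-D tokens and identity weights.
identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = toy_block(tokens, identity, identity, identity, identity)
for row in result:
    print([round(v, 3) for v in row])
```

Most of the arithmetic sits in the `matmul` calls, while the softmax, normalization, and activation steps are the non-convolution operations an accelerator also has to handle efficiently.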

While most current AI accelerators are optimized for CNNs, not all of them are ideal for transformers. Transformers require computational power to perform voluminous calculations and to support their attention mechanism. The Synopsys ARC® NPX6 NPU IP is an example of an AI accelerator that can handle CNNs and transformers. The ARC NPX6 NPU IP’s computation units include a convolution accelerator for matrix-matrix multiplications that are essential to both deep learning models, as well as a tensor accelerator for transformer operations and activation functions. The IP delivers up to 3,500 TOPS performance and industry-leading power efficiency of up to 30 TOPS/Watt. Design teams can also accelerate their application software development with the Synopsys MetaWare MX Development Toolkit. The toolkit provides a comprehensive software programming environment that includes a neural network software development kit and support for virtual models.

Conclusion

Natural language processing applications have enjoyed the computational prowess of transformers for several years. Now, real-time vision processing applications are getting in on the action, taking advantage of the attention-based network’s capacity for providing contextual awareness for greater accuracy. From smartphones to security systems and cars, camera-based products are growing increasingly adept at delivering high-quality images. Adding transformers to the deep-learning infrastructure of embedded vision camera systems will only give rise to even sharper images and more accurate object detection.
