Maxim Monadeev, VLSI architect at NeuReality, co-authored this article.
Say you’re a small company with a big idea, but you’re just getting started. Your first AI silicon design is in development, and you have a limited budget to get your offering off the ground. You also know that, given competitive pressures, the faster you get to market, the better. In this scenario, any tool or technology that accelerates AI chip design and verification can also lay a foundation for your company’s success.
For Caesarea, Israel-based NeuReality, which develops purpose-built AI inference platforms, that helping hand came by way of cloud-based chip design emulation, specifically Synopsys ZeBu® Cloud. Rather than go through the costly, time-consuming process of building its own emulation infrastructure, NeuReality took advantage of the flexibility, scalability, and elasticity of the cloud and achieved on-schedule tape-out and production of one of the world’s most complex chips. Read on to learn more about NeuReality’s quest to make AI adoption easy.
Everyone is talking about AI these days. Indeed, with the rise of ChatGPT and its generative AI cohorts, we are experiencing an infusion of intelligence in our everyday lives. According to MarketsandMarkets, the global AI market, valued at US$150.2 billion in 2023, is anticipated to grow at a compound annual growth rate (CAGR) of 36.8% from 2023 to 2030.
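To put that growth rate in perspective, here is a quick back-of-the-envelope projection of what it implies by 2030. This is a minimal illustrative sketch based only on the two figures cited above; the report's own 2030 estimate may differ slightly due to rounding.

```python
# Back-of-the-envelope projection implied by the figures cited above
# (illustrative only; not a MarketsandMarkets number).
base_2023 = 150.2   # global AI market in 2023, US$ billions
cagr = 0.368        # compound annual growth rate, 2023-2030
years = 2030 - 2023

projected_2030 = base_2023 * (1 + cagr) ** years
print(f"Implied 2030 market size: ~${projected_2030:,.0f}B")  # ~$1,347B
```

In other words, sustaining a 36.8% CAGR would grow the market roughly ninefold over those seven years.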
For NeuReality, democratizing AI adoption means making AI inference systems in the world’s data centers as efficient and affordable as possible. By adopting an AI-centric system architecture, governments and businesses of all sizes can achieve greater AI data processing speed, cost savings, and a lower AI carbon footprint. The company was founded in 2019 by a team of system engineers who wanted to make AI easy to deploy, manage, and scale – with the foresight that AI would soon explode and that the right kind of system and silicon would take at least three years to become reality. That moment is now. NeuReality launched its next-gen NR1™ AI Inference Solution, featuring the world’s first NAPU™ (network addressable processing unit) and complete NeuReality software and APIs, in late 2023.
“We provide a holistic silicon-to-software solution for AI inference,” said Yossi Kasus, co-founder and VP of VLSI at NeuReality. “Multiple types of compilers are typically required for the full flow of an efficient AI pipeline, along with a very complex set of AI accelerators on a device that handles voice, audio, images, and time series data. Our NR1 solution removes all the complexity of pre- and post-processing, so customers can get up and running very quickly. Our hardware runs multiple types of AI workloads and, as a network device, provides a simple way to scale.”
A key component of the NR1 NAPU is a multi-billion-gate 7nm chip, developed with the Synopsys Digital Design Family (which includes the Synopsys Fusion Compiler physical implementation solution), Synopsys Interface IP, and the Synopsys Verification Family. The two data center products – the NR1-S™ AI Inference Appliance and the NR1-M™ AI Inference Module – feature an AI-centric system architecture with complex IP, such as a hardware-based AI-Hypervisor™ and an AI-over-Fabric™ network engine, developed alongside many other programmable compute engines to process high-variety, high-volume AI requests and responses. A substantial amount of software and firmware runs on these compute engines. The NR1-M is a full-height PCI Express card containing a NAPU; it operates independently of the CPU while providing connectivity to deep learning accelerators (DLAs). Given these parameters, the team knew it would be very challenging to simulate the hardware together with the software and verify the full device.
Adopting a cloud-based solution for emulation made perfect sense for NeuReality. As Kasus explained, “When you build your own server infrastructure, you must find more budget for new servers whenever you need to scale. Scaling is easier on the cloud as your project needs fluctuate.”
After evaluating available solutions, NeuReality selected ZeBu Cloud, citing the effective field support it expected from Synopsys based on team members’ previous experiences. ZeBu Cloud provides flexible, secure, turnkey emulation to accelerate software bring-up, performance validation, power analysis, and system validation for IP and SoCs. ZeBu emulation capacity is hosted in the Synopsys data center. The NeuReality team started on ZeBu Server 4 before transitioning to ZeBu Server 5, the industry’s highest-capacity emulator, supporting designs of up to 30 billion gates.
“When we wanted to shift to a stronger emulation machine, the flexibility of ZeBu Cloud helped us to easily do it and benefit from the speed and the number of gates supported,” said Kasus. “And the Synopsys support team can log in to easily see what is happening and work in parallel with us when we need the assistance.”
The ramp-up was quick, taking only a couple of weeks. As they developed their 7nm chip, the team quickly and easily expanded their emulation capacity several times. Without a cloud-based solution, it would have been difficult to meet an aggressive time-to-market target. “Scaling to more emulation servers on the cloud takes a couple of weeks, but had we needed to do this with on-prem resources, it would take months to get new machines in,” said Kasus. “Your needs change as you progress with a project, especially for a new device. You don’t exactly know how much capacity you’ll require, so having the flexibility as you progress is immensely helpful.”
Because ZeBu Cloud offers a comprehensive suite of virtual transactors, the team could also model all the high-speed interfaces needed for its NR1 AI Inference Solution without any extra hardware. This allowed the team to build a complete system-level test environment for the NR1 NAPU on the ZeBu system.
With tape-out and production complete, NeuReality’s 7nm chip is now in the bring-up phase, supporting applications including natural language processing, computer vision, speech recognition, and recommendation systems. Compared to the traditional CPU-centric architecture for AI inference, the NR1 NAPU has demonstrated the ability to deliver 10x the AI inference performance at the same cost – or 90% cost savings for the same performance. The NR1 delivers affordable AI, making it accessible to all. The team is exploring process nodes and architectures for its next device, which will target generative AI and large language model (LLM) acceleration, and plans to continue collaborating with Synopsys.
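Those two framings of the result are the same arithmetic viewed from different sides: if a system handles 10x the inference work for the same spend, then delivering the original workload costs one-tenth as much, which is a 90% saving. The short sketch below only illustrates that equivalence; the dollar and throughput values are made up for the example and are not NeuReality benchmark data.

```python
# Illustrative arithmetic behind "10x performance at the same cost, or 90%
# cost savings for the same performance." Values are hypothetical; only the
# ratios matter.
baseline_cost = 100_000.0        # hypothetical spend on a CPU-centric setup
baseline_throughput = 1_000.0    # hypothetical inferences/sec at that spend

napu_throughput = 10 * baseline_throughput   # 10x performance, same cost

# Cost to match the baseline throughput with the 10x-faster system:
cost_for_same_work = baseline_cost * (baseline_throughput / napu_throughput)
savings = 1 - cost_for_same_work / baseline_cost
print(f"Cost savings at equal performance: {savings:.0%}")  # 90%
```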
“We look at Synopsys as a strategic partner,” said Kasus. “As a startup, needs change as the company grows, so it’s important to have a partner that is flexible with your needs and helps you grow.”