AI models are doubling in complexity every 4 to 6 months, outpacing Moore’s Law by a factor of four, and data center infrastructure must evolve just as quickly. Current hyperscale data center infrastructure struggles to deliver the speed and low latency needed to process and store trillion-parameter models. New infrastructure requires more storage capacity, enhanced computing resources, and faster interconnects. This is where PCIe 7.0, the latest iteration of the PCI Express standard (currently at version 0.5 of the specification), comes into play. PCIe 7.0 offers up to 512 GB/s of bandwidth and ultra-low latency, enabling interconnects to handle the massive parallel computing demands of AI workloads and helping to mitigate data bottlenecks.
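The 512 GB/s figure can be reproduced with simple arithmetic. The sketch below assumes the raw PCIe 7.0 per-lane rate of 128 GT/s and a x16 link, and it ignores FLIT, FEC, and protocol overhead, so it gives an upper bound rather than delivered throughput. The function name is illustrative only.

```python
# Raw (pre-overhead) PCIe link bandwidth from per-lane rate and link width.
# Assumption: 128 GT/s per lane for PCIe 7.0, one bit per transfer per lane.
PCIE7_GT_PER_S = 128


def raw_bandwidth_gbytes(rate_gt_s: float, lanes: int) -> tuple[float, float]:
    """Return (per-direction, bidirectional) raw bandwidth in GB/s."""
    per_direction = rate_gt_s * lanes / 8  # convert bits to bytes
    return per_direction, 2 * per_direction


per_dir, bidir = raw_bandwidth_gbytes(PCIE7_GT_PER_S, 16)
print(f"x16 per direction: {per_dir:.0f} GB/s, bidirectional: {bidir:.0f} GB/s")
```

Under these assumptions a x16 link carries 256 GB/s in each direction, which is where the 512 GB/s bidirectional number comes from.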
Figure 1: AI clusters expanding over the years, with enhanced C2C connectivity providing the computing, storage, and bandwidth needed to process trillions of LLM parameters. Taken from:
Modern AI workloads require a specialized architecture that integrates multiple accelerators working in conjunction with a central processor. Some of the most advanced architectures require up to 1,024 accelerators within a single computing unit. To train these AI models effectively, the compute scale-up fabric therefore needs the fastest available interconnects, linking hundreds of accelerators through high-throughput I/O networks.
PCI-SIG announced PCIe 7.0 technology in 2022, with plans to release the full specification by 2025 (version 0.5 is currently available). This development aims to meet the substantial bandwidth demands of data-intensive applications and markets, including AI/ML, networking at 1.6T/800G Ethernet, HPC, and quantum computing in HPC data centers. PCIe 7.0 will provide a low-latency, low-power, and reliable link between accelerators, processors, NICs, and other components, ensuring efficient connectivity for high-performance computing environments.
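To see why 800G and 1.6T Ethernet are driving the move to PCIe 7.0, it helps to ask how wide a link must be to keep one such port fed in a single direction. The sketch below is a rough estimate under stated assumptions: it uses raw per-lane rates (32, 64, and 128 GT/s for PCIe 5.0, 6.0, and 7.0), considers only the common x1 to x16 widths, and ignores encoding and protocol overhead on both the PCIe and Ethernet sides. The helper name is hypothetical.

```python
# Raw per-lane rates in GT/s (equivalently Gb/s per lane), overheads ignored.
LANE_RATE_GT_S = {5: 32, 6: 64, 7: 128}
# Assumption: only the common link widths are considered.
LINK_WIDTHS = (1, 2, 4, 8, 16)


def min_link_width(gen: int, port_gbps: float):
    """Narrowest width whose raw per-direction rate covers port_gbps, else None."""
    for width in LINK_WIDTHS:
        if LANE_RATE_GT_S[gen] * width >= port_gbps:
            return width
    return None


for port_gbps in (800, 1600):
    for gen in sorted(LANE_RATE_GT_S):
        width = min_link_width(gen, port_gbps)
        label = f"x{width}" if width else "not reachable at x16"
        print(f"{port_gbps}G Ethernet on PCIe {gen}.0: {label}")
```

With these raw numbers, a 1.6T port fits only on a PCIe 7.0 x16 link, while an 800G port needs x16 at PCIe 6.0 or x8 at PCIe 7.0. Real designs would need extra headroom for overhead.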
Figure 2: PCIe 7.0 will enable all key interconnects in the AI/ML Scale Up fabric with the bandwidth and secure data transfers needed to meet AI’s demands
PCIe 7.0 represents a significant advancement in hardware infrastructure for AI and HPC, offering several key benefits that cater to the demands of relentless innovation and unprecedented data sets:
Card Electromechanical (CEM) connectors, developed by PCI-SIG and first introduced in 2000, are crucial for connecting motherboards with add-in cards (AICs) and riser cards. They support various modules like SSDs for storage, GPUs for graphics, NICs for network connectivity, and ML/DL or hybrid computing modules. For PCIe 7.0 CEM connectors, the focus is on mitigating reflection and crosstalk, ensuring low cable loss, clean conductor terminations, and minimizing skew and periodic resonance. PCIe 7.0 connectors and cables have strict signal integrity requirements, with new metrics like Return Loss excursion being discussed to improve signal quality and reliability at higher speeds.
Additionally, the formation of the PCIe Optical Workgroup by PCI-SIG indicates a move beyond the limitations of copper signaling, particularly with CopprLink External Cable, to embrace optical solutions. Optical cabling was recently introduced to PCI-SIG, generating excitement about extending the physical reach of compute networks. This technology offers advantages like lower latency and enhanced thermal management capabilities.
The dual focus on optical PCIe links includes adapting logical communication schemes at the protocol layer while introducing new form factors with better thermal management and optimized optical links at the physical layer. These advancements aim to meet the growing demands for speed, reliability, and efficiency in high-performance computing and networking. The transition to the PCIe standard at 128 Gbps marks a significant evolution in chip design, promising expanded capabilities, cache coherence, and new design challenges, including:
While the standard is still in flux, Synopsys recently announced the world’s first complete IP solution for PCIe 7.0, including Controller, IDE Security Module, PHY, and Verification IP. This solution paves the way for the ecosystem to embrace these lightning speeds.
At DesignCon 2024, Synopsys showcased wide-open 128 Gbps TX PAM4 eyes with excellent RLM. The TX-to-RX loopback ran at 128 Gbps over a long-reach channel, demonstrating the robustness of the IP with a pre-FEC BER multiple orders of magnitude better than the spec.
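RLM (ratio of level mismatch) measures how evenly spaced the four PAM4 voltage levels are, with 1.0 meaning perfectly even spacing. The sketch below assumes the definition used in the IEEE 802.3 PAM4 clauses and does not reproduce the PCIe measurement procedure. The voltage levels are made-up illustrative values, not data from the demo.

```python
def pam4_rlm(v0: float, v1: float, v2: float, v3: float) -> float:
    """Ratio of level mismatch for four PAM4 levels, ordered lowest to highest.

    Assumption: IEEE 802.3-style definition based on effective symbol levels.
    """
    v_mid = (v0 + v3) / 2
    es1 = (v1 - v_mid) / (v0 - v_mid)  # normalized inner level, lower half
    es2 = (v2 - v_mid) / (v3 - v_mid)  # normalized inner level, upper half
    return min(3 * es1, 3 * es2, 2 - 3 * es1, 2 - 3 * es2)


# Hypothetical levels in arbitrary units.
ideal = pam4_rlm(-3.0, -1.0, 1.0, 3.0)       # evenly spaced levels
compressed = pam4_rlm(-3.0, -0.9, 1.0, 3.0)  # one inner level pulled toward mid
print(f"ideal RLM = {ideal:.3f}, compressed RLM = {compressed:.3f}")
```

Evenly spaced levels give an RLM of 1.0, and any compression of an inner eye pulls the value below 1.0, which is why a high RLM goes hand in hand with wide-open PAM4 eyes.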
To continue highlighting this technology, we also showcased PCIe 7.0 at PCI-SIG DevCon 2024, including TX and RX performance in a loopback configuration and the industry’s first PCIe 7.0 interoperability demonstrations over electrical cable channels such as DAC and over backplane channels, as well as directly driving optical links and equalizing their impairments. Additionally, we showcased the world’s first PCIe 7.0 controller demo, with a successful root complex to endpoint connection showing FLIT transfers using EQ bypass mode.
PCIe 7.0 enables designers to address the escalating demands of AI and HPC environments, providing higher bandwidth, lower latency, improved energy efficiency, and compatibility with existing infrastructure. System designers need to achieve much-needed improvements in data throughput to advance the deployment of artificial intelligence inference engines and co-processor topologies in the data center. This requires new techniques in simulation as well as post-silicon validation. Innovative simulation, design, test, and measurement methodologies are required for this PAM-4 inflection point. Key areas include the correlation between simulation and validation, design practices for PCIe over optical and electrical cables, noise reduction to address signal integrity complications, and techniques to maintain signal integrity and minimize issues like reflection and crosstalk.
The move towards PCIe at 128 Gbps represents a paradigm shift in high-speed interconnect technology. It introduces new challenges and opportunities in IP design aimed at enhancing performance, efficiency, and reliability in modern computing and networking environments. Synopsys is at the forefront of this technology revolution with the industry's first complete pre-verified PCIe 7.0 IP solution. The standards-based solution, consisting of PHY, controller, IDE security module, and verification IP, provides secure data transfers of up to 512 GB/s bidirectional in a x16 configuration to mitigate data bottlenecks. With over two decades of PCI Express experience, Synopsys offers designers an early start on next-generation HPC and AI SoCs to accelerate the path to production.