Samsung is set to demo its next-gen GDDR7 memory at ISSCC 2024 (the IEEE International Solid-State Circuits Conference). It is based on PAM3 signaling, a step up from the NRZ and PAM4 signaling used by GDDR6 and GDDR6X memory, respectively. We’re likely looking at the memory standard set to power the GeForce RTX 5090 later this year. At 23Gbps, the GeForce RTX 4080 Super currently sports the fastest graphics memory on the market. Its GDDR6X memory from Micron leverages PAM4 modulation, which transmits twice as much data per cycle as NRZ.
While NRZ has just two data states (0 and 1), PAM4 has four (00, 01, 10, and 11) and PAM3 has three (-1, 0, and +1). Samsung’s GDDR7 memory leverages “PAM3-Optimized TRX Equalization and ZQ Calibration.” It transfers 1.5 bits per cycle (3 bits across every two symbols), compared to 2 bits on PAM4 and 1 bit on PAM2, also known as NRZ. PAM3 therefore has a lower bandwidth than PAM4 at the same frequency but a higher one than PAM2/NRZ. Its primary advantage over PAM4 is a better signal-to-noise ratio (SNR): with three voltage levels instead of four, the levels sit further apart, leading to a lower bit error rate (BER). Lower power consumption is an added benefit.
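The 1.5 bits per cycle figure follows from grouping symbols: two PAM3 symbols offer 3 × 3 = 9 combinations, enough to carry the 8 values of 3 bits. Here is a minimal Python sketch of that idea. The mapping table is a hypothetical one chosen for illustration, not the encoding GDDR7 actually uses, and the function names are made up for this example.

```python
from itertools import product
from math import log2

# The three PAM3 signal levels named in the text.
PAM3_LEVELS = (-1, 0, 1)

# Two PAM3 symbols give 3 * 3 = 9 combinations, enough for the 8 values of 3 bits.
SYMBOL_PAIRS = list(product(PAM3_LEVELS, repeat=2))

# Hypothetical mapping (an assumption for illustration): the first 8 pairs in
# enumeration order carry data and the ninth pair is left unused.
ENCODE = {value: SYMBOL_PAIRS[value] for value in range(8)}
DECODE = {pair: value for value, pair in ENCODE.items()}

# Theoretical ceiling for a three-level symbol, about 1.585 bits.
THEORETICAL_BITS_PER_SYMBOL = log2(len(PAM3_LEVELS))


def bits_per_symbol(bits: int, symbols: int) -> float:
    """Average number of data bits carried by one symbol."""
    return bits / symbols


def encode_bits(bits: str) -> list[int]:
    """Encode a bit string (length a multiple of 3) into PAM3 levels."""
    if len(bits) % 3 != 0:
        raise ValueError("bit string length must be a multiple of 3")
    levels: list[int] = []
    for i in range(0, len(bits), 3):
        levels.extend(ENCODE[int(bits[i:i + 3], 2)])
    return levels


def decode_levels(levels: list[int]) -> str:
    """Decode PAM3 levels (an even count) back into a bit string."""
    if len(levels) % 2 != 0:
        raise ValueError("number of levels must be even")
    chunks = []
    for i in range(0, len(levels), 2):
        chunks.append(format(DECODE[(levels[i], levels[i + 1])], "03b"))
    return "".join(chunks)


if __name__ == "__main__":
    print(bits_per_symbol(3, 2))   # PAM3 as described: 3 bits over 2 symbols
    print(encode_bits("000111"))   # six bits become four PAM3 symbols
```

Six data bits come out as four symbols, which is where the 1.5 bits per symbol comes from. NRZ would need six symbols for the same bits, and PAM4 three.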
In addition to transferring more bits per cycle than NRZ, GDDR7 can issue two independent commands in parallel, improving data throughput. This works at the bank level: for example, while Bank A is being refreshed, Bank B can be read at the same time. Power efficiency can be further improved by switching to NRZ mode when idle.
Chip type | Memory clock | Transfers/s |
---|---|---|
GDDR3 | 625 MHz | 2.5 GT/s |
GDDR4 | 275 MHz | 2.2 GT/s |
GDDR5 | 625–1125 MHz | 5–9 GT/s |
GDDR5X | 625–875 MHz | 10–12 GT/s |
GDDR6 | 875–1125 MHz | 14–20 GT/s |
GDDR6X | 594–656 MHz | 19–23 GT/s |
GDDR7 | ? | up to 37 GT/s |
Samsung’s talk mentions 16Gb 37Gbps GDDR7 memory, 61% faster than the 23Gbps GDDR6X memory on the GeForce RTX 4080 Super. A generational uplift of this magnitude for the RTX 5090 is unlikely. I’d predict a more conservative 32Gbps or perhaps an even lower 24Gbps for the first wave of Blackwell GPUs coming later this year.
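The uplift figure is a simple ratio of per-pin data rates, and total bandwidth scales with the memory bus width. The short Python sketch below reproduces the 61% figure and compares the more conservative speeds. The helper names are made up for this example, and the 256-bit bus width is an assumption used purely for illustration, not a leaked or confirmed spec.

```python
def uplift_percent(new_gbps: float, old_gbps: float) -> float:
    """Percentage increase of one per-pin data rate over another."""
    return (new_gbps / old_gbps - 1) * 100


def bandwidth_gb_per_s(pin_rate_gbps: float, bus_width_bits: int) -> float:
    """Total memory bandwidth in GB/s: per-pin rate times bus width, 8 bits per byte."""
    return pin_rate_gbps * bus_width_bits / 8


if __name__ == "__main__":
    GDDR6X_GBPS = 23  # fastest GDDR6X speed cited in the article

    # Speeds discussed in the article: Samsung's demo part and two guesses.
    for gddr7_gbps in (37, 32, 24):
        gain = uplift_percent(gddr7_gbps, GDDR6X_GBPS)
        print(f"{gddr7_gbps} Gbps vs {GDDR6X_GBPS} Gbps: +{gain:.0f}%")

    # Hypothetical 256-bit bus, assumed only to show the scaling.
    ASSUMED_BUS_WIDTH_BITS = 256
    print(bandwidth_gb_per_s(37, ASSUMED_BUS_WIDTH_BITS), "GB/s")
```

On these numbers, 32Gbps would still be roughly 39% faster than 23Gbps GDDR6X, while 24Gbps would be only about 4% faster.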
According to @kopite7kimi, the next-gen Blackwell “GB202” gaming die will have a 12 (GPCs) x 8 (TPCs) configuration. This would result in a massive GPU featuring 12 GPCs (Graphics Processing Clusters), each with 8 TPCs (Texture Processing Clusters). A TPC comprises 2 SMs (Streaming Multiprocessors), each of which contains 128 FP32 cores.
The above numbers net a core count of 24,576 (192 SMs) for the fully enabled GB202 die. Like its predecessor, the GeForce RTX 5090 should ship with a few SMs disabled to improve yields. Optimistically, we can expect a core count of 24,064 (188 SMs) or less. On the memory side, 32GB of GDDR7 memory clocked at 24Gbps sounds like a reasonable bet.
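The core count is a straight multiplication of the rumored configuration. The Python sketch below checks the fully enabled figure. The `disabled_sms` parameter and the function names are hypothetical, included only to show how a cut-down die would scale, since the number of disabled SMs is not known.

```python
# Rumored GB202 configuration from the article (not confirmed by NVIDIA).
GPCS = 12
TPCS_PER_GPC = 8
SMS_PER_TPC = 2
FP32_CORES_PER_SM = 128


def sm_count(disabled_sms: int = 0) -> int:
    """Number of active SMs, given a hypothetical count of disabled SMs."""
    total = GPCS * TPCS_PER_GPC * SMS_PER_TPC
    if not 0 <= disabled_sms <= total:
        raise ValueError("disabled_sms must be between 0 and the total SM count")
    return total - disabled_sms


def fp32_core_count(disabled_sms: int = 0) -> int:
    """Number of active FP32 cores for the same configuration."""
    return sm_count(disabled_sms) * FP32_CORES_PER_SM


if __name__ == "__main__":
    print(sm_count(), fp32_core_count())  # fully enabled die
```

Because cores are disabled a whole SM at a time, any realistic core count for a cut-down part is a multiple of 128.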
Via: TechRadar