Ever since it introduced the RTX branding with the RTX 20 series “Turing” lineup, NVIDIA has been doubling down on ray-tracing (and DLSS) with each passing generation. The 1st Gen RT cores on Turing were more or less hogwash, with barely a handful of titles supporting ray-tracing through the product cycle. The 2nd Gen RT cores on Ampere made more sense. As more and more games adopted RTGI and DLSS, NVIDIA’s investments started bearing fruit. At present, over 150 games support DLSS, ray-tracing, or both, including Dying Light 2 Stay Human, Sifu, Shadow Warrior 3, SCP: Pandemic, Phantasy Star Online 2 New Genesis, and Supraland Six Inches Under.
Unsurprisingly, NVIDIA is expected to continue this strategy with the RTX 40 “Ada Lovelace” lineup. The performance targets for ray-tracing are presently slated to sit 20-25% above those for rasterization [unconfirmed]. Going by this, the GeForce RTX 4080 should be at least 2x faster than the RTX 3090 in ray-tracing workloads while offering an 80-100% gain in pure raster workloads.
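As a rough sanity check on the "at least 2x" figure, the two rumored ranges can be compounded. This is only a sketch: it assumes the 20-25% ray-tracing uplift multiplies on top of the 80-100% raster gain, which is one reading of the leak and not something the source confirms. The function name `rt_speedup_range` is made up for illustration.

```python
def rt_speedup_range(raster_gain_pct, rt_uplift_pct):
    """Compound a raster gain range with an RT-over-raster uplift range.

    Both arguments are (low, high) tuples in percent. Returns the implied
    (low, high) ray-tracing speedup as multipliers over the baseline card.
    ASSUMPTION: the two gains multiply; the leak does not say so explicitly.
    """
    low = (1 + raster_gain_pct[0] / 100) * (1 + rt_uplift_pct[0] / 100)
    high = (1 + raster_gain_pct[1] / 100) * (1 + rt_uplift_pct[1] / 100)
    return round(low, 2), round(high, 2)


# Rumored RTX 4080 vs RTX 3090 figures quoted in the text.
low, high = rt_speedup_range((80, 100), (20, 25))
print(f"Implied ray-tracing speedup: {low}x to {high}x")
```

Under that assumption the low end already lands a little above 2x, which lines up with the claim in the text.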
The AD103 die powering the GeForce RTX 4070/4060 Ti is set to be 10-30% faster than the RTX 3090 in traditional raster performance and 40-50% faster in ray-tracing workloads like RTGI, RTR, RTAO, RTS, etc. As for the RTX 4060 (AD104), we’re looking at RTX 3080 levels of generalized raster performance and, impressively, RTX 3090 Ti levels of performance in ray-traced titles.
GPU | GA102 | AD102 | RTX 4090 | AD103 | RTX 4080 | RTX 4070 Ti (AD104) | RTX 4070 |
---|---|---|---|---|---|---|---|
Arch | Ampere | Ada Lovelace | Ada Lovelace | Ada Lovelace | Ada Lovelace | Ada Lovelace | Ada Lovelace |
Process | Samsung 8nm LPP | TSMC 5nm | TSMC 5nm | TSMC 5nm | TSMC 5nm | TSMC 5nm | TSMC 5nm |
GPC | 7 | 12 | 11 | 7 | 7 | 5 | 5 |
TPC | 42 | 72 | 64 | 42 | 40 | 30 | 30 |
SMs | 84 | 144 | 128 | 84 | 80 | 60 | 60 |
Shaders | 10,752 | 18,432 | 16,384 | 10,752 | 9,728 | 7,680 | 7,680 |
TP | 37.6 TFLOPs | ~100 TFLOPs? | 83 TFLOPs | ~50 TFLOPs | 47 TFLOPs? | ~35 TFLOPs | 35 TFLOPs? |
Memory | 24GB GDDR6X | 48GB GDDR6X | 24GB GDDR6X | 16GB GDDR6X | 16GB GDDR6X | 12GB GDDR6X | 12GB GDDR6X |
L2 Cache | 6MB | 96MB | 72MB | 64MB | 64MB | 48MB | 48MB |
Bus Width | 384-bit | 384-bit | 384-bit | 256-bit | 256-bit | 192-bit | 192-bit |
TGP | 350W | 600W | 450W | 450W | 285-340W | 300W | 285W |
Launch | Sep 2020 | Sept 22? | Sept 22? | Sept 22? | Sept 22? | Q1 2023? | Q1 2023? |
It’s important to note that, as these figures are based on early numbers and guesses, they might be a bit off, but we don’t expect a large deviation. NVIDIA upgraded its RT cores quite significantly going from Turing to Ampere, and a similar overhaul is expected from Ada. Higher boost clocks, an advanced process node, and higher TDPs should allow the chipmaker to maximize gains on various fronts.
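To see how much of the rumored throughput would have to come from those higher boost clocks, the table's numbers can be worked backwards. The sketch below assumes the "TP" row is peak FP32 throughput, computed the usual way as 2 operations (one fused multiply-add) per shader per clock, and that each Ampere/Ada SM carries 128 FP32 shaders. The helper names and the `TABLE` dictionary are illustrative, and only a few columns from the table are copied in.

```python
FP32_LANES_PER_SM = 128  # Ampere/Ada SM layout; an assumption applied to the rumored parts


def implied_clock_ghz(shaders, tflops):
    """Clock (GHz) needed to reach `tflops` of FP32 with `shaders` lanes.

    Uses peak FP32 = 2 ops (one FMA) x shaders x clock.
    """
    return tflops * 1e12 / (2 * shaders) / 1e9


def sm_shader_mismatches(table):
    """Return the names whose shader count is not SMs x 128."""
    return [name for name, row in table.items()
            if row["sms"] * FP32_LANES_PER_SM != row["shaders"]]


# Values copied from the table above (rumored, not confirmed).
TABLE = {
    "GA102":    {"sms": 84,  "shaders": 10752, "tflops": 37.6},
    "RTX 4090": {"sms": 128, "shaders": 16384, "tflops": 83.0},
    "RTX 4080": {"sms": 80,  "shaders": 9728,  "tflops": 47.0},
    "RTX 4070": {"sms": 60,  "shaders": 7680,  "tflops": 35.0},
}

for name, row in TABLE.items():
    clock = implied_clock_ghz(row["shaders"], row["tflops"])
    print(f"{name}: implied clock of about {clock:.2f} GHz")

print("SM/shader mismatches:", sm_shader_mismatches(TABLE))
```

Read this way, the Ada entries imply clocks well above the GA102 figure, which fits the point about higher boost clocks. The check also flags the RTX 4080 column, where 80 SMs and 9,728 shaders do not agree at 128 shaders per SM, so one of those two leaked numbers is probably off.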