
AMD Set to Announce Zen 3D Based Milan-X CPUs + Instinct MI200 GPUs: Watch the Live Event Here

AMD is all set to announce its next generation of data center offerings roughly 24 hours from now. We're talking about the Zen 3D-based Milan-X processors featuring 3D-stacked V-Cache, and the chiplet-based Instinct MI200 GPU accelerators. Milan-X retains the Zen 3 core and TSMC's N7 process, and as such can be thought of as a special refresh or niche stack, much like the upcoming Sapphire Rapids-SP with on-die HBM memory.

| CPU Name | Cores/Threads | Base Clock | Boost Clock | L3 Cache (V-Cache + L3) | L2 Cache | TDP |
|---|---|---|---|---|---|---|
| AMD EPYC 7773X | 64/128 | 2.2 GHz | 3.5 GHz | 512 + 256 MB | 32 MB | 280W |
| AMD EPYC 7573X | 32/64 | 2.8 GHz | 3.6 GHz | 512 + 256 MB | 16 MB | 280W |
| AMD EPYC 7473X | 24/48 | 2.8 GHz | 3.7 GHz | 512 + 256 MB | 12 MB | 240W |
| AMD EPYC 7373X | 16/32 | 3.05 GHz | 3.8 GHz | 512 + 256 MB | 8 MB | 240W |
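As a sanity check, the L2 figures follow directly from Zen 3's 512 KB of L2 per core, and every SKU carries the same 768 MB L3 total (512 MB of stacked V-Cache plus 256 MB of base L3). A quick sketch (the helper name is mine, not any AMD tooling):

```python
# Derive Milan-X cache totals from core count, assuming Zen 3's
# 512 KB of private L2 per core (hypothetical helper for illustration).
def milan_x_caches(cores):
    l2_mb = cores * 512 // 1024   # 512 KB L2 per Zen 3 core
    l3_mb = 512 + 256             # 512 MB V-Cache + 256 MB base L3
    return l2_mb, l3_mb

for name, cores in [("7773X", 64), ("7573X", 32), ("7473X", 24), ("7373X", 16)]:
    l2, l3 = milan_x_caches(cores)
    print(f"EPYC {name}: {l2} MB L2, {l3} MB L3 total")
```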

Looking at the specs, everything's basically identical to the vanilla Milan parts, including the base and boost clocks, the TDP, and the L2 cache (other than the crapton of L3 cache). This means that performance gains (as already indicated earlier) will vary from application to application, and won't be as pronounced in every workload.

You can watch the AMD Accelerated Data Center Keynote here

The exact specifications of the MI250X have been shared. It'll consist of a total of 220 CUs with a boost clock of 1.7GHz. On the memory side, we're likely looking at eight stacks, each featuring eight 2GB dies, for 128GB in total. This indicates a total bus width of 8,192 bits (1,024 bits x 8 controllers), resulting in an overall bandwidth of 3.68 TB/s, roughly the same as the HBM variants of Sapphire Rapids-SP.
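The memory figures work out with some back-of-the-envelope math; the ~3.6 Gbps per-pin data rate below is my assumption, chosen to match the quoted bandwidth:

```python
# HBM numbers implied above (assumptions: 8 stacks, a 1,024-bit
# interface per stack, 8 dies of 2 GB each per stack, ~3.6 Gbps/pin).
stacks = 8
bus_bits = stacks * 1024              # 8,192-bit total interface
capacity_gb = stacks * 8 * 2          # 128 GB total
bandwidth_tbps = bus_bits * 3.6e9 / 8 / 1e12   # bits -> bytes -> TB/s
print(bus_bits, capacity_gb, round(bandwidth_tbps, 2))
```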

At the heart of the GPU, there will be two 110 CU chiplets, resulting in an overall compute strength of 220 CUs, with an impressive boost clock of 1.7GHz. Since Aldebaran can execute double-precision instructions (FP64) at native speeds, this will result in a double-precision throughput of 47.9 TFLOPs, an insane four times more than its predecessor, the MI100.
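The 47.9 TFLOPs figure falls out of the standard GPU throughput formula, assuming 64 stream processors per CU and 2 FLOPs per clock via FMA, with FP64 running at the full rate:

```python
# FP64 throughput behind the 47.9 TFLOPs figure (assumptions: 64
# stream processors per CU, 2 FLOPs/cycle via FMA, full-rate FP64).
cus = 220                 # two 110-CU Aldebaran dies
clock_ghz = 1.7
fp64_tflops = cus * 64 * 2 * clock_ghz / 1000
print(round(fp64_tflops, 1))   # ≈ 47.9 TFLOPs
```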

Even NVIDIA’s Ampere-based A100 Tensor Core accelerator is capable of “only” 19.5 TFLOPs of FP64 compute. In terms of mixed-precision compute, we’re looking at 383 TFLOPs of FP16 and BFLOAT16. In comparison, the MI100 topped out at “just” 184 and 92 TFLOPs in the two data types, respectively.
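Putting the quoted numbers side by side gives a rough sense of the gen-over-gen jump; the MI100 and A100 figures are from their public spec sheets, and this is a paper comparison, not a benchmark:

```python
# Peak-TFLOPs ratios from the figures above (spec-sheet values, not
# measured performance).
mi200 = {"fp64": 47.9, "fp16": 383.0}
mi100 = {"fp64": 11.5, "fp16": 184.6}
a100  = {"fp64": 19.5}

print(f"FP64 vs MI100: {mi200['fp64'] / mi100['fp64']:.1f}x")
print(f"FP64 vs A100:  {mi200['fp64'] / a100['fp64']:.1f}x")
print(f"FP16 vs MI100: {mi200['fp16'] / mi100['fp16']:.1f}x")
```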

The MI250X will have a TDP of 500W, which is a bit on the high side but is likely a result of the HBM memory. The MI250 should come with a lower boost clock and possibly less memory as well. A cut-down GPU core is unlikely, but I wouldn’t rule it out.

The AMD Instinct MI200 GPU will, over the next year, begin to power three massive systems on three continents: the United States’ exascale Frontier system, the European Union’s pre-exascale LUMI system, and Australia’s petascale Setonix system.

Areej Syed

Processors, PC gaming, and the past. I have written about computer hardware for over seven years with over 5000 published articles. I started during engineering college and haven't stopped since. On the side, I play RPGs like Baldur's Gate, Dragon Age, Mass Effect, Divinity, and Fallout. Contact: areejs12@hardwaretimes.com.