
AMD announced its Ryzen AI 300 “Strix Point” processors alongside the Ryzen 9000 desktop lineup at Computex earlier this month. The former will be the first hybrid core design from Team Red (excluding Phoenix 2), expected to launch in July. Aside from basic specifications and first-party benchmarks, AMD hasn’t shared much about these chips.
AMD Ryzen AI 300 Specifications

The Ryzen AI 9 HX 370 will feature 12 cores and 24 threads (based on Zen 5) with a boost clock of up to 5.1 GHz. It is backed by 36 MB of cache (L2 + L3) and a Radeon 890M GPU consisting of 16 CUs (1,024 shaders) clocked at 2.9 GHz. The SKU has a default TDP of 28W, configurable between 15W and 54W.
The Ryzen AI 9 365 reduces the core count to 10 (20 threads) with a peak boost clock of 5 GHz and 34 MB of cache. It leverages the Radeon 880M iGPU comprising 12 CUs (768 shaders) with a maximum core clock of 2.9 GHz. Both SKUs include a 50 TOPS NPU based on the XDNA 2 architecture.
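As a quick sanity check, the shader counts above follow from the usual RDNA rule of 64 shaders per CU, and a rough peak FP32 figure falls out of the standard formula (2 FLOPs per shader per clock via FMA; RDNA 3 dual-issue can double this in ideal cases). This is back-of-the-envelope arithmetic, not an AMD-published number:

```python
# Rough iGPU math for the Radeon 890M/880M specs quoted above.
# Assumes 64 shaders per CU and 2 FLOPs per shader per clock (FMA);
# RDNA 3 dual-issue can double the peak in ideal cases.

def shaders(cus: int) -> int:
    return cus * 64

def peak_fp32_tflops(cus: int, clock_ghz: float) -> float:
    return shaders(cus) * 2 * clock_ghz / 1000

print(shaders(16), f"{peak_fp32_tflops(16, 2.9):.2f} TFLOPS")  # Radeon 890M: 1024 shaders, ~5.94 TFLOPS
print(shaders(12), f"{peak_fp32_tflops(12, 2.9):.2f} TFLOPS")  # Radeon 880M: 768 shaders, ~4.45 TFLOPS
```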

Preliminary testing by David Huang has revealed cut-down Zen 5 “P-cores” alongside low-power Zen 5c “E-cores” that peak at 4 GHz. Only the Ryzen AI 9 365 could be tested; it features 4x Zen 5 “P-cores” and 6x Zen 5c “E-cores” on a monolithic die divided into two CCXs (CCD vs CCX explained).
Ryzen AI 9 365: Strix Point “Zen 5” Instruction Throughput
Like previous Ryzen mobile APUs, the Zen 5 CCX has 16 MB of L3 cache (vs. 32 MB on desktop and Epyc) and a reduced boost clock. AMD has also halved the SIMD throughput (execution backend) and reduced the L1 load bandwidth. The Zen 5c CCX has 8 MB of L3 cache and the same ISA as Zen 5, though it is limited to 4 GHz.

The Zen 5 core in Strix Point posts significant throughput gains on certain instructions, but due to the reduced execution bandwidth, it mostly matches Zen 4. Store bandwidth is doubled for 128-bit and 256-bit instructions, while load throughput is unchanged.
Branch prediction has received healthy upgrades with wider BTBs, raising the number of not-taken branches handled per cycle from 2 to 3, and taken branches from 1 to 2. Unfortunately, vector integer addition (128/256-bit) throughput is half that of Zen 4, likely due to the slimmer execution units. All SSE/AVX/AVX-512 additions also now have a latency of 2 cycles, up from a single cycle on Zen 4.
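The practical cost of a 2-cycle add shows up in dependent chains: with a 1-cycle add, a string of dependent SIMD additions retires one per cycle, while a 2-cycle latency halves that rate. A toy model (an illustration of the arithmetic, not a hardware measurement) makes the effect concrete:

```python
# Toy model of SIMD add latency: cycles to execute N *dependent* adds,
# versus N independent adds limited only by issue throughput.
# The 1- vs 2-cycle latency figures are the ones quoted above.

def dependent_chain_cycles(n_ops: int, latency: int) -> int:
    return n_ops * latency  # each add must wait for the previous result

def independent_cycles(n_ops: int, ops_per_cycle: int) -> float:
    return n_ops / ops_per_cycle  # no dependencies: limited by issue width

print(dependent_chain_cycles(1000, latency=1))  # 1000 cycles (1-cycle add)
print(dependent_chain_cycles(1000, latency=2))  # 2000 cycles (2-cycle add)
print(independent_cycles(1000, ops_per_cycle=2))  # 500.0 cycles, illustrative issue rate
```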



AMD Zen 5 Dual Front-end: 8-wide Decoder?
The front-end is, by far, the most interesting aspect of Zen 5. The Ryzen AI 9 365 cores use a dual front-end, much like Bulldozer and Intel’s Tremont/Crestmont E-cores. It features 2x 4-wide decoders that can run independently. Each decoder can fetch instructions from two locations, allowing two taken branches per cycle. In SMT2 mode, each thread can use its own decoder, raising the aggregate decode throughput to 8 instructions per cycle.
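A simple way to picture the clustered arrangement: a single thread is served by one 4-wide cluster, while in SMT2 mode each thread gets its own cluster, doubling the combined decode width. A sketch under those assumptions:

```python
# Sketch of clustered decode throughput as described above:
# each 4-wide decode cluster serves one thread at a time.

DECODE_WIDTH = 4  # instructions per cluster per cycle
CLUSTERS = 2

def decode_ipc(active_threads: int) -> int:
    # A single thread cannot span clusters, so it tops out at 4-wide;
    # with SMT2, each thread uses one cluster for 8-wide aggregate decode.
    return DECODE_WIDTH * min(active_threads, CLUSTERS)

print(decode_ipc(1))  # 4 instructions/cycle, single thread
print(decode_ipc(2))  # 8 instructions/cycle, SMT2
```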
On the flip side, Zen 5 is limited to 4-wide decode when processing a continuous stream of NOP instructions on a single thread. It also appears to use a smaller op cache of 4K entries, versus 6.75K on Zen 4. Furthermore, while the L1-to-L2 and L1-to-FP bandwidths are doubled, the DRAM-to-L3 bandwidth is reduced by 50%. However, single-threaded L3 read bandwidth rises to 32 bytes per cycle, up from 24 bytes on Zen 4, while the latency drops from 50 cycles to 46 cycles on Zen 5.
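In absolute terms, a per-cycle bandwidth figure converts to GB/s as bytes-per-cycle × core clock in GHz. At a nominal 4 GHz, for example (illustrative arithmetic, not a measured result):

```python
# Convert per-cycle cache bandwidth to GB/s at a given core clock.
# GB/s = bytes/cycle * cycles/ns = bytes/cycle * clock in GHz.

def bandwidth_gbps(bytes_per_cycle: int, clock_ghz: float) -> float:
    return bytes_per_cycle * clock_ghz

print(bandwidth_gbps(32, 4.0))  # Zen 5 L3 reads: 128.0 GB/s at 4 GHz
print(bandwidth_gbps(24, 4.0))  # Zen 4 L3 reads: 96.0 GB/s at the same clock
```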

Despite the upgraded ladder L3 interconnect, the core-to-core latency on Strix Point is the same as on Phoenix and Hawk Point. Depending on the FCLK and MCLK, this may change on the desktop CPUs.

In the SPEC CPU benchmark, Zen 5 shows an average IPC uplift of ~10% over Zen 4 with all cores locked at 4.8 GHz. Geekbench 5 and 6 show a 15-17% IPC improvement, but neither is a good proxy for sustained compute-intensive or content-creation workloads. Overall, Strix Point looks like a decent upgrade over Phoenix/Hawk Point, though the cuts to the SIMD units may prove costly.
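For reference, an IPC comparison at a locked clock reduces to a simple score ratio, since the frequency term cancels out. A minimal illustration with hypothetical placeholder scores (not Huang's actual results):

```python
# IPC uplift at a fixed clock: IPC = score / frequency, and with both
# chips locked to the same frequency the ratio of scores is the ratio
# of IPCs. The scores below are hypothetical placeholders.

def ipc_uplift(new_score: float, old_score: float) -> float:
    return new_score / old_score - 1.0

print(f"{ipc_uplift(11.0, 10.0):.0%}")  # 10% uplift for these placeholder scores
```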