Intel’s engineers have been working tirelessly to make the Arc graphics driver hold up under the scrutiny of modern PC gamers. With each new update, we’ve seen double-digit improvements in in-game performance for existing and newly launched titles. The latest Windows Arc driver boosts frame rates by as much as 33% in F1 2023, which would put Arc roughly on par with the RTX 4060, though that comparison is speculative.
The Linux Mesa driver is getting equal attention. The latest update (released on Friday) optimizes gaming performance in some of the most popular titles, including Counter-Strike: Global Offensive and Shadow of the Tomb Raider. On Arc Alchemist GPUs, the former is about 11% faster and the latter about 5.5% faster.
In addition to optimizing gaming performance, the update adds support for the upcoming 14th Gen Meteor Lake processors. Meteor Lake-P is said to use the same graphics microarchitecture as Alchemist, possibly without the XMX matrix and RT units.
This enables L3 partial write merging for a number of cases that seem to be getting accidentally disabled by the kernel, which was causing a serious performance bottleneck on DG2 and MTL platforms. The “Compressible Partial Write Merge Enable”, “Coherent Partial Write Merge Enable” and “Cross-Tile Partial Write Merge Enable” bits in L3SQCREG5 were expected to be enabled by default (and confusingly, they even read off as enabled if you ran ‘intel_reg read 0xb158’ on an idle system), but they are getting clobbered during 3D context initialization by an i915 workaround.
Merge request
Enabling L3 partial write merging of compressible surfaces in particular seems to increase rendering fillrate by over 3x in some cases (e.g. the “VulkanFillRate/FillRateGPU/resolution:1[0-3]/format:*/blend:0” fillrate-bound microbenchmarks). Significant improvements can also be reproduced in most real-world workloads we’ve tested so far, e.g. Counter Strike GO improves by ~11%, Shadow Of the Tomb Raider improves by ~5.5%, and AztecRuins-VK improves by ~6.5% on DG2-512 — Thanks a lot to Caleb Callaway for these figures. No regressions have been observed so far.
Even though this patch might strike as surprisingly simple for such a large payoff, it’s the result of @fjdegroo and I trying to root-cause the rendering performance gap of DG2 on Linux vs Windows on and off during the last year, and some of the OA statistics captured by Felix early this month were greatly helpful for me to connect the last few dots, so Felix deserves a big chunk of the credit for this work.
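To put the quoted percentages in perspective, the short sketch below converts each reported uplift into a new frame rate and the frame time saved per frame. The 100 fps baseline and the helper names are assumptions made purely for illustration; the merge request reports only relative improvements on DG2-512, not absolute frame rates.

```python
# Reported relative improvements from the merge request (DG2-512).
REPORTED_UPLIFT_PERCENT = {
    "Counter-Strike: Global Offensive": 11.0,
    "Shadow of the Tomb Raider": 5.5,
    "AztecRuins-VK": 6.5,
}

# Hypothetical baseline; the source gives no absolute frame rates.
ASSUMED_BASELINE_FPS = 100.0


def fps_after_uplift(baseline_fps, uplift_percent):
    """Frame rate after applying a percentage improvement."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return baseline_fps * (1.0 + uplift_percent / 100.0)


def frame_time_saved_ms(baseline_fps, uplift_percent):
    """Milliseconds saved per frame for the same percentage improvement."""
    new_fps = fps_after_uplift(baseline_fps, uplift_percent)
    return 1000.0 / baseline_fps - 1000.0 / new_fps


if __name__ == "__main__":
    for title, pct in REPORTED_UPLIFT_PERCENT.items():
        new_fps = fps_after_uplift(ASSUMED_BASELINE_FPS, pct)
        saved = frame_time_saved_ms(ASSUMED_BASELINE_FPS, pct)
        print(f"{title}: {new_fps:.1f} fps, {saved:.2f} ms saved per frame")
```

Because frame time is the reciprocal of frame rate, the same percentage uplift saves more milliseconds per frame at a low baseline than at a high one, so the practical effect depends on where a given system starts.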