Comments on: HBM Gives Xeon SPs A Big Boost On Bandwidth Bound Work
https://www.nextplatform.com/2022/11/15/sapphire-rapids-xeon-sps-plus-hbm-offer-big-performance-boost/
In-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds.
Tue, 13 Dec 2022 15:43:42 +0000

By: Paul Berry https://www.nextplatform.com/2022/11/15/sapphire-rapids-xeon-sps-plus-hbm-offer-big-performance-boost/#comment-200922 Thu, 17 Nov 2022 14:21:32 +0000
In reply to MH.

Except I don’t know how many customers are going to skip DDR5. I think most will choose to buy both HBM and DDR5. You can make a handful of applications super fast, but there are a ton of applications that require more memory per rank. If you want to make your machine work for all the applications, you have to do both. I suppose you might see a large mostly-DDR machine with a small partition of HBM nodes, much like most HPC machines have a small GPU partition. It’s really hard to buy a machine that works well for all applications, and expensive to make every node the best of the best.

By: JayN https://www.nextplatform.com/2022/11/15/sapphire-rapids-xeon-sps-plus-hbm-offer-big-performance-boost/#comment-200910 Thu, 17 Nov 2022 03:14:23 +0000

The Phoronix site had a couple of articles on SPR’s new user interrupt feature, which bypasses the kernel when interrupting other cores. They claim “9x faster for lower IPC communication latency”. Were there any comments at SC22 on what this feature contributes to HPC/AI performance?

By: Timothy Prickett Morgan https://www.nextplatform.com/2022/11/15/sapphire-rapids-xeon-sps-plus-hbm-offer-big-performance-boost/#comment-200897 Wed, 16 Nov 2022 20:51:35 +0000
In reply to MH.

Totally agree. 100 MHz is a rounding error. Which was the point I was trying to make. Hey, it was late at night when I wrote this… HA!

By: MH https://www.nextplatform.com/2022/11/15/sapphire-rapids-xeon-sps-plus-hbm-offer-big-performance-boost/#comment-200896 Wed, 16 Nov 2022 20:47:31 +0000

> Then to get a 1.6X delta seen with the High Performance Linpack (HPL) test shown below, the clock speeds have to drop from 2.6 GHz on the Ice Lake Xeon SP to 2.5 GHz on the Sapphire Rapids Xeon SP.

Well, since the benchmark benefits from HBM, it’s not so much that the clock speed has dropped but that the system-level performance of these non-HBM examples has a ceiling due to DRAM bandwidth.
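To make the bandwidth-ceiling idea concrete, here is a minimal roofline-style sketch. It is only an illustration: the function name `attainable_gflops` and every number in it (peak rate, DDR and HBM bandwidths, arithmetic intensities) are hypothetical placeholders, not figures from the article or from any measured system.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


# All values below are assumed for illustration only, not measurements.
PEAK_GFLOPS = 3000.0  # hypothetical peak compute rate, GFLOP/s
DDR_BW_GBS = 300.0    # hypothetical DDR bandwidth, GB/s
HBM_BW_GBS = 1000.0   # hypothetical HBM bandwidth, GB/s

kernels = [
    ("low-intensity kernel", 0.25),   # few flops per byte moved
    ("high-intensity kernel", 20.0),  # many flops per byte moved
]

for name, intensity in kernels:
    ddr = attainable_gflops(PEAK_GFLOPS, DDR_BW_GBS, intensity)
    hbm = attainable_gflops(PEAK_GFLOPS, HBM_BW_GBS, intensity)
    print(f"{name}: DDR {ddr:.0f} GFLOP/s, HBM {hbm:.0f} GFLOP/s, "
          f"speedup {hbm / ddr:.2f}x")
```

Under these assumed numbers, the low-intensity kernel sits on the memory roof and speeds up with the extra bandwidth, while the high-intensity kernel is already on the compute roof and gains nothing.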

Regarding cost – that will be very interesting to see as HBM parts are not cheap, and 4 of them in the processor package will have a significant effect on the CPU cost – quite likely the 2x mentioned in the article?

But then analyzing this at the system level, the in-package HBM makes it unnecessary to have 8 memory channels per socket that are populated with server-grade memory. That’s a significant saving, and on top of that there is a board area and complexity saving, and likely also a power and cooling saving (no need to drive “long” on-board wires with their terminating resistors).

And as board area is saved, compute density goes up – so there could be some kind of virtuous spiral that comes into effect, as long as all of the active data can fit in HBM.

By: HuMo https://www.nextplatform.com/2022/11/15/sapphire-rapids-xeon-sps-plus-hbm-offer-big-performance-boost/#comment-200829 Tue, 15 Nov 2022 14:52:15 +0000

Great feeds and speeds! The HPCG results in particular stand out: Fugaku (A64FX with HBM2) has roughly 1/3 of the performance of Frontier (EPYC+MI250x) on HPL, but beats it by a tad on HPCG (16 vs 14 PFlop/s). Here, similarly, we see Intel Max-with-HBM run HPCG 3x faster than EPYC-without-HBM, while HPL runs at about the same speed on both. As you suggest: “AMD has no intention of adding HBM […] could change its mind”. I totally agree! HBM really helps the memory access kung-fu needed by HPCG, while the dense matrix karate of HPL is already taken care of by vector/matrix units.
