I certainly hope the MI210 is a dual-chip card with slightly reduced performance from the MI250/MI250X. An Aldebaran with a single chip and ~22 teraflops FP32/FP64 would be pointless; people would just buy Nvidia.
You’re right, and 150k transistors seemed like a heck of a lot when the prior system had 50k.
Geez, Paul. I think systems were pretty diverse back then, and doing a lot of interesting stuff. Or rather, as much as people could think of at the time. When I look at the long evolution of systems, from Hollerith punch card machines in 1890 all the way up to today, I am amazed at the incremental innovation every step of the way, and each step is always difficult. We have always had 10 percent less than what we need, and it is how we cope that makes this all interesting. I liked the diversity of systems because many different ideas for systems, not just CPUs, were tested out in the field and there was competition between these ideas in the market. Same as today. This is also what makes it interesting. I ain’t bored yet, and every time has its challenges.
For exactly the reason I said. A64FX is in and done. We have covered SiPearl, and it will be just for a few machines at best. Qualcomm has no server part that I am aware of, although it is apparently working on some accelerator for HPC. We have talked about the Chinese and Korean chips, and they are again special cases for one-off and two-off machines. I’m talking mainstream datacenter compute: hyperscalers, cloud builders, and large enterprise. HPC centers can build an exascale machine out of squirrels on wheels with an abacus in each hand if they want to. . . as you well know.
You assume that the MI210 is not a double whammy, but I do not. I think the PCI-Express version will have almost the same performance as the MI250, as I showed here: https://www.nextplatform.com/2021/11/09/the-aldebaran-amd-gpu-that-won-exascale/
Slower clock, smaller memory, lower memory bandwidth, so those who don’t want to use OAM can get a reasonable GPU from AMD for their systems. If I am right, then there is room for something that is half or a little more than half the MI210 and still a lot better than the MI100.
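To put rough numbers on that, here is a quick back-of-the-envelope sketch in Python. The MI210 figure is my assumption of a modest haircut from the MI250’s 45.3 teraflops FP64 vector peak, and the MI100’s 11.5 teraflops FP64 peak is AMD’s published spec; treat the whole thing as a sketch, not a product spec.

    # Rough FP64 vector peaks, in teraflops. The MI210 number is an assumption
    # (a PCI-Express card coming in a bit under the MI250); the MI100 figure
    # is AMD's published spec.
    mi100_fp64 = 11.5
    mi210_fp64_assumed = 45.3 * 0.9        # assume ~10 percent haircut for PCI-Express
    half_mi210 = mi210_fp64_assumed / 2    # the hypothetical half-Aldebaran part

    print(f"MI100:          {mi100_fp64:.1f} TF")
    print(f"Assumed MI210:  {mi210_fp64_assumed:.1f} TF")
    print(f"Half an MI210:  {half_mi210:.1f} TF ({half_mi210 / mi100_fp64:.1f}X the MI100)")

Even a half-Aldebaran card on those assumptions lands around 20 teraflops FP64, which is still well ahead of the MI100.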
I’ll be surprised if AMD’s Zen 4 server chips are available in any volume at all soon after the launch of Intel’s Sapphire Rapids.
Regarding the NVIDIA GPU, I’m skeptical NVIDIA will come out with a 5 nm GPU in any volume in 2022. If they do, it would be a bit of a change from how early in a node’s life they have historically shipped, unless it is a late 2022 product. For the A100, they already had lots of them floating around when they formally launched it at GTC in May 2020. If they use 5 nm for the successor and announce it at GTC 2022 in March, I doubt many will be floating around until much later in 2022.

I also doubt NVIDIA will focus too much on FP64 in the new part unless they can use those transistors for lower precision compute as well. As the years go on, lower precision compute is becoming more and more important relative to FP64 for NVIDIA’s datacenter business. AMD’s GPU business, in contrast, consists mostly of supercomputing facilities that rely heavily on FP64. I doubt NVIDIA will sacrifice performance in 95%+ of their revenue stream in order to defend the 5% from AMD. If expanding FP64 comes at the cost of lower precision performance, NVIDIA is likely to either bifurcate their product line, as Intel originally planned to do with its Xe HP and Xe HPC lines, or allow AMD to have an advantage in supercomputing. I think they should have the money to do the former. Even if they don’t make much money on them, those supercomputers are a high-visibility “halo” market. So even though NVIDIA hates to do such a thing, if they can’t repurpose the extra FP64 transistors efficiently, it might be worth bifurcating the product line to compete better with AMD and Intel in supercomputing.
Regarding supercomputing, reports now say that Frontier’s early access has been pushed back to June 2022, with full user access pushed to January 1, 2023, despite assurances otherwise in October 2021, when Aurora was (again) being pushed back: https://executivegov.com/2021/12/installation-of-supercomputer-frontier-at-oak-ridge-national-lab-now-underway/ So Frontier no longer seems to arrive all that much before Aurora.
If Aurora really does hit 2.43 exaflops peak, it will come in at about 24 MW per exaflops, whereas the Frontier machine will come in at about 19 MW per exaflops. Intel is promising 45 TFLOPs of FP64 vector performance per GPU, which means that, with 54,000 GPUs in Aurora, we get 2.43 exaflops of peak FP64 vector performance. So that checks out. Suppose the 650 watt rumor is true. AMD, on its web site, is promising 45.3 TFLOPs of FP64 vector at 560 watts peak, so the Intel GPU uses 16% more power for the same performance as the AMD GPU. If we instead use the 500 watt number on AMD’s site, the Intel GPU uses 30% more power than the AMD GPU. Frontier’s 19 MW per exaflops plus 16% is 22 MW per exaflops, and if we instead add 30%, we end up with 24.7 MW per exaflops. Aurora’s power usage should be at most around 24 MW per exaflops, putting it in that range. So that all checks out, too. All indications are that, from a peak theoretical standpoint, Ponte Vecchio uses 16% to 30% more power for the same performance as Aldebaran. It’s not that much more power hungry.

However, supercomputers using the A100 seem to use the A100’s FP64 matrix operations for their Linpack numbers, so I would have expected Frontier to do the same. It seems that it is not, because 1.5 exaflops divided by 36,000 GPUs is about 42 TFLOPs per GPU, which is around what Aldebaran can get from FP64 vector and half of what it gets from FP64 matrix. There could be something else going on there, though. I have to say these accelerators and supercomputers seem to be shrouded in much more mystery and intrigue than is normal. Between the large number of execution units, the various power consumption figures quoted, vector instructions versus matrix instructions, and strange transistor count and die size quotes, it isn’t easy to determine how best to compare the accelerators. We will have to wait for some real world experience.
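For anyone who wants to check my arithmetic, here is the same back-of-the-envelope math as a quick Python sketch. The 650 watt Ponte Vecchio number, the 54,000 and 36,000 GPU counts, and the 19 MW per exaflops figure for Frontier are all taken from the rumors and reports discussed above, not confirmed specs.

    # Back-of-the-envelope check of the figures above. All inputs are the
    # public or rumored numbers quoted in this comment, treated as assumptions.

    # Ponte Vecchio versus Aldebaran, FP64 vector peak per GPU
    pvc_tflops, pvc_watts = 45.0, 650.0           # 650 W is a rumor
    mi250x_tflops = 45.3
    mi250x_watts_high, mi250x_watts_low = 560.0, 500.0

    extra_hi = pvc_watts / mi250x_watts_high - 1  # ~16 percent more power
    extra_lo = pvc_watts / mi250x_watts_low - 1   # ~30 percent more power
    print(f"Ponte Vecchio ({pvc_tflops:.0f} TF) vs Aldebaran ({mi250x_tflops:.1f} TF): "
          f"{extra_hi:.0%} to {extra_lo:.0%} more power for about the same FP64 vector peak")

    # System-level peak power cost in MW per exaflops
    aurora_peak_ef = 54_000 * pvc_tflops / 1e6    # ~2.43 exaflops peak
    frontier_mw_per_ef = 19.0                     # from the estimate above
    aurora_low = frontier_mw_per_ef * (1 + extra_hi)   # ~22 MW per exaflops
    aurora_high = frontier_mw_per_ef * (1 + extra_lo)  # ~24.7 MW per exaflops
    print(f"Aurora peak: {aurora_peak_ef:.2f} EF, "
          f"estimated {aurora_low:.1f} to {aurora_high:.1f} MW per exaflops")

    # Frontier's 1.5 exaflops target spread across 36,000 GPUs
    per_gpu = 1.5e6 / 36_000                      # ~42 TFLOPs per GPU
    print(f"Implied per-GPU rate: {per_gpu:.0f} TFLOPs, close to Aldebaran's "
          f"FP64 vector peak and half its FP64 matrix peak")

Change the wattage or GPU count assumptions and the MW per exaflops ranges move accordingly, which is really the point: the comparison hinges entirely on which quoted numbers turn out to be real.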
Finally, I have to say this has seemingly become a very gung-ho AMD site, whereas it used to have a lot of faith in Intel.