Compute Engine Strategies In The Age Of GenAI

SPONSORED FEATURE  While generative AI and GPU acceleration of AI training and inference have taken the world by storm, the datacenters of the world still have to think about CPUs – and think very carefully about them at that.

For one thing, at most companies, there are hundreds to thousands of back office workloads, sometimes backed by relational databases, that are chugging along, running the business. Maintaining this fleet of machinery is important, without question.

But modernizing that fleet is also a way to help pay for the enormous investments in AI infrastructure that will need to be made in the coming years, whether companies buy trained models or create them. No matter what, GenAI is going to be an expensive proposition, and taking cost out of the general purpose server fleet will be instrumental: a newer fleet draws less power, needs less cooling, and delivers better performance at the same time.

In addition, a modern processor with lots of cores and lots of I/O and memory bandwidth is also a way to get a better return on investment from those expensive AI server fleets. A fast CPU like the “Turin” 5th generation AMD Epyc 9575F can boost the performance of AI inference by as much as 8 percent and AI training by as much as 20 percent, according to benchmarks done by AMD on server nodes with eight GPUs each. Considering the high cost of GPUs, this performance boost covers the incremental cost of buying a faster CPU in the AI host machine many, many times over.
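To see why that math works out, here is a minimal back-of-the-envelope sketch in Python. The eight-GPU node matches the benchmark setup described above and the 20 percent training speedup comes from those benchmarks, but every dollar figure is a hypothetical placeholder, not actual pricing.

```python
# Back-of-the-envelope ROI for a faster host CPU in an eight-GPU node.
# Every dollar figure here is a hypothetical placeholder, not a vendor quote;
# only the 20 percent training speedup comes from the benchmarks cited above.

gpu_cost = 30_000.0          # assumed cost per GPU ($)
gpus_per_node = 8            # node configuration used in AMD's benchmarks
baseline_cpu_cost = 5_000.0  # assumed cost of a baseline host CPU ($)
fast_cpu_cost = 8_000.0      # assumed cost of a high-frequency host CPU ($)
training_speedup = 0.20      # up to 20 percent faster training, per the article

# If the node finishes 20 percent more work, it is as if you bought
# 20 percent more GPU capacity without paying for it.
recovered_gpu_value = training_speedup * gpus_per_node * gpu_cost
incremental_cpu_cost = fast_cpu_cost - baseline_cpu_cost

print(f"Recovered GPU value per node:  ${recovered_gpu_value:,.0f}")
print(f"Incremental CPU cost per node: ${incremental_cpu_cost:,.0f}")
print(f"Payback multiple:              {recovered_gpu_value / incremental_cpu_cost:.0f}x")
```

Under these assumed prices, a $3,000 CPU premium recovers $48,000 of effective GPU capacity per node, a 16X payback, which is the “many, many times over” argument in concrete terms.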

And finally, there will be many cases where it makes sense to run AI algorithms on the CPUs themselves, which are plenty capable these days of doing the vector math needed for AI inference and lightweight AI training. So, again, a high performance CPU is important to have even in the general purpose server fleet.
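As a simple illustration of CPU-side inference, the sketch below runs a small neural network forward pass entirely on the CPU with PyTorch. The model architecture and batch size are arbitrary stand-ins chosen for the example; the point is only that this is ordinary vector math that a modern server CPU executes natively with its wide SIMD units.

```python
import os

import torch
import torch.nn as nn

# Let the math libraries use every available CPU core.
torch.set_num_threads(os.cpu_count() or 1)

# A small, arbitrary classifier standing in for a lightweight inference job.
model = nn.Sequential(
    nn.Linear(1024, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
).eval()

batch = torch.randn(64, 1024)   # a batch of 64 feature vectors

with torch.inference_mode():    # skip autograd bookkeeping for inference
    logits = model(batch)

print(logits.shape)             # torch.Size([64, 10])
```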

We talked about these issues with Madhu Rangarajan, corporate vice president of product management, planning, and marketing for the Server Solutions Group at AMD. We also touched on the idea that now, more than ever, is the time to start thinking about deploying single-socket servers in your datacenter and getting away from traditional two-socket server thinking.
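One way to frame that single-socket question is with simple consolidation arithmetic, sketched below. Every figure in it is an assumption made up for illustration; the core counts and the per-core speedup are placeholders, not AMD benchmark results.

```python
import math

# Consolidation math for a single-socket refresh. Every figure below is an
# illustrative assumption, not an AMD benchmark result.
old_servers = 100                # installed base of dual-socket machines
old_cores_per_server = 2 * 28    # two 28-core sockets per legacy server
new_cores_per_server = 96        # one 96-core socket per new server
per_core_speedup = 1.5           # assumed per-core throughput gain

# Capacity of the old fleet, measured in legacy-core equivalents.
old_capacity = old_servers * old_cores_per_server

# Each new single-socket box delivers this many legacy-core equivalents.
new_capacity_per_server = new_cores_per_server * per_core_speedup

new_servers = math.ceil(old_capacity / new_capacity_per_server)
print(f"{old_servers} dual-socket servers -> {new_servers} single-socket servers")
```

Under these assumptions, each new single-socket machine replaces roughly two and a half legacy dual-socket boxes, shrinking the footprint and power draw of the general purpose fleet accordingly.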

To learn more about AMD’s strategy for updating the server fleets in your datacenter, check out the video above.

1 Comment

  1. Cool interview! We’ve known since last Thursday’s analysis that AMD currently leads in CPU and CPU+GPU FP64 $/Tflops, but it was nice to be reminded here (near 11:50) that it is also very good in power efficiency (HPL #1 El Capitan with 1.7 EF/s is #18 in Green500 with 59 GF/W — not that far off Alps’ 61 GF/W for a lower 0.43 EF/s). Power efficiency was the big story when Frontier broke through the Exaflop, if I remember correctly, especially compared to Fugaku.

    Good also to hear Madhu’s perspective on when to switch from pure CPU installs to CPU+GPU as a function of workload types (near 6:45), and that the idea behind the Epyc 9575F is to have that 5 GHz max clock to efficiently feed GPUs (near 2:55). A very informative exchange!
