Kicking off a busy day of product announcements and updates for AMD’s data center business group, this morning AMD is finally announcing their long-awaited high density “Bergamo” server CPUs. Based on AMD’s density-optimized Zen 4c architecture, the new EPYC 97x4 chips offer up to 128 CPU cores, 32 more cores than AMD’s current-generation flagship EPYC 9004 “Genoa” chips. According to AMD, the new EPYC processors are shipping now, though we’re still awaiting further details about practical availability.

AMD first teased Bergamo and the Zen 4c architecture over 18 months ago, outlining their plans to deliver a higher density EPYC CPU designed particularly for the cloud computing market. The Zen 4c cores would use the same ISA as AMD’s regular Zen 4 architecture – making both sets of architectures fully ISA compatible – but it would offer that functionality in a denser design. Ultimately, whereas AMD’s mainline Zen 4 EPYC chips are designed to hit a balance between performance and density, these Zen 4c EPYC chips are purely about density, boosting the total number of CPU cores available for a market that is looking to maximize the number of vCPUs they can run on top of a single, physical CPU.

While we’re awaiting additional details on the Zen 4c architecture itself, at this point we do know that AMD has taken several steps to boost their CPU core density. This includes redesigning the architectural layout to favor density over clockspeeds – high clockspeed circuits are a trade-off with density, and vice versa – as well as cutting down the amount of cache per CPU core. AMD has also outright stuffed more CPU cores within an individual Core Complex Die (CCD); whereas Zen 4 is 8 cores per CCD, Zen 4c goes to 16 cores per CCD. Which, amusingly, means that the Zen 4c EPYC chips have fewer CCDs overall than their original Zen 4 counterparts.
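The CCD arithmetic behind that last point can be sketched in a few lines. This is a minimal illustration rather than anything AMD has published: it assumes fully enabled flagship parts (96 cores for Genoa, 128 for Bergamo) with every CCD populated, and the `ccd_count` helper is a hypothetical name introduced here.

```python
def ccd_count(total_cores: int, cores_per_ccd: int) -> int:
    """CCDs needed to reach a core count, rounding up (hypothetical helper)."""
    return -(-total_cores // cores_per_ccd)  # ceiling division on integers


# Core counts and cores-per-CCD figures are the ones quoted in the article;
# assuming every CCD on the package is fully enabled.
genoa_ccds = ccd_count(96, 8)       # Zen 4: 8 cores per CCD
bergamo_ccds = ccd_count(128, 16)   # Zen 4c: 16 cores per CCD

print(f"Genoa CCDs: {genoa_ccds}, Bergamo CCDs: {bergamo_ccds}")
```

Under those assumptions, the flagship Bergamo part needs four fewer CCDs than the flagship Genoa part despite carrying 32 more cores.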

AMD EPYC 97x4 Bergamo Processors

AnandTech  Core/Thread  Base Freq (MHz)  1T Freq (MHz)  L3 Cache  PCIe       Memory          TDP (W)  Price (1KU)
9754       128/256      2250             3100           256MB     128 x 5.0  12 x DDR5-4800  360      $11,900
9754S      128/128      2250             3100           256MB     128 x 5.0  12 x DDR5-4800  360      $10,200
9734       112/224      2200             3000           256MB     128 x 5.0  12 x DDR5-4800  320      $9,600

Despite these density-focused improvements, Bergamo is still a hefty chip overall with regard to the total number of transistors in use. A fully kitted out chip is composed of 82B transistors, down from roughly 90B transistors in a full Genoa chip. Which, accounting for the larger number of CPU cores available with Bergamo, works out to a single Zen 4c core having about 68% of the transistor count of a Zen 4 core, when amortized over the entire transistor count of the chip. In reality, the savings at the CPU core level alone are likely not as great, but it goes to show how many transistors AMD has been able to save by cutting down on everything that isn’t a CPU core.
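For readers who want to check the amortized figure, here is a rough sketch of the calculation. It uses only the transistor and core counts quoted above; the function name is a hypothetical one chosen for illustration, and the Genoa figure is itself an approximation ("roughly 90B").

```python
def transistors_per_core(total_transistors: float, cores: int) -> float:
    """Whole-chip transistor count divided evenly across cores (hypothetical helper)."""
    return total_transistors / cores


# Figures quoted in the article; "roughly 90B" for Genoa is approximate.
bergamo_per_core = transistors_per_core(82e9, 128)
genoa_per_core = transistors_per_core(90e9, 96)

ratio = bergamo_per_core / genoa_per_core
print(f"Zen 4c vs. Zen 4, amortized transistors per core: {ratio:.1%}")
```

Because this spreads the whole chip's transistor budget across its cores, including the I/O die and L3 cache, it says nothing precise about the size of the cores themselves.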

Meanwhile, as these are a subset of the EPYC 9004 series, the 97x4 EPYC chips are socket compatible with the rest of the 9004 family, using the same SP5 socket. A BIOS update will be required to use the chips, of course, but server vendors will be able to pop them into existing designs.

As noted earlier, the primary market for the EPYC 97x4 family is the cloud computing market – the ‘c’ in Zen 4c even stands for “cloud”, according to AMD. The higher core counts and less aggressive clockspeeds make the resulting chips, on a core-for-core basis, more energy efficient than Genoa designs. Which for AMD’s target market is a huge consideration, given that power is one of their greatest ongoing costs. As part of today’s presentation, AMD is touting a 2.7x improvement in energy efficiency, though it’s unclear what baseline that figure is being compared against.

With their higher core density and enhanced energy efficiency, AMD is especially looking to compete with Arm-based rivals in this space, with Ampere, Amazon, and others using Arm architecture cores to fit 128 (or more) cores into a single chip. AMD will also eventually be fending off Intel in this space, though not until Sierra Forest in 2024.  

Pure compute users will also want to keep an eye on these new Bergamo chips, as the high core count changes the current performance calculus a bit. In terms of pure compute throughput, on paper the new chips offer even more performance than 96-core Genoa chips, as the extra 32 CPU cores more than offset the clockspeed losses. With that said, the cache and other supporting hardware of a server CPU boost performance in other ways, so the performance calculus is rarely so simple for real-world workloads. Still, if you just need to let rip a lot of semi-independent threads, then Bergamo may offer some surprises.
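As a back-of-the-envelope illustration of that on-paper claim, the sketch below treats cores × clockspeed as a crude proxy for throughput and asks how high a 96-core chip would need to clock to match the EPYC 9754 at its base frequency. The proxy itself is an assumption made here, not AMD’s methodology, and it deliberately ignores cache, memory bandwidth, and sustained boost behavior; the function name is hypothetical.

```python
def break_even_clock_mhz(dense_cores: int, dense_mhz: float, rival_cores: int) -> float:
    """Clock a rival chip needs to match cores x clock of the dense chip.

    Hypothetical helper; cores x clock is only a crude paper metric.
    """
    return dense_cores * dense_mhz / rival_cores


# EPYC 9754 figures from the spec table: 128 cores at a 2250 MHz base clock.
break_even = break_even_clock_mhz(128, 2250, 96)
print(f"A 96-core chip would need about {break_even:.0f} MHz to match on paper")
```

On this simple metric, a 96-core part would have to sustain a clock one-third higher than the 9754’s base frequency across all of its cores just to draw level.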

We’ll have more on the EPYC 97x4 series chips in the coming days and weeks, including more on the Zen 4c core architecture, as AMD releases more information on that. So until then, stay tuned.

14 Comments

  • Slash3 - Tuesday, June 13, 2023

    These also revert to a dual-CCX (per CCD) design, which is an odd choice. Two eight core CCXes per chiplet.
  • kpb321 - Tuesday, June 13, 2023

    That seems consistent with favoring density over performance. An IF link per CCX is going to have more fabric overhead than 1 IF link per 2 CCX but will potentially cost some performance if the IF link is a bottleneck for your workload with that many cores sharing.
  • trevdawg94 - Tuesday, June 13, 2023

    The existing EPYC IO die only supports up to 12 IF links. The only way to add more would've been to make a new IO die which would've increased fab cost and added extra design complexity. I wouldn't be surprised if AMD would make an IO die with more IF links for Zen 5 EPYC CPUs if demand for these CPUs is high enough. Even if they don't use all of the IF links for the standard EPYC CPUs, using the same IO die for both would help keep fab and design costs in check.
  • SanX - Tuesday, June 13, 2023

    What fab and design cost are specifically?
  • TekCheck - Tuesday, June 13, 2023

    They have to advance I/O die for several reasons. Next gen Bergamo cannot add another two 16 core chiplets on current package. If they plan Turin dense with 192 c cores, new package and I/O die are needed on the same 6096 socket.
  • TekCheck - Tuesday, June 13, 2023

    Nothing unusual. 2x4 CCX was already used in early Ryzen chiplets. This is an iterative step-up. Zen5c will have unified 16 core CCX. They move by gradually perfecting the logic.
  • nandnandnand - Tuesday, June 13, 2023

    Is there any information out there about a unified 16-core CCX or is that speculation? There might be scaling issues that make 2x8 the way to go for multiple generations.
  • whatthe123 - Wednesday, June 14, 2023

    i feel like a unified 16 will have serious routing problems or latency issues similar to the groups of E cores on intel chips. intel and amd have stuck with 8 on P cores or meshes to deal with the mess of routing, so if you want 16 cores AND fast core to core communication you have to get creative in 2.5 or 3D space, which I doubt will happen by next generation considering the struggles even with 2.5 designs.
  • cbm80 - Tuesday, June 13, 2023

    The 8B lower transistor count is accounted for by the 128MB less L3 cache.
  • Silver5urfer - Wednesday, June 14, 2023

    Contrary to the Intel x86 E cores these have SMT enabled too, they say it's the same IPC at Clock rate but perhaps these are lower clocked thus higher density than Zen 4.

    I really hope this does not make it to the AM5 socket as some sort of hybrid abomination like Intel and ARM BS. Plus if AMD wants to create a hybrid solution on a Chiplet based design they have a lot of hoops 16Cores 32Threads per CCD and an IODie able to handle 4c and 4 on top of the whole MCM design will be a massive PITA. Plus AMD never said AM5 / Desktop Ryzen will be getting the c variant, and moreover the name signifies as Cloud for Zen 4c. Just imagine the complexity since the X3D 7900X3D and 7950X3D rely on OS scheduler this complication will make matters worse, so that's the best news I can think of - No Hybrid nonsensical BS on the AM5 socket. They'll dump this onto maybe Next Console revision as a subset of cores for the OS and Base subsystems over the main performance cores since Consoles are BGA and monolithic. Same for the BGA laptop and other disposable parts. Plus unlike Intel these will also have AVX512 execution blocks too.

    All in all very impressive to be able to cut the core down and get same IPC. Super high density too on top unlike ARM with no SMT.

    I wish to see the day when AMD releases 4-way SMT on their Zen x86 design processors it would be mind blowing for sure.
