Next Stop: the Uncore

Continuing with our review of Haswell architecture, let's again take a step back and use the Xeon 5500 as our reference point. The Xeon 5500 is based on the "Nehalem" architecture, and it helped Intel become dominant in the server market. Before the Xeon 5500, AMD's Opteron was still able to outperform the Xeons in quite a few applications (HPC and virtualization for example), even by significant margins. That changed with Nehalem, so the Xeon 5500 is a good reference point.

7-zip Benchmark – Single Threaded

The 27% cumultative IPC (integer only) improvement of Haswell mentioned is more than just theory: Anand's review of the desktop Haswell CPUs confirmed this. The Haswell Core i7-4770k at the same clock speed is about 21% faster than Nehalem. Now that is below the promised 27% performance increase, but 7-zip is among the applications known to have very low IPC.

Let's go back to the server world. Instead of increasing the clock speeds, clock speeds have declined from 2.93-3.2GHz (Xeon 5500) to 2.3-2.6GHz for the latest high-end parts. However, when Turbo Boost is enabled, 2.8 – 3.1GHz is possible with all cores active. So the clock speed of the high end server CPUs is actually 5 to 20% lower and not 10% higher as in the desktop space. The gains Intel has made in IPC are thus partly negated by slightly lower clock speeds.

Clock speed has clearly been traded in for more cores in most of server SKUs. But the additional cores can prove extremely useful. The SAP S&D application – one of the best industry benchmarks – runs about three times faster (see further) on the latest Xeon E5-2699 v3 than on the Xeon 5500.

This clearly puts into perspective how important the uncore part is for Xeons. The uncore parts makes the difference between a CPU that is only good at running a few handpicked benchmarks (like SPECint rate) but fails to achieve much in real applications, vs. an attractive product that can lower the IT costs by running more virtual machines and offering services to more users.

Refresher: the Haswell Core The Magic Inside the Uncore
POST A COMMENT

85 Comments

View All Comments

  • bsd228 - Friday, September 12, 2014 - link

    Now go price memory for M class Sun servers...even small upgrades are 5 figures and going 4 years back, a mid sized M4000 type server was going to cost you around 100k with moderate amounts of memory.

    And take up a large portion of the rack. Whereas you can stick two of these 18 core guys in a 1U server and have 10 of them (180 cores) for around the same sort of money.

    Big iron still has its place, but the economics will always be lousy.
    Reply
  • platinumjsi - Tuesday, September 9, 2014 - link

    ASRock are selling boards with DDR3 support, any idea how that works?

    http://www.asrockrack.com/general/productdetail.as...
    Reply
  • TiGr1982 - Tuesday, September 9, 2014 - link

    Well... ASRock is generally famous "marrying" different gen hardware.
    But here, since this is about DDR RAM, governed by the CPU itself (because memory controller is inside the CPU), then my only guess is Xeon E5 v3 may have dual-mode memory controller (supporting either DDR4 or DDR3), similarly as Phenom II had back in 2009-2011, which supported either DDR2 or DDR3, depending on where you plugged it in.

    If so, then probably just the performance of E5 v3 with DDR3 may be somewhat inferior in comparison with DDR4.
    Reply
  • alpha754293 - Tuesday, September 9, 2014 - link

    No LS-DYNA runs? And yes, for HPC applications, you actually CAN have too many cores (because you can't keep the working cores pegged with work/something to do, so you end up with a lot of data migration between cores, which is bad, since moving data means that you're not doing any useful work ON the data).

    And how you decompose the domain (for both LS-DYNA and CFD makes a HUGE difference on total runtime performance).
    Reply
  • JohanAnandtech - Tuesday, September 9, 2014 - link

    No, I hope to get that one done in the more Windows/ESXi oriented review. Reply
  • Klimax - Tuesday, September 9, 2014 - link

    Nice review. Next stop: Windows Server. (And MS-SQL..) Reply
  • JohanAnandtech - Tuesday, September 9, 2014 - link

    Agreed. PCIe Flash and SQL server look like a nice combination to test this new Xeons. Reply
  • TiGr1982 - Tuesday, September 9, 2014 - link

    Xeon 5500 series (Nehalem-EP): up to 4 cores (45 nm)
    Xeon 5600 series (Westmere-EP): up to 6 cores (32 nm)
    Xeon E5 v1 (Sandy Bridge-EP): up to 8 cores (32 nm)
    Xeon E5 v2 (Ivy Bridge-EP): up to 12 cores (22 nm)
    Xeon E5 v3 (Haswell-EP): up to 18 cores (22 nm)

    So, in this progression, core count increases by 50% (1.5 times) almost each generation.

    So, what's gonna be next:

    Xeon E5 v4 (Broadwell-EP): up to 27 cores (14 nm) ?

    Maybe four rows with 5 cores and one row with 7 cores (4 x 5 + 7 = 27) ?
    Reply
  • wallysb01 - Wednesday, September 10, 2014 - link

    My money is on 24 cores. Reply
  • SuperVeloce - Tuesday, September 9, 2014 - link

    What's the story with 2637v3? Only 4 cores and the same freqency and $1k price as 6core 2637v2? By far the most pointless cpu on the list. Reply

Log in

Don't have an account? Sign up now