Java Performance

The SPECjbb 2015 benchmark has "a usage model based on a world-wide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases, and data-mining operations." It uses the latest Java 7 features and makes use of XML, compressed communications, and messaging with security.

We tested with four groups of transaction injectors and backends. We use the "Multi JVM" test because it is more realistic: running multiple VMs on a single server is very common practice.

The Java version was OpenJDK 1.8.0_91. We applied relatively basic tuning to mimic real-world use, while aiming to fit everything inside a server with 128 GB of RAM:

"-server -Xmx24G -Xms24G -Xmn16G -XX:+AlwaysPreTouch -XX:+UseLargePages"
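A quick way to confirm that a JVM really picked up heap flags like the ones above is to ask the runtime itself. The following is only a minimal sketch: the class name `HeapCheck` and the helper `bytesToGiB` are hypothetical and not part of the benchmark, and the heap size the JVM reports can differ slightly from the `-Xmx` value.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class, not part of SPECjbb: prints the heap limit and
// the arguments the running JVM was started with.
final class HeapCheck {

    // Converts a byte count to GiB (1 GiB = 2^30 bytes).
    static double bytesToGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // Upper bound on heap memory the JVM will attempt to use.
        long maxHeap = Runtime.getRuntime().maxMemory();

        // Flags passed to the JVM itself (e.g. -Xmx24G), not program arguments.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();

        System.out.printf("Max heap: %.1f GiB%n", bytesToGiB(maxHeap));
        System.out.println("JVM arguments: " + jvmArgs);
    }
}
```

After compiling, starting it with the same flags as the benchmark JVMs (for example `java -Xmx24G -Xms24G -Xmn16G HeapCheck`) should list those flags in the output.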

The graph below shows the maximum throughput numbers for our Multi JVM SPECjbb test.

[Chart: SPECjbb 2015-Multi Max-jOPS]

The Critical-jOPS metric is a throughput metric under a response-time constraint.
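To make the "throughput under a response-time constraint" idea concrete, here is a rough sketch. It assumes the published SPECjbb 2015 definition, in which critical-jOPS is the geometric mean of the throughput sustained at several 99th-percentile response-time limits (10, 25, 50, 75 and 100 ms). The class name `CriticalJops` and the sample throughput values are hypothetical and are not measurements from this review.

```java
// Hypothetical illustration, not SPEC's reporter code.
final class CriticalJops {

    // Geometric mean computed via logarithms to avoid overflow on large products.
    static double geometricMean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("need at least one value");
        }
        double logSum = 0.0;
        for (double v : values) {
            if (v <= 0.0) {
                throw new IllegalArgumentException("values must be positive");
            }
            logSum += Math.log(v);
        }
        return Math.exp(logSum / values.length);
    }

    public static void main(String[] args) {
        // Made-up jOPS sustained at the 10, 25, 50, 75 and 100 ms limits.
        double[] jopsAtLimit = {20000, 30000, 38000, 42000, 45000};
        System.out.printf("critical-jOPS (sketch): %.0f%n", geometricMean(jopsAtLimit));
    }
}
```

Because a geometric mean is used, a system that collapses at the tightest limit is penalized more heavily than it would be by a simple average.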

The 8-core Tyan POWER8 server offers about 72% of the performance of the 10-core IBM S812LC. That is not too bad, as the latter not only has 25% more cores, but its chip can also boost 16% higher. In total, the IBM POWER8 CPU inside the 2U S812LC offers about 45% greater processing power ("35 GHz" vs "24 GHz") and delivers about 40% better performance. So compared to the S812LC, the 1U Tyan delivers very decent performance.
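The percentages in this comparison can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted above (aggregate "35 GHz" vs "24 GHz", 10 vs 8 cores, 72% relative performance); the class name `ClockBudget` and the helper `percentMore` are hypothetical.

```java
// Hypothetical worked example of the ratios quoted in the text.
final class ClockBudget {

    // How many percent larger a is than b.
    static double percentMore(double a, double b) {
        return (a / b - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        double s812lcAggregateGhz = 35.0; // cores x clock, as quoted
        double tyanAggregateGhz = 24.0;   // cores x clock, as quoted
        double tyanRelativePerf = 0.72;   // Tyan score relative to the S812LC

        // Aggregate clock advantage of the S812LC: roughly 46%.
        System.out.printf("Processing power: +%.1f%%%n",
                percentMore(s812lcAggregateGhz, tyanAggregateGhz));

        // Core-count advantage, 10 cores versus 8: 25%.
        System.out.printf("Cores: +%.1f%%%n", percentMore(10, 8));

        // Measured advantage: 1 / 0.72, roughly 39%.
        System.out.printf("Performance: +%.1f%%%n",
                percentMore(1.0, tyanRelativePerf));
    }
}
```

The gap between the roughly 46% clock budget and the roughly 39% measured advantage is why the 1U result counts as decent scaling.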

But Intel is the one to beat. And by caging the POWER8 inside a 1U, performance has dropped below that of the power-sipping (90 W TDP!) Xeon E5-2640 v4.

[Chart: SPECjbb 2015-Multi Critical-jOPS]

Meanwhile, our next benchmark is a good reminder that OpenJDK 8's performance is not optimal for the POWER8. The IBM JDK does not offer much better throughput unless you start tuning frantically. However, with reasonable tuning it does increase the most important score, critical-jOPS.

However, while the more powerful 2U POWER8 can still keep up with Intel's best and most expensive (it is only 9% slower), the frequency-capped CPU inside the Tyan 1U fails to impress, as it trails the less expensive and less power-hungry Xeon E5-2640 v4 by a large margin.

28 Comments

  • Einy0 - Friday, February 24, 2017

    Articles like these make me wonder if some of these companies using IBM eServer iSeries (AS/400) as mid-level servers are wasting their money. I was always under the impression that Power was supposed to be tuned for database-heavy workloads and hence have a massive advantage in doing so. I know the iSeries servers run an OS with DB2 built-in and tuned specifically for it but how much of an advantage does that really equate to?
  • FunBunny2 - Friday, February 24, 2017

    -- I know the iSeries servers run an OS with DB2 built-in and tuned specifically for it but how much of an advantage does that really equate to?

    unless IBM has done a complete port recently, AS/400 "integrated database" was built before server versions of DB2 existed. it's/was just a retronym.
  • kfishy - Friday, February 24, 2017

    As ISAs becoming more and more relevant in the post-Moore's law world, where you can't solve a computational problem just by throwing ever more transistors at it, I wonder if this opens up opportunity for POWER to carve out niches left out by Intel's more fixed and general purpose approach.

    At the same time, POWER will have to contend with a nascent but rising and truly open ISA in RISC-V, where companies can simply implement the subsets of the ISA that they need. The next decade in processor architecture is going to be interesting to watch.
  • FunBunny2 - Friday, February 24, 2017

    -- As ISAs becoming more and more relevant in the post-Moore's law world, where you can't solve a computational problem just by throwing ever more transistors at it

    given that ISA has been reduced to z, ARM, and X86 not counting Power, of course. and ARM might not really qualify as equivalent. for those ancient enough, or well read enough, know that up to and during the "IBM and 7 Dwarves" era, ISA and even base architecture, made a varied ecosystem. not so much anymore. and I doubt anyone will invent a more efficient adder or multiplier or any other subunit of the real CPU. just look at the screen shots of chips over the last couple of decades: the real CPU area of a chip is nearly disappeared. in fact, much (if not most) of the transistor budget for some years has been used for caching, not ISA in hardware. so called micro-architecture is just a RISC CPU, and the rest of the chip is those caches and ever more complicated "decoder". that and integrating what had previously been other-chip functions. IOW, approaching monopoly control of compute.

    I expect the next decade to be more of the same: more cache and more off-chip function brought on chip. actual CPU ISA, not so much.
  • aryonoco - Saturday, February 25, 2017

    Thank you Johan. Great article.

    Not all AnandTech articles live up to the standards set in the days past, but your articles continue your own excellent standards.

    Very much looking forward to POWER 9 chips. Hopefully they have also done the work to port the toolchain and important software already to it this time and we won't have to wait another 12 months after release to be able to compile normal Linux programs on it.

    Also, 12 fans running at 15,000 rpm in a 1U? What did that sound like?! Wow!
  • JohanAnandtech - Sunday, February 26, 2017

    Thx Aryonoco. Not all of those 12 fans were running at top speed, but imagine the sound of a jumbo jet taking off. It clearly shows how hard it is to cool IBM's best in a 1U: you have to limit the clock speed to about 2/3 of what it is capable of and double the number of fans.
  • yuhong - Wednesday, March 1, 2017

    "Unfortunately, the latest 8 Gbit based DIMMs are not supported."
    Micron don't make these chips anymore:
    http://media.digikey.com/pdf/PCNs/Micron/PCN_32042...
    Interestingly, Crucial is selling 32GB DDR3 quad rank RDIMMs again (but not LR-DIMMs):
    http://www.crucial.com/usa/en/ct32g3erslq41339
  • mystic-pokemon - Sunday, March 5, 2017

    For folks who are saying that POWER only looks good on paper. NOT true.

    I know a shit ton of stuff about one of the servers Johan listed above. He has a point when he says power consumption is only so important.
    In short, when you combine all aspects into a TCO model, the POWER8 server delivers the most optimal TCO value.
    We consider all of the following in our TCO model:
    a) Cost of ownership of the server
    b) Warranty (Lesser than conventional server, different model of operations)
    c) What it delivers (how many independent threads (SMT8 on POWER8, remember? 192 hardware threads), how much memory bandwidth (230 GB/s), how much total memory capacity in 1 server (1 TB with 32 GB))
    d) For a public cloud use-case, how many VMs (with x HW threads and x memory cap / bw ) can you deliver on 1 POWER8 server compared to other servers in fleet today ? Based on above stats, a lot .
    e) Data center floor lease cost in DC ( 24 of these servers in 1 Rack, much denser. Average the lease over age of server: 3 years ). This includes all DC services like aggers, connectivity and such.
    f) Cost per KWH in the specific DC ( 1 Rack has nominal power 750W)

    All this combined, POWER has a good TCO. It's a massively parallel server, which is where its major advantage comes from. Choose your workload wisely. That's why companies continue to work on it.

    I am talking about all this without even combining it with CAPI over PCIe and OpenCAPI. With POWER9 all this is getting even better. Get it? POWER is not going anywhere.
