Qualcomm Snapdragon S4 (Krait) Performance Preview - 1.5 GHz MSM8960 MDP and Adreno 225 Benchmarks
by Brian Klug & Anand Lal Shimpi on February 21, 2012 3:01 AM EST
We won't go too deep into Krait's CPU architecture, because we've already done so in an earlier piece. What we can provide, however, is a quick recap. Architecturally, Krait isn't a design of tradeoffs; rather, it's a significant step forward along almost all vectors. Each core can fetch, decode and execute more instructions in parallel than its predecessor (Scorpion, Snapdragon S1/S2/S3).
|Qualcomm Architecture Comparison|
| ||Scorpion||Krait|
|Pipeline Depth||10 stages||11 stages|
|L2 Cache (dual-core)||512KB||1MB|
|Core Configurations||1, 2||1, 2, 4|
Even if you're not comparing to Qualcomm's previous architecture, Krait maintains the same low-level advantage over ARM Cortex A9 based designs (NVIDIA Tegra 2/3, TI OMAP 4, Apple A5). Clock speeds are up with only a small increase in pipeline depth. The combination of these two factors alone should result in significant performance improvements, even for single-threaded applications. If you want to abstract by one more level: Krait will be faster regardless of application, regardless of usage model. You're looking at a generational gap in architecture here, not simply a clock bump.
| ||ARM11||ARM Cortex A8||ARM Cortex A9||Qualcomm Scorpion||Qualcomm Krait|
|Pipeline Depth||8 stages||13 stages||8 stages||10 stages||11 stages|
|Out of Order Execution||N||N||Y||Partial||Y|
|FPU||VFP11 (pipelined)||VFPv3 (not-pipelined)||Optional VFPv3 (pipelined)||VFPv3 (pipelined)||VFPv4 (pipelined)|
|NEON||N/A||Y (64-bit wide)||Optional MPE (64-bit wide)||Y (128-bit wide)||Y (128-bit wide)|
|Typical Clock Speeds||412MHz||600MHz/1GHz||1.2GHz||1GHz||1.5GHz|
The memory interface of the chip has been improved tremendously. At a high level, the MSM8960 is Qualcomm's first SoC to feature PoP support for two LPDDR2 memory channels. We suspect there are lower-level improvements to the memory interface as well, but we don't have more details from Qualcomm, and the current state of memory latency/bandwidth testing on Android is pretty abysmal.
Quantifying the Krait performance advantage requires a mixture of synthetic and application level tests. We'll start with Linpack, a Java port of the classic memory bandwidth/FPU test:
Occasionally we'll see performance numbers that just make us laugh at their absurdity. Krait's Linpack performance is one such case. The performance advantage here is insane. The MSM8960 is able to deliver more than twice the performance of any currently shipping SoC. The gains are likely due in no small part to improvements in Krait's cache/memory controller. Krait can also multi-issue FP instructions, whereas A9 class architectures can apparently only dual-issue integer instructions.
Krait and the MSM8960 are 20 - 35% faster than the dual-core Cortex A9s used in Samsung's Galaxy Nexus. For a look at how overall web page loading is impacted we loaded AnandTech.com three times and averaged the results. We presented results with the browser cache cleared after each run as well as results after all assets were cached:
|AnandTech.com Page Loading Comparison (Stock ICS Browser)|
| ||Browser Cache Cleared||Cache In Use|
|Qualcomm MDP MSM8960 (Krait)||5.5 seconds||3.0 seconds|
|Samsung Galaxy Nexus (ARM Cortex A9)||5.8 seconds||4.4 seconds|
There's hardly any advantage when you're network bound, which is to be expected. However, whenever the device can pull assets from a local cache (something that is quite common, as images, CSS and even many page elements remain static between loads), the advantage grows considerably. Here we're seeing a 46% advantage from Krait over the Cortex A9 in the Galaxy Nexus (4.4 seconds versus 3.0 seconds with the cache in use).
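As a quick check on the page loading figures, the short Python sketch below recomputes the advantage from the load times in the table. It assumes "advantage" means the ratio of the two load times minus one; the function and variable names are illustrative and not part of the original test procedure.

```python
def speed_advantage(baseline_s: float, candidate_s: float) -> float:
    """Percent by which the candidate is faster, expressed as a speed ratio.

    Assumption: "advantage" is baseline time / candidate time - 1.
    """
    return (baseline_s / candidate_s - 1.0) * 100.0


# Load times in seconds, copied from the page loading table.
load_times = {
    "cache_cleared": {"krait": 5.5, "cortex_a9": 5.8},
    "cache_in_use": {"krait": 3.0, "cortex_a9": 4.4},
}

advantages = {
    scenario: speed_advantage(t["cortex_a9"], t["krait"])
    for scenario, t in load_times.items()
}

for scenario, pct in advantages.items():
    print(f"{scenario}: Krait is {pct:.1f}% faster")
```

Under that definition the cached case works out to between 46% and 47%, and the network-bound case to a little over 5%, which matches the gap described above.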
We turn to Qualcomm's own Vellamo as a system/CPU/browser performance test:
Again, we're showing a huge performance advantage here thanks to Krait. Seeing as Vellamo is a Qualcomm benchmark, don't get too attached to the advantage here, but it does echo some of what we've seen earlier.
Finally we have Rightware's Basemark OS 1.1 RC, which is fast becoming an impressively polished system benchmark, one which will hopefully eventually take the place of the likes of Quadrant.
|Basemark OS - System|
| ||HTC Rezound||Galaxy Nexus||MDP MSM8960|
|System Overall Score||658||538||907|
|Simple Java 1||298 loops/s||210 loops/s||375 loops/s|
|Simple Java 2||7.28 loops/s||8.61 loops/s||10.8 loops/s|
|SMP Test||35.3 loops/s||49.2 loops/s||64.4 loops/s|
|100K File (eMMC->SD)||6.49 MB/s||9.52 MB/s||8.64 MB/s|
|100K File (SD->eMMC)||33.0 MB/s||17.8 MB/s||39.8 MB/s|
|100K File (eMMC->eMMC)||37.8 MB/s||34.5 MB/s||48.9 MB/s|
|100K File (SD->SD)||8.47 MB/s||8.30 MB/s||12.7 MB/s|
|Database Operation||10.0 ops/s||5.73 ops/s||19.4 ops/s|
|Zip Compression||0.509 s||0.848 s||0.561 s|
|Zip Decompression||0.097 s||0.206 s||0.073 s|
On the CPU-centric tests, Basemark OS is showing anywhere from a 20% to 80% increase in performance over the 1.5 GHz APQ8060 based HTC Rezound. IO performance is also tangibly improved, although that could be a function of NAND performance rather than the SoC specifically.
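The range quoted for the CPU tests can be reproduced from the Basemark OS table with a few lines of Python. This is only a sketch: treating Simple Java 1, Simple Java 2 and the SMP Test as the "CPU centric" tests is our assumption, and the names used below are illustrative.

```python
def percent_gain(old: float, new: float) -> float:
    """Percent increase of a higher-is-better score going from old to new."""
    return (new / old - 1.0) * 100.0


# (HTC Rezound, MDP MSM8960) results in loops/s, copied from the table.
# Assumption: these three rows are the CPU-centric tests.
cpu_tests = {
    "Simple Java 1": (298.0, 375.0),
    "Simple Java 2": (7.28, 10.8),
    "SMP Test": (35.3, 64.4),
}

gains = {
    name: percent_gain(rezound, msm8960)
    for name, (rezound, msm8960) in cpu_tests.items()
}

for name, pct in gains.items():
    print(f"{name}: +{pct:.0f}% over the HTC Rezound")
```

With these inputs the gains come out at roughly 26%, 48% and 82%, in line with the range given in the text.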
These results as a whole simply quantify what we've felt during our use of the MSM8960 MDP: this is the absolute smoothest we've ever seen Ice Cream Sandwich run.
sciwizam - Tuesday, February 21, 2012
Any thoughts on how Krait will compare against the A15 chips and when's the earliest will those be on market?
wapz - Tuesday, February 21, 2012
Look in the architecture article on page 1 here: http://www.anandtech.com/show/4940/qualcomm-new-sn...
"ARM hasn't published DMIPS/MHz numbers for the Cortex A15, although rumors place its performance around 3.5 DMIPS/MHz."
Krait has 3.3 DMIPS/MHz, so if a dual Cortex A15 would run at the same frequency they would be fairly comparable I would imagine (obviously ignoring all other elements that could help performance on either of them).
wapz - Tuesday, February 21, 2012
And if that's the case, HTC will have an interesting problem with their new lineup. It would mean, if the rumours are correct, their new flagship One X model using Tegra3 AP33 chipset at 1,5GHz and a 4,7 inch 720p screen might be slower compared to the One S, sporting the Snapdragon S4 chipset and a 4,3 inch screen with qHD.
Lucian Armasu - Tuesday, February 21, 2012
In that case even if the GPU's were equal in performance, the One S would be faster due to the fact that it uses a lower resolution.
zorxd - Tuesday, February 21, 2012
If by "faster", you mean more FPS in a 3D game, then yes.
The resulting image quality would be lower however.
metafor - Tuesday, February 21, 2012
Yes. FPS is only one factor in the overall equation of user experience. Higher resolution rendering is definitely preferable assuming one could maintain ~60fps.
zorxd - Tuesday, February 21, 2012
I'd rather have a 34 fps 1280x720 than a 60 fps 960x540.
sosrandom - Tuesday, February 21, 2012
or better graphics at 30fps @ 960x540 than 30fps @ 1280x720
trob6969 - Wednesday, February 22, 2012
I agree, i would take quality over speed any day as long as the difference in speed is measured in mere seconds.
vol7ron - Thursday, March 1, 2012
I can't wait to see the power savings, especially since the modem is a huge power draw, and one of the benefits is that QC is the manufacturer, which means integrated chip and less power consumption (as well as thinner device).