CPU Performance & Efficiency: SPEC2006

We continue our analysis with SPEC2006 on the Snapdragon 855 QRD. SPEC2006 is an important benchmark: not only is it a tool that many companies use to architect their CPU designs, it is also a very well understood and academically documented set of workloads that can serve as a macro-benchmark to determine microarchitectural aspects of a CPU and system.

It should be noted that SPEC2006 has been deprecated in favour of SPEC2017, and although we’ll switch to that at some point, SPEC2006 still represents a good benchmark for mobile platforms. Because our scores aren’t official submissions, as per SPEC guidelines we have to declare them as internal estimates on our part.

A Big Note on Power on the QRD

Although I was able to collect power figures for both CPU and GPU workloads for this article, the figures are not as certain as those measured on commercial devices. The reason is that, much like last year’s Snapdragon 845 QRD, this year’s 855 platform reports rather high idle power in the 950-1050mW range, about 500mW more than one would expect in a final product. Our power measurement methodology publishes active system power, meaning we measure total power during a given workload and subtract the idle power under the same conditions, so there is a degree of uncertainty when the idle power is this high by default.
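
To make the subtraction concrete, here is a minimal Python sketch of the active system power calculation. It is an illustration only, not the actual measurement tooling: the function names and the 3000mW total reading are hypothetical, and only the 950-1050mW idle range comes from the QRD figures above.

```python
def active_power_mw(total_mw: float, idle_mw: float) -> float:
    """Active system power: total power under load minus idle power."""
    if idle_mw > total_mw:
        raise ValueError("idle power cannot exceed total power under load")
    return total_mw - idle_mw


def active_power_range_mw(total_mw: float, idle_low_mw: float, idle_high_mw: float):
    """Bounds on active power when idle power is only known as a range."""
    # A higher idle figure yields a lower active figure, and vice versa.
    return (active_power_mw(total_mw, idle_high_mw),
            active_power_mw(total_mw, idle_low_mw))


# Hypothetical total reading during a workload (NOT a measured value).
TOTAL_MW = 3000.0
# Idle range reported for the Snapdragon 855 QRD in the text.
IDLE_LOW_MW, IDLE_HIGH_MW = 950.0, 1050.0

low, high = active_power_range_mw(TOTAL_MW, IDLE_LOW_MW, IDLE_HIGH_MW)
# Spread of the active figure relative to its midpoint, in percent.
spread_pct = (high - low) / ((high + low) / 2) * 100
print(f"active power: {low:.0f}-{high:.0f} mW (spread {spread_pct:.1f}%)")
```

The same 100mW of idle variation weighs more heavily on lighter workloads, since it is divided by a smaller active figure.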

Today’s power efficiency figures thus merely represent a guideline – and we’ll make sure to re-test the results once we get our hands on final commercial devices.

The Results – The Snapdragon 855 Performs Admirably

We’ll start off with the aggregate results and drill down in the detailed results later:

The Snapdragon 855 performs extremely well, ending up neck-and-neck with the Kirin 980, which shouldn’t come as too big of a surprise.

In SPECint2006, the Snapdragon 855 performs 51% better than the Snapdragon 845, all while improving power efficiency by 39% over its predecessor. Against the Kirin 980 which is currently its nearest Android competitor, the Snapdragon just slightly edges ahead by 4%.
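
For readers unfamiliar with how such aggregate figures come about: SPEC's overall score is a geometric mean of the per-benchmark ratios, and a generational gain is the ratio of two such means. The Python sketch below illustrates that arithmetic under stated assumptions. The per-benchmark ratios are made-up placeholders, not our measured results, and the helper names are hypothetical.

```python
import math


def geometric_mean(ratios):
    """Geometric mean, computed via logarithms for numerical stability."""
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def percent_gain(new_score: float, old_score: float) -> float:
    """Relative improvement of new_score over old_score, in percent."""
    return (new_score / old_score - 1.0) * 100.0


# Placeholder per-benchmark ratios (NOT measured data).
old_ratios = [10.0, 20.0, 40.0]
new_ratios = [15.0, 30.0, 60.0]

old_score = geometric_mean(old_ratios)
new_score = geometric_mean(new_ratios)
print(f"aggregate gain: {percent_gain(new_score, old_score):.1f}%")
```

Because the mean is geometric, a single outlier sub-test moves the aggregate less than it would in an arithmetic average.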

In SPECfp2006, the Snapdragon 855 shows an even bigger 61% leap over the Snapdragon 845, and also manages to better showcase the 9% clock speed advantage over the Kirin 980, sporting a similar performance lead.
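
A rough way to read these leads against the clock speed difference is to divide the performance ratio by the clock ratio. The Python sketch below does this with the aggregate figures quoted above. It is a first-order approximation only, since performance does not scale linearly with frequency in memory-bound workloads, and the 9% SPECfp2006 lead is my reading of "a similar performance lead" rather than an exact figure. The function name is hypothetical.

```python
def per_clock_ratio(perf_ratio: float, clock_ratio: float) -> float:
    """Performance ratio with the clock speed difference divided out."""
    if clock_ratio <= 0:
        raise ValueError("clock ratio must be positive")
    return perf_ratio / clock_ratio


# Snapdragon 855 Prime core vs Kirin 980 clock advantage, from the text.
CLOCK_RATIO = 1.09
# SPECint2006 lead over the Kirin 980, from the text.
INT_PERF_RATIO = 1.04
# Assumption: "a similar performance lead" in SPECfp2006 taken as about 9%.
FP_PERF_RATIO = 1.09

for name, perf in (("SPECint2006", INT_PERF_RATIO), ("SPECfp2006", FP_PERF_RATIO)):
    ratio = per_clock_ratio(perf, CLOCK_RATIO)
    print(f"{name}: per-clock ratio vs Kirin 980 = {ratio:.3f}")
```

A value below 1.0 suggests the clock advantage is not fully converted into performance in that suite, while a value near 1.0 suggests roughly equal per-clock throughput.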

Again, what is most important in these results are the power efficiency figures. One of the things that had me worried during Qualcomm’s Snapdragon 855 launch in Hawaii last month was that the company pretty much avoided talking about or publishing any meaningful power efficiency claims on the CPU side. Fortunately it seems there wasn’t any need to be concerned, as the Snapdragon 855, at first glance, seems to be extremely efficient even on the high-clocked 2.85GHz Prime core.

Detailed Results

Drilling down into the detailed results, the most interesting comparison is the performance of the Snapdragon 855 against the Kirin 980. On one hand, the Snapdragon 855 is clocked 9% higher and promises some tuned microarchitectural characteristics that should improve IPC. On the other hand, HiSilicon’s implementation is more straightforward and brings with it a bigger L3 cache as well as memory latency advantages.

In the vast majority of workloads, both chipsets are neck-and-neck, only diverging in some key aspects. In workloads that are less demanding of the memory hierarchy, the Snapdragon is able to more easily showcase its clock speed advantage. In more latency sensitive workloads, this difference shrinks or reverses. 462.libquantum is an interesting result, as Qualcomm commented that its lead here is primarily due to the customisations made to the CPU core, although they wouldn’t specify exactly which aspect in particular brings the boost.

The biggest performance discrepancy on the negative side of things is the 13% disadvantage in 458.sjeng – the benchmark is most sensitive to branch mispredictions and again here Qualcomm has stated they’ve made changes to the branch data structures of the core.

The result I find most odd is that 429.mcf performs admirably well on the Snapdragon 855, which goes against intuition given the platform’s memory latency disadvantage. Is it possible that the Snapdragon 855 performs better than the Kirin 980 here due to its better L3 cache latency?

The SPECfp2006 results can be very clearly categorised into two sets. In one set, the Snapdragon 855 showcases a healthy advantage over the Kirin 980, up to very notable 17% and 22% leads in 447.dealII and 453.povray. In the other set, the Snapdragon is again neck-and-neck with the Kirin 980, and these again happen to be the workloads that are most memory sensitive in the FP suite.

Overall, the Snapdragon 855’s CPU performance does not disappoint. Performance on average is ahead of the Kirin 980, although not by much. Here both chipsets are most of the time neck-and-neck, and it will mostly depend on the workload which of the two will take the lead.

More important than performance, the efficiency of the Snapdragon 855 is top-notch, exceeding what I had expected from the higher clock implementation of the chip. There is still a degree of uncertainty over the power numbers on the QRD platform, but if these figures are representative of commercial devices, then 2019’s flagships will see excellent battery life.

132 Comments

  • tipoo - Tuesday, January 15, 2019

    Untrue. Apple's cores are wider, deeper, more OoO than anything else in mobile, and use massive caches at that. You have it reversed, if Android could use the A12 it would post impressive benchmarks, it's hardware design.

    Low level benchmarks are meant to remove the OS from the equation. Proof is in the pudding.
  • goatfajitas - Tuesday, January 15, 2019

    The A12 is a great CPU, but it's not magic. It's all ARM. The difference is in the implementation and control that Apple has with integration. Whatever though, both ways have benefits and downsides. I am just saying that people that think it's all about this CPU that is somehow years ahead of everyone else are mistaken as to the reality of the situation. Suffice to say, it's all fast.
  • axius81 - Tuesday, January 15, 2019

    This just doesn't make sense. "It's all ARM." Yeah, sure, and one company's implementation of that instruction set can absolutely be superior.

    That's like saying "It's all x86 / x86-64." when we're comparing AMD and Intel. One can *absolutely* be faster than the other at implementing that instruction set - and in practice, is.

    Apple makes amazing ARM chips, irrespective of iOS.
  • goatfajitas - Tuesday, January 15, 2019

    They are great chips, I am just saying they are not (hardware wise) way beyond what the competition is doing. A lot of that performance is OS, tight integration with apps, drivers, APIs etc as it's all controlled by one company. That isn't a bad thing, that is a good thing for Apple customers.
  • techconc - Tuesday, January 15, 2019

    Actually, Apple is significantly ahead of what the competition is doing with ARM based chips. This can be objectively measured.
  • tipoo - Wednesday, January 16, 2019

    What do you call their massive cache and issue width advantage if not being hardware wise beyond the competition? It's not magic, but Apple is clearly spending more on die area than Qualcomm is.
  • bji - Tuesday, January 15, 2019

    Yeah I don't think you know what you're talking about. I think you read somewhere that some of Apple's performance/stability superiority over Android comes from Apple controlling the whole stack and you've generalized that into places where the statement just isn't true.
  • techconc - Tuesday, January 15, 2019

    You seem to conflate the ARM instruction set with the actual design of the chip. You then play off Apple's obvious advantages as some sort of magic... err.. "integration" as you call it. That's nonsense. You might be able to claim that for a specific application, but not for generic benchmarks.
  • tipoo - Wednesday, January 16, 2019

    I didn't say it was magic. I said it's not entirely down to some ambiguous "optimization" with the OS. The cores themselves are physically impressive regardless of OS. It's when people play it off as some pie in the sky optimization advantage that they're claiming magic, you can't make a 3-wide Braswell core fly just with vertical integration.

    "It's all ARM."

    This shows me you may have missed a crucial step, Apple is only licencing the ARM instruction set, but otherwise they design the whole very wide, deep, very OoO core themselves.