Last year Qualcomm announced a new tier in its high-end SoC roadmap with the Snapdragon 805. Priced somewhat above the current Snapdragon 800/801, the 805 would be the last 32-bit high-end SoC from Qualcomm. It would be the grand finale in Krait's lineage, which started back in 2012 with Krait 200 and MSM8960 and saw iterative improvements over the years. The Snapdragon 805 was designed not only to drive CPU performance higher but also to be the launch vehicle for Qualcomm's brand new Adreno 4xx GPU architecture.

The Snapdragon 805 SoC is a beast. It features four Krait 450 cores, each a mild tweak of the Krait 400 design used in the S800/801. These cores can now run at up to 2.7GHz, compared to 2.5GHz in the Snapdragon 801 (Krait 400). As always, Qualcomm advertises customer-friendly frequencies rounded up to the nearest 100MHz; the actual max frequency of each Krait 450 core is 2.65GHz (compared to 2.45GHz in Krait 400).

The 8% increase in max frequency comes from tuning at the circuit level; there's no impact on IPC. All four cores sit behind a shared 2MB L2 cache. As is the case with all multi-core Krait SoCs, each CPU core can be power gated, clock gated and even clocked independently of the rest.
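As a quick sanity check on that 8% figure, the short Python sketch below computes the clock uplift from both the actual and the marketed frequencies quoted above. The function name is ours, chosen for illustration only.

```python
def percent_increase(old_mhz: float, new_mhz: float) -> float:
    """Relative increase from old_mhz to new_mhz, in percent."""
    return (new_mhz - old_mhz) / old_mhz * 100.0


# Actual max frequencies: Krait 400 at 2.45GHz, Krait 450 at 2.65GHz.
actual = percent_increase(2450, 2650)

# Marketed (rounded-up) frequencies: 2.5GHz and 2.7GHz.
marketed = percent_increase(2500, 2700)

print(f"actual: {actual:.1f}%, marketed: {marketed:.1f}%")
# prints "actual: 8.2%, marketed: 8.0%"
```

Either way you slice it, the gain rounds to about 8%. With IPC unchanged, that is also the most CPU-bound performance should improve from the clock bump alone.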

The S805 features Qualcomm's Adreno 420 GPU with full support for OpenGL ES 3.1 (with some extensions), OpenCL 1.2 and Direct3D feature level 11_2 (with a hardware tessellation engine). In typical Qualcomm fashion, the company isn't disclosing any material details on the underlying Adreno 420 architecture, so we'll have to guess based on what the benchmarks tell us. Adreno 420 includes support for Adaptive Scalable Texture Compression (ASTC), a new texture compression format first introduced by ARM in 2011.

There are other architectural improvements including better texturing performance and faster depth rejection. The architecture should be more efficient than Adreno 3xx as well, making better use of the underlying hardware.

The GPU runs at a max frequency of 600MHz.

Qualcomm claims a 20% reduction in power consumption compared to Adreno 330 (Snapdragon 800) when running the T-Rex HD test from GFXBench at 1080p (onscreen).

For the first time, the GPU now gets its own direct path to the SoC's memory interface. In the past the GPU shared a bus with the ISP and video engines, but in order to feed the beast that had to change. The memory interface on S805 features two 64-bit LPDDR3-800 partitions (4 x 32-bit external interfaces), each capable of supporting 1600MHz datarate LPDDR3 for an aggregate peak theoretical bandwidth figure of 25.6GB/s. The Krait 450 cores themselves aren't big enough to use all of that memory bandwidth. The wide memory interface is really there for the GPU and video engines. We haven't seen a memory interface this wide on a mobile SoC since Apple's A5X/A6X designs.
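The 25.6GB/s figure falls straight out of the bus width and data rate. Here is a minimal Python sketch of that arithmetic; the helper name is ours, and the comparison point assumes a 2 x 32-bit interface running at the same 1600MT/s data rate (the LPDDR3-800 configuration of the Snapdragon 800/801).

```python
def peak_bandwidth_gb_s(channels: int, bits_per_channel: int,
                        data_rate_mt_s: float) -> float:
    """Peak theoretical bandwidth in GB/s (1 GB = 10^9 bytes).

    data_rate_mt_s is the transfer rate in megatransfers per second,
    i.e. twice the memory clock for DDR memory.
    """
    bus_bytes = channels * bits_per_channel / 8   # bytes moved per transfer
    return bus_bytes * data_rate_mt_s / 1000.0    # MB/s -> GB/s


# Snapdragon 805: 4 x 32-bit LPDDR3 at 1600MT/s (800MHz clock).
s805 = peak_bandwidth_gb_s(channels=4, bits_per_channel=32, data_rate_mt_s=1600)

# Assumed comparison: 2 x 32-bit LPDDR3 at the same data rate.
narrow = peak_bandwidth_gb_s(channels=2, bits_per_channel=32, data_rate_mt_s=1600)

print(f"S805: {s805:.1f} GB/s, 64-bit interface: {narrow:.1f} GB/s")
# prints "S805: 25.6 GB/s, 64-bit interface: 12.8 GB/s"
```

These are peak theoretical numbers; sustained bandwidth in practice will be lower.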

In order to accommodate the wider memory interface but still make Snapdragon 805 suitable for use in a smartphone as well as a tablet, Qualcomm turned to a different packaging technology. Since the Snapdragon 805 is an APQ part, it lacks the integrated modem of the MSM SoCs we've found in most of Qualcomm's recent flagships. S805 uses a Moulded Embedded Package (MEP) that allows Qualcomm to route its 128-bit wide memory interface to on-package DRAM, giving it all of the benefits of a PoP stack as well as the wider memory interface. Qualcomm wouldn't provide me with a ton of details on MEP other than to say that rather than using the perimeter of the SoC's package to connect to memory stacked above it, MEP uses a substrate layer on top of the SoC to connect to the memory, giving the SoC more surface area to route lines to the DRAM. Qualcomm also claims the amount of metal it uses in the DRAM's substrate layer has some small impact on improving thermals on the overall package. The result is that Snapdragon 805 is still compact enough to go into a smartphone as long as the design can accommodate a discrete modem.

The Snapdragon 805 also marks Qualcomm's first SoC with a hardware H.265/HEVC video decode engine. There's no hardware H.265 encode acceleration, however; that won't come until Snapdragon 810 in 2015.

The S805's ISP sees an increase in performance as well. The SoC retains Qualcomm's dual-ISP design, now capable of pushing up to 1.2 Gigapixels/s through the engine. If Qualcomm arrives at that number the same way as it has in the past, that would imply a 600MHz ISP operating frequency (up from 465MHz in the Snapdragon 801). The new ISP supports up to four MIPI camera inputs (TrioCam + FF anyone?). The ISP can support 4k30 and 1080p120 video capture.
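The ISP frequency inference above can be sketched in a few lines of Python. This assumes each ISP processes one pixel per clock, which is our reading of how Qualcomm arrives at its throughput number rather than anything the company has confirmed. The second helper compares the raw pixel rates of the two capture modes, assuming "4k" means 3840 x 2160. Both function names are ours.

```python
def implied_isp_clock_mhz(throughput_mpix_s: float, isp_count: int = 2,
                          pixels_per_clock: int = 1) -> float:
    """ISP clock (MHz) implied by an aggregate throughput in megapixels/s.

    Assumes every ISP retires pixels_per_clock pixels each cycle.
    """
    return throughput_mpix_s / (isp_count * pixels_per_clock)


def capture_rate_mpix_s(width: int, height: int, fps: int) -> float:
    """Raw pixel rate of a video capture mode, in megapixels/s."""
    return width * height * fps / 1e6


s805_clock = implied_isp_clock_mhz(1200)  # 1.2 Gigapixels/s, dual ISP
s801_clock = implied_isp_clock_mhz(930)   # 930 MP/s, dual ISP

uhd30 = capture_rate_mpix_s(3840, 2160, 30)    # 4k30, assuming UHD resolution
fhd120 = capture_rate_mpix_s(1920, 1080, 120)  # 1080p120

print(f"S805 ISP: {s805_clock:.0f} MHz, S801 ISP: {s801_clock:.0f} MHz")
print(f"4k30: {uhd30:.1f} MP/s, 1080p120: {fhd120:.1f} MP/s")
```

Under those assumptions the two capture modes push exactly the same number of pixels per second (roughly 249 MP/s), well below the ISP's 1.2 Gigapixels/s peak.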

Qualcomm also claims improved autofocus performance and better noise reduction.

Just as in years past, Qualcomm invited us out to a benchmarking workshop to get some hands-on time with its Snapdragon 805 Mobile Development Platform (MDP) ahead of actual device availability. And just like we saw with the Snapdragon 800 benchmarking workshop, the S805's MDP comes in tablet form. The Snapdragon 805 MDP/T features a 10.6" 2560 x 1440 display, 3GB of LPDDR3 memory and 64GB of internal storage (eMMC 5.0). The chassis looks very similar to previous MDP/T designs.

Just as before, the benchmarks that follow are of a pre-production device that isn't shipping hardware. Although Qualcomm has significantly improved the delta we've seen between MDPs and shipping devices, there's always the caveat that performance could be different once we are looking at a shipping device, running on battery power. Although Qualcomm gave us access to the MDP/T, the devices were running on AC power with no power instrumentation connected. Qualcomm's own data shows a reduction in power consumption for Snapdragon 805 vs. 800, but once again we'll have to wait for shipping devices to really understand the impact of the SoC on battery life. What follows is exactly what the title of this piece indicates: a preview of Snapdragon 805 performance. Although Qualcomm pre-loaded the MDP/T with some commonly used benchmarks, we installed our own copies of everything we ran.

Qualcomm's Snapdragon 8xx Lineup

| | Snapdragon 810 | Snapdragon 808 | Snapdragon 805 | Snapdragon 801 | Snapdragon 800 |
|---|---|---|---|---|---|
| Internal Model Number | MSM8994 | MSM8992 | APQ8084 | MSM8974 v3 | MSM8974 v2 |
| Manufacturing Process | 20nm | 20nm | 28nm HPm | 28nm HPm | 28nm HPm |
| CPU | 4 x ARM Cortex A57 + 4 x ARM Cortex A53 (big.LITTLE) | 2 x ARM Cortex A57 + 4 x ARM Cortex A53 (big.LITTLE) | 4 x Qualcomm Krait 450 | 4 x Qualcomm Krait 400 | 4 x Qualcomm Krait 400 |
| ISA | 32/64-bit ARMv8-A | 32/64-bit ARMv8-A | 32-bit ARMv7-A | 32-bit ARMv7-A | 32-bit ARMv7-A |
| GPU | Adreno 430 | Adreno 418 | Adreno 420 | Adreno 330 | Adreno 330 |
| H.265 Decode | Yes | Yes | Yes | No | No |
| H.265 Encode | Yes | No | No | No | No |
| Memory Interface | 2 x 32-bit LPDDR4-1600 | 2 x 32-bit LPDDR3-933 | 4 x 32-bit LPDDR3-800 | 2 x 32-bit LPDDR3-800/933 | 2 x 32-bit LPDDR3-800/933 |
| Integrated Modem | 9x35 core, LTE Category 6/7, DC-HSPA+, DS-DA | 9x35 core, LTE Category 6/7, DC-HSPA+, DS-DA | - | 9x25 core, LTE Category 4, DC-HSPA+, DS-DA | 9x25 core, LTE Category 4, DC-HSPA+, DS-DA |
| Integrated WiFi | - | - | - | - | - |
| eMMC Interface | 5.0 | 5.0 | 5.0 | 5.0 | 4.5 |
| Camera ISP | 14-bit dual-ISP | 12-bit dual-ISP | 1.2 GP/s | 930 MP/s | 640 MP/s |
| Shipping in Devices | 1H 2015 | 1H 2015 | 2H 2014 | Now | Now |

I pulled comparison results from our new combined Phone/Tablet 2014 category in Bench. The key comparisons here are the iPad Air (for obvious reasons), ASUS' Transformer Pad TF701T (Tegra 4 in a tablet), ASUS' Transformer Book T100 (Intel's Bay Trail in a tablet) and the HTC One (M8)/Samsung Galaxy S 5 (both are Snapdragon 801 devices). With the exception of the Bay Trail based T100, everything else runs iOS or Android.

CPU Performance

  • testbug00 - Thursday, May 22, 2014 - link

    lol. All right, I'll just ignore the fact that Apple made a custom ARMv8 CPU, the fact it has a GPU team, to match its Uncore and CPU teams, and the fact that there are mobile GPU manufacturers who would reasonably license their overall design to Apple. Also, Imagine Tech (and ARM, and Qualcomm, and Samsung, and others) could all make GPUs that were faster than Tegra K1. But, there is no reason to. Waste of money, waste of power (in the design), etc.

    Oh, and ignore the fact that Apple is willing to spend more money on die. You have to remember that Nvidia is planning to sell these on a profit, that means they need to minimize die space. Typical semiconductor design comes down to Power, Performance and Area. NVidia Tegra seems to follow Performance, Area, Power. Tegra K1 might be the first to go Power, Area, Performance (impossible to tell without real retail products...)

    Apple, on the other hand, targets Power, Performance, Area. That means that as long as the chip will fit inside the phone, they would be fine making a 200mm^2 die. Making a larger die means you can reduce power due to various reasons. Of course, making a die smaller also allows you to reduce power by shortening distances (this (and lack of interdie connect, larger cache and faster caches) is a reason why Maxwell managed to reduce power so much).

    I am also using historical precedent:
    -NVidia claimed Tegra 2 brought mobile up to parity with Xbox 360/PS3 (And Tegra K1, not sure about 3 and 4) which, well, Tegra 2 was not, Tegra K1 will be not (due to bandwidth for the most part, imho. Given it had more bandwidth, it certainly could beat the Xbox 360/PS3)
    -Nvidia showed Tegra 4 beating iPad (did it for Tegra 2 and 3? I don't remember) and it lost upon next iPad.
    -Nvidia claimed Tegra 2 was great perf/watt. And Tegra 3, and Tegra 4. They all were bad compared to Qualcomm (and Apple)

    I don't take Nvidia's claims for much, because, they stink. Hopefully Tegra K1 fixes it. I would rather we did not have a player in the market "die" (read: move to focusing almost wholly on automotive) especially not it being after that company finally got its act together.
  • name99 - Thursday, May 22, 2014 - link

    Without going into Apple being faster, it's clearly silly to claim that "all the SOC manufacturers rely on other companies tech to build their GPU". Who is "all the SOC manufacturers"?
    Qualcomm use their own GPU, as does nV. Soon enough so will AMD.

    Apple is the only interesting SoC manufacturer that uses someone else's tech for their GPU, and there are plenty of indications (eg large buildup in GPU HW hires) that this is going to change soon --- if not with the A8, then with the A9.
  • fivefeet8 - Thursday, May 22, 2014 - link

    It's hard to have coherent dialog with you going on tangents. For this year it seems the competition from Qualcomm will be the 805 which we now know will not be as performant as the Tegra K1.
  • tuxRoller - Thursday, May 22, 2014 - link

    How do we KNOW this?
    I've struggled to find third-party, comprehensive benchmarks for either the mipad or the tk1, especially ones that include power draw (those numbers some random person threw up in a forum aren't tremendously useful, with regards to the mipad).
    Also, the adreno420's drivers are, apparently, not well optimized.
    Basically, until AT, toms, or someone similar get their hands on it I won't feel like I know how things stack up.
  • fivefeet8 - Friday, May 23, 2014 - link

    There are slides from the IHV presentation for the MiPad showing power usage patterns. Phoronix also did some testing of the TK1 board which shows power usage well below what some seem to be thinking. As for Adreno drivers, they've always been bad and not well optimized.

    https://dolphin-emu.org/blog/2013/09/26/dolphin-em...
  • tuxRoller - Friday, May 23, 2014 - link

    When did phoronix release power numbers?
    The mipad presentation looked like copy paste from the Nvidia material.
    The adreno drivers aren't great. Which should tell you how good that hardware really is. Rob Clark, the lead developer of the open source freedreno driver, is already at least matching their performance up to opengl 2.1 gl|es 2. He's mentioned that he's found evidence of the hardware supporting advanced gl extensions not advertised by the driver. This may change as Qualcomm has recently joined linaro so they will probably be seeking to work more with upstream. The end result of that process is always better quality.
    Lastly, don't forget that adreno is a legacy of bitboys, and that Qualcomm is no Intel, even though they are the same size. Qualcomm seems to actually be interested in making top performing GPUs.
  • Ghost420 - Friday, May 23, 2014 - link

    isn't Apple's die known to be the biggest of all the SOCs? they can afford to have a big power sucking SOC cuz they can optimize for it. only reason Iphones last long on battery...besides being lower clocked, cuz look how small and dinky that screen is...
  • Ghost0420 - Wednesday, May 28, 2014 - link

    Exactly, it's DOWNCLKED to 600Mhz and still spanking the competition
  • kron123456789 - Thursday, May 22, 2014 - link

    "Guess where it is in the MiPad?

    In the 600Mhz range. " — Proof? BTW, even if it is, MiPad is still more powerful than S805 MDP.
  • testbug00 - Thursday, May 22, 2014 - link

    Proof is loose... based off of talking with someone who has worked closely with Nvidia and TSMC in the past (Tesla, Fermi, a few Tegra chips).

    They have been quite accurate before... When the tablet comes we will see it.

    On the other hand, silence tells us a lot also... Where is Nvidia's talking about their "950Mhz GPU" in a tablet? I think the 600Mhz (by that, I should clarify I mean between 600 and 699) clockspeed band is still quite impressive... Just, well, it shows why the chip won't go into Phones...