AMD Carrizo Part 2: A Generational Deep Dive into the Athlon X4 845 at $70
by Ian Cutress on July 14, 2016 9:00 AM EST
AMD's latest x86 microarchitecture carries the codename Excavator. It is the fourth generation of AMD's Bulldozer family, and the cores built on it are called Carrizo cores. Carrizo and Excavator were primarily aimed at laptops and are an important part of the efficiency goals AMD has set itself. We tested some 15W laptops earlier in the year, but when AMD announced a 65W part was coming to desktop, we actively sourced a part to compare generational performance improvements in a like-for-like setting. This is that review, and we're testing the Athlon X4 845 and its microarchitectural counterparts through the years: the Athlon X4 860K with Kaveri cores and the Steamroller µArch, the Athlon X4 760K with Richland cores and an improved Piledriver µArch, and the Athlon X4 750K with Trinity cores using the original Piledriver µArch.
AMD's Future in Mainstream
Both of the main x86 processor manufacturers, AMD and Intel, broadly arrange their consumer processors into three segments: high-performance, mainstream and entry level. As one might expect, these processor lines differ in terms of performance, price and power (and a few quirks therein). The story from AMD's side over the past several years has been one where the high-performance line has drifted away, although it is still present in parts originally released in October 2012, while the mainstream line has become AMD's current source of revenue and market share for CPUs. Whereas the high-performance processors focus on being pure CPUs, designed for general purpose function, the mainstream line integrates both processing and graphics parts into a single silicon die such that a system does not need a discrete graphics card in order to provide an output (AMD calls this an APU, an Accelerated Processing Unit; Intel has no specific name). Both AMD and Intel have made this their bedrock for the mainstream platform, allowing users to invest in a component and rely on integrated graphics if they only need that level of performance, or move up to a graphics card when budget allows or performance is required. The graphics performance of some high-end AMD APUs is currently similar to that of a $50-80 graphics card, making them target purchases for budget machines. Processors with integrated graphics feature heavily in laptop and notebook designs as well, where saving space, power and cost are often priorities.
Some users who rely on mainstream components but want a discrete graphics platform sometimes feel hard done by, with the integrated graphics design being the main option at the price point. For these users, who are buying a processor and then a $150+ discrete graphics card, paying for an integrated graphics portion of the CPU that goes unused feels unhelpful: the silicon area becomes excess baggage, and they don't want to pay for it. AMD has had features such as Dual Graphics in the past, where the APU and discrete GPU work together. This can work well, but relies on good driver and game support to do so. It also focuses on improving low-to-mid range hardware, rather than going for peak performance. DirectX 12 may change this, with the new graphics API allowing developers to use integrated graphics in different ways, but again it relies on game developer support and might still be a few years out from becoming commonplace. So why pay the extra for integrated graphics when you do not need it? Intel does not offer much of an option here, aside from spending a minimum of $450 on their high-end desktop platform. However, AMD has you covered in the Athlon line of CPUs.
The APU line accounts for the bulk of AMD's mainstream desktop processor sales, with a dozen APUs released each generation. Alongside these designs, AMD also releases the CPU-only Athlon line. These are the same silicon designs as the APUs but are cut down versions without the integrated graphics. Technically they still have the internal silicon for the graphics cores, but due to silicon defects or stock management, that silicon is physically disabled and the price is subsequently reduced. This method of binning is not new, and happens with many silicon processor designs: when processors are made, they have a natural defect rate (a manufacturing process with a low defect rate is said to be more 'mature'). If these defects are in areas that can be disabled in the binning process, it allows a company like AMD to still sell the processor at a lower price rather than throw it away completely. For users intending to have discrete graphics, the Athlon line can thus be a significantly cheaper option when building a mainstream AMD PC. The saving made can then be funneled into other upgrades, such as double the memory, a bigger SSD, or even another stage up on a graphics card performance list.
The AMD Athlon X4
So while the APU, with the integrated processor and graphics, is AMD's main focus for mainstream sales, the Athlon line is present as a way to throw away less silicon and offer a component with a given feature set to users who want it. For users who have been building PCs for many years, the Athlon name in AMD's history has been a steadfast reminder of when AMD was winning the x86 wars. Before APUs became a reality, most of the mainstream parts from AMD were labelled Athlon, from single core up to four cores in the Athlon X4 family (which we retested recently for a future review), with K10 based parts being called Phenom, high-performance segment parts moving to 'FX', and a number of Sempron parts as well.
Since moving to the Bulldozer based microarchitecture and launching the Trinity core design in Q4 2012, AMD has kept a small number of Athlon X2/X4 parts around each generation, often priced very competitively against the APUs. For this review, we've taken one from each generation and tested accordingly.
The following table shows every AMD CPU-only processor from 2012. The information comes from a variety of sources, mostly CPU-World and the AMD CPU Wiki, because surprisingly no central source of information (like Intel's ARK) exists. The information in the table is quite dense (there's only so much you can fit into 666 pixels wide), but the pertinent parts to keep track of are the PCIe counts, release dates and the cache sizes.
AMD CPU-only Processors From 2012
| Processor | µArch / Core | Release | Cores | Base / Turbo (MHz) | TDP / PCIe | Socket / DDR3 | L1 (I) Cache | L1 (D) Cache | L2 Cache |
| Athlon X4 845 | Excavator / Carrizo | 2/2016 | 4 | 3500 / 3800 | 65 W / 3.0 x8 | FM2+ / 2133 | 192KB 3-way | 128KB 8-way | 2 MB 16-way |
| Athlon X4 880K | Steamroller / Kaveri v2 | 3/2016 | 4 | 4000 / 4200 | 95 W / 3.0 x16 | FM2+ / 2133 | 192KB 3-way | 64KB 4-way | 4 MB 16-way |
| Athlon X4 870K | | 12/2015 | | 3900 / 4100 | | FM2+ / 1866 | | | |
| Athlon X4 860K | Steamroller / Kaveri | 8/2014 | | 3700 / 4000 | | | | | |
| Athlon X4 840 | | 8/2014 | | 3100 / 3800 | 65 W / 3.0 x16 | | | | |
| Athlon X4 830 | | 2014? | | 3000 / 3400 | | | | | |
| Athlon X2 450 | | 8/2014 | 2 | 3500 / 3900 | | | 96KB 3-way | 32KB 4-way | 1 MB 16-way |
| FX-770K (OEM) | | 12/2014 | 4 | 3500 / 3900 | | FM2+ / 2133 | 192KB 3-way | 64KB 4-way | 4 MB 16-way |
| Athlon X4 760K | Piledriver.v2 / Richland | 7/2013 | 4 | 3800 / 4100 | 100 W / 3.0 x16 | FM2 / 1866 | 128KB 2-way | 64KB 4-way | 4 MB 16-way |
| Athlon X4 750 | | 10/2012 | | 3400 / 3900 | 65 W / 3.0 x16 | | | | |
| Athlon X2 370K | | 6/2013 | 2 | 4000 / 4200 | | | 64KB 2-way | 32KB 4-way | 1 MB 16-way |
| Athlon X2 350 | | 2013? | | 3500 / 3900 | | | | | |
| Sempron X2 250 | | 2013? | | 3200 / 3600 | | FM2 / ? | | | |
| FX-670 (OEM) | | 3/2014 | 4 | 3700 / 4300 | | FM2 / 1866 | 128KB 2-way | 64KB 4-way | 4 MB 16-way |
| Athlon X4 750K | Piledriver / Trinity | 10/2012 | 4 | 3400 / 4000 | 100 W / 3.0 x16 | FM2 / 1866 | 128KB 2-way | 64KB 4-way | 4 MB 16-way |
| Athlon X4 740 | | 10/2012 | | 3200 / 3700 | 65 W / 3.0 x16 | | | | |
| Athlon X2 340 | | 10/2012 | 2 | 3200 / 3600 | | FM2 / 1600 | 64KB 2-way | 32KB 4-way | 1 MB 16-way |
| Sempron X2 240 | | 2012? | | 2900 / 3300 | | FM2 / ? | | | |
There are some things to note here, in case anyone is following:
- The Athlon X4 845 is the only part (CPU or APU) that will be released using Carrizo cores for the desktop using DDR3. There are reports of an X4 835 (lower frequency) codename being used, but there is no confirmation this part will exist/be released in any form. However, there will be no APU desktop versions of Carrizo with DDR3, for the reasons below.
- The Athlon X4 845 is actually a laptop APU in desktop clothing, and as such has some limitations in having eight PCIe 3.0 lanes.
- Moving from Richland to Kaveri gives 50% more L1 (I) cache, moving from 64KB/module to 96KB/module and from 2-way to 3-way associativity.
- Moving from Kaveri to Carrizo gives 100% more L1 (D) cache, moving from 32KB/module to 64KB/module and from 4-way to 8-way associativity.
- There is a Trinity CPU called the Athlon X4 750K, and a newer Richland CPU called the Athlon X4 750. In researching this review, trying to find the latter was tough, as this was an OEM only part, but it does exist.
- Every dual core/single module design from AMD has 1 MB of L2, whereas every quad core/dual module design has 4 MB of L2. The exception to this is the Carrizo based Athlon X4 845.
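The cache notes in the list above can be checked against the table totals with a short calculation. This is a minimal sketch rather than anything from AMD: the dictionary layout and the helper names `per_module` and `percent_change` are illustrative assumptions, the Kaveri figures are taken from the Athlon X4 880K (Kaveri v2) row, and the L1 totals are assumed to split evenly across the two modules.

```python
# Whole-chip cache totals for the quad-core (dual-module) parts,
# copied from the table. Kaveri uses the Athlon X4 880K (Kaveri v2) row.
MODULES = 2  # assumption: totals split evenly across two modules

caches = {
    "Richland": {"l1i_kb": 128, "l1d_kb": 64, "l2_mb": 4},
    "Kaveri": {"l1i_kb": 192, "l1d_kb": 64, "l2_mb": 4},
    "Carrizo": {"l1i_kb": 192, "l1d_kb": 128, "l2_mb": 2},
}


def per_module(total, modules=MODULES):
    """Split a whole-chip cache total evenly across the modules."""
    return total / modules


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


l1i_gain = percent_change(caches["Richland"]["l1i_kb"], caches["Kaveri"]["l1i_kb"])
l1d_gain = percent_change(caches["Kaveri"]["l1d_kb"], caches["Carrizo"]["l1d_kb"])
l2_change = percent_change(caches["Kaveri"]["l2_mb"], caches["Carrizo"]["l2_mb"])

print(f"L1 (I), Richland to Kaveri: {l1i_gain:+.0f}%")
print(f"L1 (D), Kaveri to Carrizo:  {l1d_gain:+.0f}%")
print(f"L2,     Kaveri to Carrizo:  {l2_change:+.0f}%")
```

The L2 line is the interesting one: the Athlon X4 845 gains L1 (D) capacity but carries half the total L2 of its quad-core predecessors.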
A Brief Update on Carrizo
Back at AMD's Tech Day in 2015, AMD gave us a look into their new core design, Carrizo, using the updated Excavator microarchitecture. That link is worth a read to understand Carrizo as it stood at that time, with a brief recap here. As part of the discussions, we were shown a plethora of ways in which AMD had upgraded their core design. One of the major drivers for this was the march towards their goal of achieving 25x better energy efficiency by 2020 (counting from 2014/Kaveri).
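To put the 25x goal in perspective, the following sketch converts it into a constant year-on-year factor. It assumes the improvement compounds evenly over the six years from 2014 to 2020, which is a simplification of ours and not something AMD stated; `annual_factor` is a hypothetical helper name.

```python
def annual_factor(total_gain, years):
    """Constant yearly multiplier that compounds to total_gain over the given years."""
    return total_gain ** (1.0 / years)


# Assumption: even compounding from the 2014 (Kaveri) baseline to 2020.
factor = annual_factor(25, 2020 - 2014)
print(f"Required improvement: about {factor:.2f}x per year")
```

Under that assumption the target works out to roughly a 1.7x efficiency gain every year, which gives a sense of how aggressive the goal is.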
Among the changes were better core scheduling for threads, and a better frequency/voltage scaling mechanism to deal with power spikes and droops to keep overall power consumption lower.
Another change was in the metal stack layers, making the whole piece of silicon more GPU-like in design and affording higher density and better power efficiency characteristics.
Excavator, and by extension Carrizo, was touted in the press as being the biggest upgrade to the base Bulldozer design since the introduction of Bulldozer itself. This sentiment came from the redesigned high density silicon libraries for various logic operations. Rather than optimize the libraries for performance, AMD redesigned them almost from scratch, shifting the focus of continual optimization from performance to size. This led to a significant decrease in die area and a power saving, at the cost of only a little headroom in frequency.
The other caveat is that a processor core is typically designed for a certain power window. So a 4-core CPU design that ends up in 35W and 90W processors must run between 8W and 22W per core in perfect operation. The wider the window, the more compromises have to be made to the design to cope with high frequency/power units in order to get regular deterministic operation. AMD aimed their dual module Carrizo design squarely at 15W for laptops and mobile devices, although the high-end parts could also offer a 35W boost mode, depending on the device manufacturer.
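The per-core window above is simple division, sketched here in Python. The even split of TDP across cores is an assumption of this sketch: it ignores power drawn outside the cores, so the raw results (8.75W and 22.5W) sit slightly above the rounded 8W and 22W figures quoted. `per_core_power` is a hypothetical helper name.

```python
def per_core_power(tdp_watts, cores):
    """Naive per-core budget: the whole TDP split evenly across the cores."""
    return tdp_watts / cores


CORES = 4  # a dual-module design exposes four cores

# 35W and 90W are the generic example; 15W, 35W and 65W are the Carrizo
# design target, the mobile boost mode and the Athlon X4 845 TDP.
for tdp in (15, 35, 65, 90):
    print(f"{tdp:>2}W TDP -> {per_core_power(tdp, CORES):.2f}W per core")
```

On this naive basis a core designed around a 15W package is asked to run at more than four times its target budget in the 65W Athlon X4 845.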
At the tech day, AMD were careful to point out that at 35W, Carrizo would be on par with the previous generation Kaveri in terms of performance, meaning the only benefits would be the improved power saving (and video playback capabilities for parts with the integrated graphics). If the graphical representation of this from AMD is anything to go by, it would even suggest a performance regression at higher power levels. To put that in terms of today's review, the Athlon X4 845 runs at 65W.
What This Means
Despite the mobile focused design, AMD decided to release a single Carrizo core based part (using DDR3) for the desktop. The Athlon X4 845 comes with a lot of caveats compared to the mobile parts: no integrated graphics in exchange for a much higher 65W TDP and a small bump in frequency. Desktop owners will need to be careful as well, given the mobile parts only had eight lanes of PCIe 3.0 for graphics, and this continues for the desktop part. This limits the X4 845 to single GPU configurations as a focal point.
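For a sense of what the eight-lane limit means on paper, this sketch estimates theoretical PCIe 3.0 link bandwidth from the per-lane signalling rate (8 GT/s) and the 128b/130b line encoding. It gives peak one-direction figures only and ignores protocol overhead, so real transfers will be lower; `pcie3_bandwidth_gbs` is a hypothetical helper name.

```python
GT_PER_S = 8.0        # PCIe 3.0 raw signalling rate per lane, in GT/s
ENCODING = 128 / 130  # PCIe 3.0 uses 128b/130b line encoding


def pcie3_bandwidth_gbs(lanes):
    """Approximate peak one-direction bandwidth in GB/s for a PCIe 3.0 link."""
    return lanes * GT_PER_S * ENCODING / 8  # divide by 8 to go from bits to bytes


for lanes in (8, 16):
    print(f"PCIe 3.0 x{lanes}: about {pcie3_bandwidth_gbs(lanes):.2f} GB/s")
```

The x8 link on the Athlon X4 845 therefore has about half the peak bandwidth of the x16 link on the other Athlon parts in the table, at roughly 7.9 GB/s against 15.8 GB/s.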
So all in all, the X4 845 should be heading in the bin: a high powered, low efficiency Carrizo that should perform on par with or worse than similarly rated Kaveri APUs. Unfortunately we weren't able to source identical TDP units for this review, but as the IPC comparison will show, Carrizo and the Excavator microarchitecture are a big step forward in the Bulldozer family of microarchitectures over the Steamroller core and the Kaveri design.
This Review
I wanted to test a number of degrees of freedom with this review, especially as it becomes a precursor of what many people are expecting to see before Zen is released on the AM4 platform. First of all, we look at the generational performance of four Athlon processors going through the years.
- The Athlon X4 845, Carrizo cores with Excavator micro-architecture
- The Athlon X4 860K, Kaveri cores with Steamroller micro-architecture
- The Athlon X4 760K, Richland cores with Piledriver v2 micro-architecture
- The Athlon X4 750K, Trinity cores with Piledriver micro-architecture
Some of these parts were sampled, others were purchased for the review.
AMD Athlon X4 845, Carrizo (left)
AMD Athlon X4 860K, Kaveri (right)
AMD Athlon X4 760K, Richland (left)
AMD Athlon X4 750K, Trinity (right)
To start, we deep dive into the performance of the architecture. For this, all four processors are set to a fixed 3 GHz for our tests, including games with our set of GPUs. The goal here is to see how the core logic copes in single threaded benchmarks, and how well it handles operation and memory allocation in multithreaded workloads. One of the main goals with the new iterations of the Bulldozer floorplan has been to actively use the right cores with the right scheduling to avoid stalls and provide better prediction methods for future memory requirements.
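The reason for fixing every processor at 3 GHz can be shown with a small sketch: once frequency is equal, the ratio of two benchmark scores is also the ratio of work done per clock. The scores below are made-up placeholders chosen only to illustrate the arithmetic and are not results from this review; `per_clock_score` and `relative_uplift` are hypothetical helper names.

```python
def per_clock_score(score, freq_ghz):
    """Benchmark score normalized by clock speed (score per GHz)."""
    return score / freq_ghz


def relative_uplift(new, old):
    """Percentage gain of new over old."""
    return (new / old - 1.0) * 100.0


FIXED_GHZ = 3.0

# Placeholder scores for illustration only, NOT measured data.
old_score, new_score = 100.0, 110.0

raw = relative_uplift(new_score, old_score)
normalized = relative_uplift(
    per_clock_score(new_score, FIXED_GHZ),
    per_clock_score(old_score, FIXED_GHZ),
)
print(f"raw uplift {raw:.1f}%, per-clock uplift {normalized:.1f}%")
```

With the clocks matched, the two figures agree, so any difference between generations can be attributed to the design rather than to frequency.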
Then we move on to the Athlon X4 845 overclocking section. As this is not a K processor, we are rather limited in what we can do, but given that this is a mobile-focused part we can test to see if AMD is near the limit of the core power design or if there is still room at the top.
To finish off, we'll have a number of benchmark results showing the X4 845 against processors from our database that cost a similar amount. The obvious competition here is the dual-core Intel Pentium G3258, which is an overclocking focused part that has a retail price of $72. We will also add in a high-cost APU to determine the performance differential. This doesn't take into account system to system costs, such as the additional cost of coolers or motherboards, as these can be variable.
Pages In This Review
AMD's Carrizo Thoroughly Tested Part 2: Introduction
Test Bed and Setup
Benchmark Overview
Performance at 3 GHz: Real World
Performance at 3 GHz: Office
Performance at 3 GHz: Linux
Performance at 3 GHz: Legacy
Gaming at 3 GHz: Alien Isolation
Gaming at 3 GHz: Total War Attila
Gaming at 3 GHz: Grand Theft Auto
Gaming at 3 GHz: Grid Autosport
Gaming at 3 GHz: Shadow of Mordor
Analyzing The Improvements
AMD Athlon X4 845 Overclocking: A Non-Starter
Stock Comparison: Real World
Stock Comparison: Office
Stock Comparison: Linux Bench
Stock Comparison: Legacy and Synthetic
Gaming Comparison: Alien Isolation
Gaming Comparison: Total War: Attila
Gaming Comparison: Grand Theft Auto
Gaming Comparison: Grid Autosport
Gaming Comparison: Shadow of Mordor
Power Consumption
131 Comments
lefty2 - Thursday, July 14, 2016
I'm predicting Bristol Ridge will be just as bad a failure as Carrizo. I.e. the few design wins will only have single DIMM memory and be universally unavailable, buried somewhere in a dark corner of the OEM's website. It's a pity, because both SoCs are very good in their own right.
nandnandnand - Thursday, July 14, 2016
If it's not Zen, it can be thrown straight in the garbage.
Samus - Friday, July 15, 2016
I still rock a few Kaveri desktops and they are incredibly powerful for the price. The 860K is half the cost of a comparable Intel chip, which supporting faster memory and a lower cost platform.
Carizo on the desktop is an anomaly. I'd like to see what it could do with 4MB cache (would require an entirely new die)
Lolimaster - Saturday, July 16, 2016
They were nice in 2014.
We should have a nice 20nm 768SP APU in 2015 with a full L2 cache Excavator and fully mature 896SP 20nm early this year.
Remember the A8 3870K? That APU was a damn monster only hold back from being godly cause of their sub 3Ghz cpu speed, what we had after?
400SP VLIW5 2011 --> 384 VLIW4 2012 --> 384VLIW4 2013 --> 512SP GCN 2015 --> 512SP GCN 2016
Intel improved way faster (non "e" + edram igp's are near A8 level from being utter trash when the A8 3850 was release).
The_Countess - Tuesday, July 19, 2016
yes being able to thrown in a extra billion transistors compared to AMD (1.7 vs 0.75 billion transistors for a quad core with GPU) because of 14nm really does help intel along a lot.
but as nobody has been able to make a 20nm class process for anything but flash and ram besides intel, AMD's hands were tied. there is nothing AMD could have done to change that.
BlueBlazer - Friday, July 15, 2016
Formula for failure: FM2 socket (with limited CPU upgradeability), only PCI Express x8 lanes available (which can bottleneck GPUs), and only "4 cores" (which performs more like 2C/4T Core i3 processor).
neblogai - Friday, July 15, 2016
Bristol Ridge is not FM2; PCI-E x8 can not bottleneck midrange GPUs; ultra low power mobile APU also sold as desktop chip is not a failure, just additional revenue
BlueBlazer - Friday, July 15, 2016
The results in the article shows otherwise, where AMD's Bristol Ridge was slower in most gaming tests, despite having better performance in some applications. Both FM2 and FM2+ are still the same (legacy) socket. AMD will be probably selling these chips at a loss. Note that these are the same (large) dies as Carrizo chips, and at 250mm^2 coupled with low prices typically meant razor thin margins or none at all.
silverblue - Friday, July 15, 2016
That L2 cache is probably making more difference than you realise.
evolucion8 - Saturday, July 16, 2016
The PCI-E is busted, even at PCI E 2.0 @ 4X, it barely makes a difference on the Fury X and the GTX 980 Ti.