Indeed. Biggest improvements since Sandy Bridge. If you look at the timeline, this would've been the first CPU designed since they saw Zen 1. This is their Zen 1 moment and they already took the performance crown back basically across the board and at a lower price. AMD is now on the back foot, and it will be another whole year before Zen 4, and the thing is, Zen 4 isn't even competing with Alder Lake, Raptor Lake is rumored to be out before Zen 4. AMD has really screwed up with their launch cycle and given Intel so much room that they not only caught back up but beat them. Intel is truly back.
For now Threadripper has the performance crown. With this performance per watt, Intel can only win the PC market. Enterprise will never accept this performance per watt! So AMD wins the highly profitable enterprise market. The 12900K guzzles up to 241 W, whereas the 5950X consumes half!
Considering power consumption, it's like a Pyrrhic victory for Intel.
The HEDT market in Enterprise is workstations, which run certified apps like AutoCAD and have a lot of inertia. The first real Zen workstation is the Lenovo P620 and it only recently came out, so AMD hasn't conquered that market yet. Most actual Enterprise desktops are compact models that typically run on laptop CPUs.
And Intel has AMD beat for miles in system validation. My 3950X on an X570 Phantom Gaming X has major issues with disk access across one NVMe, one SATA SSD, and two HDDs. Some things will start up fine, but some things will just HANG. Deus Ex loading screens take like 10 seconds. I just tried to play a video off my NVMe and it took ~15 seconds for it to launch MPC-HC. (Further launches are fine.) MeGUI takes 15 seconds to launch. This thing is just frustratingly slow in general desktop tasks compared to my old i7 4790. Does it beat the pants off the 4790 in heavily multithreaded crunching? Yes. But AMD does not put out a quality product.
Anecdotal evidence? ....YOU have issues with your system. Well, we have 16-core Ryzen and 32- and 64-core Threadripper systems at work and we can't complain. It's not as if Intel is issue-free (and I am not talking about security flaws).
When you have such grave issues.. YOUR system has issues. Probably a bad setup. I have never heard of MPC needing 15 seconds to start when I read about AMD systems.
It is your problem, not AMD's or Intel's! This is why we always refer to the motherboard's QVL before buying RAM, SSDs, etc. to avoid such problems. This is not specific to AMD; it applies to all platforms. For now you had better update the motherboard BIOS as soon as a new one is released. To solve the problem completely you need to rebuild the system according to the motherboard's QVL.
I agree. I've got a 3995wx everything on qvl, even with an optane drive. Got too annoyed with the bugs and found a 5950x worked better for a high performance desktop. Going to swap to a 12900k once i can find parts.
If you know how to use mem timings, you idiots that depend on SPD's wouldn't have these problems (that covers about 90% of this crap, and knowing other bios settings solves almost anything else besides REAL failures). I've been building systems for decades (and owned a PC biz for 8yrs myself) and a MB's QVL list was barely used by anyone I know (perhaps to look up some ODD part but otherwise...Just not enough covered at launch etc). If I waited for my fav stuff to be included in each list I'd never build. Just buy top parts and you don't worry much about this crap.
That said, if my job was on the line, I'd check the list, but not because I was worried about ever being wrong...LOL. I just don't have a liar's face. I'd be laughing about how stupid I think it is after so many builds and seeing so many "incompatible memory" cases fixed in seconds in the hands of someone not afraid to disable the SPD and get to work (or hook up with a strap before blowing gigs of modules, NICs repeatedly etc). Even mixing modules means nothing then (again, maybe if I was pitching servers...DUH....1 error can be millions) after just trying to make issues exist with mixing/matching but with timings CORRECT. No, they will work, if set correctly, barring some REAL electrical issue (like a PSU model from brand X frying a particular model mboard - say dozens in a weekend, a few myself!).
Too many DIY people out there that really have no business building a PC. No idea what ESD is (no, just because it took a hit and still works doesn't mean it isn't damaged), A+ what?? Training? Pfft, it's just some screws and slots...Whatever...Said the guy with machine after machine that has never quite worked right...LOL. If you live in SF or some wet joint OK (Leo Laporte etc? still around), otherwise, just buy a Dell/HP and call it a day. They exist because most of you are incapable of doing the job correctly, or god forbid troubleshooting ANYTHING that doesn't just WORK OOB.
Note Intel doesn't allow "dog sh1t motherboards" to happen, especially at the $300+ price point. That makes it an AMD issue. I can refurb Dell after Dell after Dell after Dell, all of them on low-end chipsets and still on the release BIOS, and they all work fabulously. Meanwhile two years into x570 and AMD is still working on getting USB working right.
I think I'll put this thing on the market and see if I can recoup the better part of an i9 12900k build. I may have to drop down to one of the i7 6700's or the i7 4770k system I have until they're in stock, but that's really no issue.
It's a pleasure to not have p*gheaded amateurs in the AMD zone. Others are telling you it's not an AMD issue but you keep spamming that it's AMD, AMD, AMD... having got the wrong end of the stick.
@Netmsm Regardless of whether the blame lies with ASRock for the above issue, it remains a fact that AMD didn't fix a USB connectivity problem in Zen 3 until 6-7 months after initial availability. Partly that was because the installed base of guinea pigs was constricted by limited product, but it goes to show that quick and widespread product rollouts have a better chance of ironing out the kinks. (Source if you've been under a rock heh https://www.anandtech.com/show/16554/amd-set-to-ro...
And then recently we had Windows 11 performance regressions with Zen 3 cache and sandboxed security. These user experience hiccups suggest one company perceptibly lags the other in platform support. It's just something I've noticed switching between Intel and AMD. I might think this all to be normal were I loyal to one platform.
I didn't realize we're here to discuss minor issues/incompatibilities of the Intel's rival. I thought we're here to talk about major inefficiencies besides improvements of Intel's new architecture. Sorry!
@Netmsm That's no minor issue/incompatibility. Maybe for you, but a USB dropout is not trivial! Think missing keystrokes, stuttering audio for USB headsets and capture cards. It didn't affect every user, and was intermittent, which was part of the difficulty. I put off a Ryzen 5000 purchase for 2 months waiting for them to fix it. (I also put it off for 4 months before that because of lack of stock lol.)
@Wrs Assume we give you this, just drop AMD related things. Let's focus on performance per watt of new Intel Architecture. I asked some questions about this which is a pertinent topic to this article but you dodged. Just talk about 241 watt being sucked by 12900k :))
When Intel doesn't have this issue it is AMD. Intel demands validation and spends hundreds of millions on engineers to help motherboard manufacturers fix their issues before release because they realize that the motherboard is an INSEPARABLE part of the equation. You can fanboi all you want over the CPU, but if there are no good motherboards you have a terrible system. AMD just dumps their chip on the market, leaving the end-user with a minefield of motherboard issues among every manufacturer. When Intel has the problem solved but AMD doesn't, it's an AMD issue.
DominionSeraph have you tried this on a different x570 board ? it COULD be just that board, as was already stated. i have been running a 5900x now for about a week, and no issues at all with disk access, its the same as the 3900x i replaced, and im using an asus x570 e-gaming board. maybe take a step back, realize that its NOT an amd issue, but a BOARD issue. to be fair, intel has had its own issues over the years as well, while you see them as perfect, sorry to break it to you, but intel also isnt perfect
Oh I got it, the fact that you're encountering unusual issues, despite myriads of people who haven't complained of the same problem you have, means that AMD got many incompatibility issues. That's a brilliant reasoning which brings us to a pure conclusion: AMD is bad. We're done here.
@DominionSeraph "Note Intel doesn't allow "dog sh1t motherboards" to happen, especially at the $300+ price point. That makes it an AMD issue."
You obviously haven't looked at or experienced the issues that plague many motherboards. Maybe just read some reviews or visit manufacturers' forums to get an idea of how bad it is before making such a naïve statement. It has more to do with the manufacturer of the board than the CPU. That is like saying nvidia doesn't make crap video cards. Well, you need to look at all the failures by each manufacturer, not nvidia itself.
Most desktops at enterprise companies could be replaced with terminals given that most of the people are really just performing data entry & retrieval. The network is the bit doing the work. For people who need old school workstations, then I agree, but that's a damn small (but high margin) market.
Scroll down and you'll find a graph detailing total gaming power consumption (CPU + GPU) and CPU power consumed per fps. In both metrics, Alder Lake is doing better than Zen 3 and much better than Rocket Lake.
It seems like Alder Lake for desktop has been clocked way beyond its performance/watt sweet spot. It should be very interesting to compare Alder Lake for laptops v/s Zen 3 for laptops.
To give a short summary for (only) CPU power consumption v/s FPS when playing Horizon Zero Dawn
11900K consumes 100 watts for 143 fps
5950X consumes 95 watts for 145 fps
5800X consumes 59 watts for 144 fps
12900K consumes 52 watts for 146 fps
12700K consumes 43 (!) watts for 145 fps
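To make the comparison explicit, here is a minimal sketch that turns the five figures quoted above into frames per second per watt. The numbers are copied from the list; the function and variable names are just illustrative, and this only covers CPU power in this one game.

```python
# CPU-only power draw (watts) and frame rate (fps) in Horizon Zero Dawn,
# copied from the figures quoted in the comment above.
results = {
    "11900K": (100, 143),
    "5950X": (95, 145),
    "5800X": (59, 144),
    "12900K": (52, 146),
    "12700K": (43, 145),
}


def fps_per_watt(watts, fps):
    """Frames per second delivered per watt of CPU power."""
    return fps / watts


# Sort CPUs from most to least efficient on this metric.
ranking = sorted(results, key=lambda cpu: fps_per_watt(*results[cpu]), reverse=True)

for cpu in ranking:
    watts, fps = results[cpu]
    print(f"{cpu}: {fps_per_watt(watts, fps):.2f} fps/W")
```

Since all five chips land within a few fps of each other, the ranking is driven almost entirely by the power column.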
Intel is very, very competitive with AMD. Considering that the 12700K has fewer E cores and consumes less power, I am very curious how it would do with all E cores disabled and running only on P cores.
Sounds like there's only the gaming world! On PCs it may not be considered an egregious blunder. You're right that Intel is now competitive, but with AMD's previous generation, and only if we wink at Intel's power guzzling.
Some examples from Tom's benches: y-cruncher 12900k DDR5 consumes 197 watts whereas 5950x consumes 103 watts.
My 5950X uses 130-140W in y-cruncher. And @TweakPC on twitter tested lower PL1 and found the 12900k was only around 5% slower using 150W than 218W. Alderlake being power hungry is only because Intel is pushing 8 P-cores and 8 E-cores (collectively equal to around 4 P-cores according to Intel) to the limit, to compete against 16 Zen 3 cores. You can argue that it's still not as good as the 5950X but efficiency in this case is purely a problem of how much power Intel is allowing by default
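As a rough check on the power-limit point, the sketch below works out what "around 5% slower at 150W than at 218W" means for performance per watt. It uses only the two wattages and the 5% figure quoted above; treating the 5% as exact is an assumption, and the function name is made up for illustration.

```python
def relative_efficiency(perf_ratio, power_low, power_high):
    """Perf-per-watt at the lower power limit relative to the higher one.

    perf_ratio is performance at power_low divided by performance at power_high.
    """
    return perf_ratio * power_high / power_low


# 12900K: roughly 5% slower at 150 W than at 218 W (figures quoted above).
gain = relative_efficiency(perf_ratio=0.95, power_low=150, power_high=218)
power_saved = 1 - 150 / 218

print(f"perf/W at 150 W is {gain:.2f}x the perf/W at 218 W")
print(f"power reduced by {power_saved:.0%}")
```

On those numbers, giving up about 5% of performance buys roughly 38% better performance per watt, which is the argument for blaming the default power limits rather than the silicon.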
Even Ian has "accidentally" forgotten to put the nominal TDP for the 12900K in the results =)) All CPUs in "CPU Benchmark Performance: Intel vs AMD" are listed with their nominal TDP except the 12900K. It sounds like there were some recommendations! How venal!
It's not just the gaming world -- it's the entire world except for long-running CPU intensive tasks. Handbrake and blender are valuable benchmarking tools for seeing what a CPU is capable of when pushed to the limit, but the vast majority of users -- even most power users -- don't do that.
Sure, Intel has more work to do to improve power efficiency in long running CPU intensive workloads, but taking the worst case power usage scenarios distorts the picture as much as you're claiming the reviewers are doing.
Can't calculate efficiency without scores. Also, well known that power scales much faster than performance. The proper way to compare efficiency is really at constant work rate or constant power.
Sorry, this is a direct link to Tom's bench: https://cdn.mos.cms.futurecdn.net/if3Lox9ZJBRxjbhr... this is for "blender bmw27" in which both 12900k and 5950x finish the job around 80 seconds BUT 12900k sucks power for about 70 percent more than 5950x.
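For a fixed job like this, the fair comparison is energy per render, which is average power times run time. The sketch below uses the two figures from the comment (both chips finishing in about 80 seconds, the 12900K drawing about 70 percent more power). The absolute wattage is not quoted, so the 100 W baseline is a hypothetical placeholder that cancels out of the ratio.

```python
def energy_joules(avg_power_w, seconds):
    """Energy used for one job: average power (W) times run time (s)."""
    return avg_power_w * seconds


# Hypothetical baseline; only the ratio between the two chips matters here.
baseline_power_w = 100.0
render_seconds = 80  # both CPUs finish bmw27 in about 80 s per the comment

energy_5950x = energy_joules(baseline_power_w, render_seconds)
energy_12900k = energy_joules(baseline_power_w * 1.7, render_seconds)

ratio = energy_12900k / energy_5950x
print(f"12900K uses about {ratio:.1f}x the energy per render")
```

With equal run times the energy ratio is simply the power ratio, so this is a constant-work comparison of the kind asked for a few comments up.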
I'm wondering why Ian hasn't put the 12900K's nominal TDP in the results just like for all the other CPUs! When the 10900K was released with a nominal TDP of 125 W, Ian put that number in every bench while in reality the 10900K was consuming up to 254 W (according to Ian's own review)! When I asked him to put real power consumption numbers on every test he said he couldn't, because of time, because he had too much to do, and because he had no money to pay an assistant to do that work! But now we have the 12900K with a nominal TDP of 241 W, which Ian seems reluctant to put next to it in the results.
1 billion computing devices and just a few million game units sold? What does it mean? Gamers are a tiny but vocal minority. If they bring this performance at 5W on low and 45W on high then its good for majority of people. This is just a space heater.
So throwing more cores at a game that can't make use of them is useless. Thanks for clarifying that.... genius!!
When a 5600X is producing 144 FPS and a 5950X is producing 150 FPS, the 5600X is the clear winner when it comes to efficiency.
Now try to cool the 12900K in a work environment with an air cooler. I can cool my Threadripper with a Noctua air cooler and let it run under full load for hours.
I am really curious to see how the 12900K will handle that.
I am not an AMD fanboy. I was using anti-consumer Intel for a decade before switching to Ryzen. I would use Intel again when it makes sense for me (I need my PC for work, not gaming).
The 12900k is fine with a Noctua D15 in a work environment. Doesn't matter if you're hammering it at 95C the whole time, the D15 doesn't get louder. But it's no megachip like a Threadripper. For that on the Intel side you'd wait for Sapphire Rapids or put up with an existing Xeon Gold with 8-32 Ice Lake cores at 10nm.
@Netmsm I'll leave that to the market as I don't foresee using any of the 3 that soon lol. It would stand to reason that if one product is both cheaper and better, it would keep gaining share at the expense of the other. If that doesn't happen I would question the premise of cheaper + better. And seeing as it's a major market for Intel, I have little doubt they'll adjust prices if they do find themselves selling an inferior product.
That's right. We always check performance per watt and per dollar. A product should be reasonable with respect to its price and power consumption, this is a must.
The 12900K can consume up to 241 W, which is much closer to Threadripper's TDP than to the Ryzen 5900's, and yet it is competing with chips that have a 125 W TDP! What a parody this is!
I can't disregard and throw away efficiency factor, that's all.
Seeing this has made me very interested to see the value proposition Alder Lake will be offering in gaming notebooks. I was vaguely planning to switch up to a Zen 3+ offering for my next system, but this might be enough to make me reconsider.
<blockquote>re: Enterprise: Considering power consumption, it's like a Pyrrhic victory for Intel.</blockquote> Why? This is not an enterprise solution -- that's the upcoming Sapphire Rapids Xeon processors, a completely different CPU platform.
Sure, if all you're doing is pegging desktop CPUs at 100% for video processing or a similar workload, then Alder Lake isn't for you, but the gaming benchmarks clearly show that when it comes to more typical desktop workloads, the i9 12900k is inline with the top of the line AMD processors in terms of power consumption.
And that, my friend, is a great example of moving the goalposts.
We'll have to see what Intel offers re: Xeon's but one thing is for sure, they're going to offer a completely different power profile to their flagship desktop CPUs, because that's the nature of the datacenter business.
Of course the nature of enterprise won't accept this power consumption. In PC world customers may not care how ineffective a processor is. Intel will reduce the power consumption but the matter is how its processor will accomplish the job! We see an unacceptable performance to watt in Intel's new architecture that needs something like a miracle for Xeon's to become competitive with Epyc's.
No miracle is needed... just go down the frequency-voltage curve. Existing Ice Lake Xeons already do that. What's new about Sapphire Rapids is not so much the process tech (it's still 10nm) but the much larger silicon area enabled per package due to the EMIB packaging. That's their plan to be competitive with Epyc and its multichip modules.
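The "go down the frequency-voltage curve" argument can be sketched with the textbook approximation that dynamic CPU power scales with voltage squared times frequency. The 20% frequency cut and 15% voltage cut below are hypothetical, not measured Alder Lake or Xeon values, and the model ignores leakage and assumes throughput scales linearly with clock.

```python
def relative_dynamic_power(freq_scale, volt_scale):
    """Dynamic power relative to baseline, using P proportional to V^2 * f."""
    return volt_scale ** 2 * freq_scale


# Hypothetical operating point: 20% lower clock at 15% lower voltage.
freq_scale = 0.80
volt_scale = 0.85

power = relative_dynamic_power(freq_scale, volt_scale)
# Assumption: performance tracks frequency one-to-one.
perf_per_watt = freq_scale / power

print(f"power: {power:.3f}x baseline")
print(f"perf/W: {perf_per_watt:.2f}x baseline")
```

Under these assumptions a 20% clock reduction costs 20% of the performance but cuts power by roughly 42%, which is why server parts built from the same cores can look far more efficient than the desktop flagship.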
And what will happen to performance as frequency-voltage curve goes down? Just look at facts! With about 100w more power consumption Intel's new architecture gets itself in front of Zen 3 by a slight margin in some cases that lucidly tells us it can never reduce power consumption and yet beat Epyc in performance.
@Netmsm I'm looking at facts. The process nodes are very similar. One side has both a bigger/wider core (Golden Cove) and a really small core (Gracemont). The other side just has the intermediate size core (Zen 3). As a result, on some benchmarks one side wins by a fair bit, and on other benchmarks, the other side takes the cake. Many benches are a tossup.
In this case the side that theoretically wins on efficiency at iso-throughput (MC performance) is the side that devotes more total silicon to the cores & cache. When comparing a 12900k to a 5950x, the latter has slightly more area across the CCDs, about 140 mm2 versus around 120 mm2. The side that's more efficient at iso-latency (ST/lightly threaded) is the one that devotes more silicon to their largest/preferred cores, which obviously here is ADL. In practice companies don't release their designs at iso-performance, and for throughput benchmarks one may encounter memory and other platform bottlenecks. But Intel seems to have aggressively clocked Golden Cove such that it's impossible for AMD to reach iso-latency with Zen 3 no matter the power input (i.e., you'd have to downclock the ADL). That has significant end-user implications as not everything can be split into more threads.
The Epyc Milan SKUs are already downclocked relative to Vermeer, like most server/workstation CPUs. Epyc Milan tops out at 64 Zen 3 cores across 8 chiplets. Sapphire Rapids, which isn't out yet, has engineering samples topping out at 80 Golden Cove cores across 4 ~400mm2 chiplets. Given what we know about relative core sizes, which side is devoting more silicon to cores? There's your answer to performance at iso-efficiency. That's not to say it's fair to compare a product a year out vs. one you can obtain now, but also I don't see a Zen4 or N5 AMD server CPU within the next year.
I believe, we're not talking about ISO-efficiency or manufacturing or engineering details as facts! These are facts but in the appropriate discussion. Here, we have results. These results are produced by all those technological efforts. In fact, those theoretical improvements are getting concluded in these pragmatical information. Therefore, we should NOT wink at performance per watt in RESULTS - not ISO-related matters.
So, the fact, my friend, is Intel new architecture does tend to suck 70-80 percent more power and give 50-60 percent more heat. Just by overclocking 100MHz 12900k jumps from ~80-85 to 100 degrees centigrade while consuming ~300 watts.
Once in past, AMD tried to get ahead of Nvidia by 6990 in performance because they coveted the most powerful graphic card title. AMD made the hottest and the noisiest graphic card in the history and now Intel is mimicking :)) One can argue that it is natural when you cannot stop or catch a rival so try to do some chicaneries. As it is very clear that Anandtech deliberately does not tend to put even the nominal TDP of Intel 12900k in their benches. I loathe this iniquitous practice!
@Netmsm I believe the mistake is construing performance-per-watt (PPW) of a consumer chip as indicative of PPW for a future server chip based on the same core. Consumer chips are typically optimized for performance-per-area (PPA) because consumers want snappiness and they are afraid of high purchase costs while simultaneously caring much less than datacenters about cost of electricity.
@Wrs You cannot totally separate efficiency of consumer and enterprise chips! As an incontrovertible fact, architecture is what primarily (not completely) determines the efficacy of a processor. Is Intel going to kit out upcoming server CPUs in an improved architecture?
@Netmsm Architecture, process, and configuration all can heavily impact efficiency/PPW. I’m not aware of any architectural reason that Golden Cove would be much less efficient. It’s a mildly larger core, but it doesn’t have outrageous pipelining or execution imbalances. It derives from a lineage of reasonably efficient cores, and they had to be as they remained on aging 14nm. Processwise Intel 7 isn’t much less efficient than TSMC N7, either. (It could even be more efficient, but analysis hasn’t been precise enough to tell.) But clearly ADL in a 12900/12700k is set up to be inefficient yet performant at high load by virtue of high frequency/voltage scaling and thermal density. I could do almost the same on a dual CCD Ryzen, before running into AM4 socket limits. That’s obviously not how either company approaches server chips.
They've been selling every CPU they can make. There are shortages of every Zen 3 based notebook out there (to the extent that some OEMs have cancelled certain models) and they're selling so many products based on the desktop chiplets that Threadripper 5000 simply isn't a thing. You ought to factor that into your assessment of how they're doing.
Is anyone gullible enough to forget more than a decade of price gouging, low core counts and nearly nonexistent performance increases we got from Intel, vs. the high core counts, increasing performance, and lower prices we got from AMD?
And you are ready! to convince everybody... that whole freaking plandemic & communists mafia had nothing to do with prices gone up across the board. Good man!
zzzoom, so in other words, intel kept raising its prices when they had the lead, but its NOT ok for amd to raise its prices when they have the lead ? so who is gullible ? amd had the right to raise its prices, after all intel did it.
This is a market economy. Neither company cares about your emotional attachments or misgivings beyond what is profitable for them. AMD as the market underdog played up that position heavily, gaining significant goodwill with the enthusiast consumer market. However as Zzzoom mentioned just as is expected as soon as they retook the performance dominant position their aggressive pricing strategy evaporated.
If you're going to criticize Intel's market stagnation via mismanagement for a decade you can't just ignore the fiasco of AMD's awful Bulldozer architecture and the 4.5 year gap between the launch of Piledriver and the launch of Zen 1. It's not unreasonable to argue that, because Intel absolutely needed AMD to remain around at that time to avoid facing anti-trust issues, the lack of any real competitive alternative was a factor in their decision to stagnate, rather than it being just 'greed'.
FX series was as bad as it was for a couple of reasons - partly because AMD were starved of funding during the entire Athlon 64 era, and partly because Global Foundries utterly failed to develop their fabrication processes to be suitable for high-performance CPUs.
Nah, they just weren't that competitive. Athlon64 was decent (lot of credit to Jim Keller) but didn't let AMD take massive advantage of Intel's weakness during the Pentium 4 era because AMD fabs were capacity limited. Once Conroe came out mid 2006 the margins dried up rapidly and AMD had no good response and suffered a talent exodus. It's true Intel made it worse with exclusivity bonuses, but I think AMD's spiral toward selling their fabs would have happened anyway. No way they were going to catch up with tick-tock and Intel's wallet.
I've always felt the K10 wasn't aggressive enough, owing to AMD not having factored Conroe into their equations when K10 was designed. Then, like startled folk, they tried to take back the lead by a drastic departure in the form of Bulldozer; and that, as we know, sank them into the ditch. Nonetheless, I'm glad they went through the pain of Bulldozer: Zen wouldn't have been as good otherwise.
> FX series was as bad as it was for a couple of reasons
I thought I also heard they switched from full-custom layout to ASIC flow (maybe for the sake of APUs?). If so, that definitely left some performance on the table.
3D v-cache will be out before Zen 4 and should help close the gap if not regain the overall lead on the high end. The problem for AMD is the competition below the i9 vs R9 realm, where the E cores really pull more than their weight and help the i9 compete with the R9s in multi, but for the i5s and i7s vs their R5 and R7 counterparts, its even-Steven with performance cores, then you have the E cores as the trump card.
If AMD gains an average of ~10% in gaming FPS with the 3D cache onslaught, that should put them right back near the top...certainly much closer to the 12900K....
The 12900K is already huge. Each performance core is the size of about 4 E cores, so going to 16 P-cores would probably mean a 70% die size increase, and then you run into core-to-core communication issues. AMD got around it with Infinity Fabric, but that's why you have the higher latency access between cores in different core complexes, while Intel gives a more consistent access time on higher end products. Intel's current cores are mostly on a ring bus, so data travels from one core to the next, and getting to 16 doesn't scale well. They used a mesh topology in some Skylake CPUs, but that latency was too high and hampered performance badly, and you'd run into that same issue with 16C. That's without checking into yield: getting 16 cores on one die that all clock perfectly high is going to be a very, very rare chip. AMD gets around it using core complexes (CCX) of 4 cores each, put together into a CCD (core chiplet die), and Zen 3 (5000 series) reportedly moved to an 8C CCX, which makes the rare chips 8C if the full CCX works well, or else 6C if 2 cores can't make it, and that turns into a 5600X.
"at a lower price" Not really, if you take platform into account (and you have to!)
"Zen 4 isnt even competing with Alder Lake, Raptor Lake is rumored to be out before Zen 4" Potentially, but Zen 4 is a bigger jump from Zen 3 than Raptor is predicted to be from Alder. Raptor will have more E cores but it's on the same process, so it's likely to offer better perf/watt in multithreading but unlikely to increase overall performance substantially (unless they allow maximum power draw to increase).
"AMD has really screwed up with their launch cycle" Not really? They're still competitive in both price/performance (accounting for platform cost) and perf/watt. Zen 3D should shore up that position well enough.
Seems we're actually getting a Zen 3 refresh early next year. Alder Lake's lead also decreases with DDR4, gaming above 1080p (so basically anyone who would buy a 12900K for a gaming rig), it uses more power and with DDR5 you pay extra for memory.
Yeah, Alder Lake has some advantages. Not sure I'd call it a better overall package at the moment.
They really haven't screwed up as you would like to think. I do believe AMD was thrown off some by the unexpected performance in Hybrid design. They still do trade blows between some games, multi-threaded software, and on applications that are just not optimized for Alder Lake.
What I have noticed though in the days since Alder Lake's NDA went up and reviews came out, is that leaks about AMD's next gen Zen CPUs have begun to trickle out a little more than usual. Yes we have Zen 4 on the way, which will pave the way for DDR5 and PCIe Gen5 along with an uplift in IPC. However the real secret sauce may be in Zen 4D as the platform to build a heavily multi-threaded core package along with SMT enabled, and then Zen 5. The big picture is that AMD's version of a Hybrid CPU may include a combination of Zen 4D big cores and Zen 5 bigger cores. The Zen 4D are said to possibly carry as many as 16 cores per chiplet, too, so it would speak to a possible heavily multi-threaded efficient CPU, while sacrificing a little bit of single threaded performance to achieve it. The timeframe would also put the new Hybrid CPU on a collision course to battle Raptor Lake.
For once the CPU market has gotten interesting again, and the consumer ultimately wins here.
Since AVX-512 is working on ADL, it would be useful to test the AVX-512 vs AVX2 power consumption of ADL by running POVRAY using P-cores only and compare that maximum AVX2 power consumption to AVX-512 max power consumption using 3DPM.
Because max 272W power consumption of POVRAY as reported, includes 48W from E-cores too.
I'd actually prefer an all-Gracemont CPU for laptops. Seems like it would be better for intentionally maximizing battery life. Skylake+ level performance is perfect for most use cases.
I'm guessing you missed the articles describing the two separate mobile dies for Alder Lake? We've got Alder Lake-P (6P + 8E) for performance mobile designs, and Alder Lake-M (2P + 8E) for the ultra mobile low power SKUs.
I'm very happy with exactly-Skylake-level performance in my desktop :). I'd more than gladly take the same performance and cut the power in half. I'm sure there's quite a big market for that kind of performance in a lower powered package regardless of form factor (mobile, desktop).
That's actually not great in power terms compared to what AMD can do with 8 Zen 3 cores on TSMC N7 - but yeah, in the context of die area, something built around (say) 2 P cores and 4 E cores can probably put in a very good showing for inexpensive devices.
PCIe 5.0 is currently useless to consumers and likely to be so for the duration of ADL's life. DDR5 is currently far more expensive and doesn't provide a compelling performance benefit for most users.
So, yeah - just as Comet Lake was a reasonable alternative to Zen 2 and 3 for many users despite being stuck on PCIe 3.0, so ADL doesn't really make a compelling argument for itself just by having PCIe 5.0.
Good performance with DDR5, but this review is less than complete: from the "Intel vs AMD" section onward, DDR4 was not used for the tests. That's odd, because DRAM and motherboard prices decide the pick. It's sad that an important review with so much effort behind it (now more than 17,000 words) only scratches the surface. Was Windows 10 or 11 used for the gaming tests? And why was DDR4 left out, and why the not-good-enough 2080 Ti?
Der8auer had a video at... Kleindeck i think? Where they analyzed the transistors of Intel 10nm vs AMD 7nm processor. Essentially they are almost equal.
N7 is a little more dense than Intel's 10nm-class process - 15-20% in comparable product lines (e.g. Renoir vs. Ice Lake, Lakefield vs. Zen 3 compute chiplet). There is no indication that Intel 7 is more dense than previous iterations of 10nm. N7 also appears to have better power characteristics.
It's difficult to tell, though, because Intel are pushing much harder on clock speeds than AMD and have a wider core design, both of which would increase power draw even on an identical process.
I’m a little surprised by the low level of attention to performance/watt in this review. ArsTechnica gave a bit more info in that regard, and Alder Lake looks terrible on performance/watt.
If Intel had achieved this performance with similar efficiency to AMD I would have bought Intel stock today.
But the efficiency numbers here are truly awful. I can see why this is being released as an enthusiast desktop processor -- that's the market where performance/watt matters least. In the mobile and data center markets (ie, the Big markets), these efficiency numbers are deal breakers. AMD appears to have nothing to fear from Intel in the markets that matter most.
Yeah, the power consumption of 12900K is quite bad. From other reviews, it's pretty clear that highest end air cooling is not enough for 12900K, and you will need a thick 280mm or 360mm water cooler to keep 12900K cool.
I think there are some issues with temperature readings on ADL. A lot of software showcases 100C with only 3 P-cores loaded, but even with all cores loaded, the CPU doesn't de-clock at that temp. My MSI AIO has a temperature display, and it only showed 75C at load. I've got questions out in a few places - I think Intel switched some of the thermal monitoring stuff inside and people are polling the wrong things. Other press are showing 100C quite easily too. I'm asking MSI how their AIO had 75C at load, but I'm still waiting on an answer. An ASUS rep said that 75-80C should be normal under load. So why everything is saying 100C I have no idea.
They also show the 5900x somehow drawing more power than a 5950x at full load. While I'm sure Intel is drawing more power, I question their testing methods given we know there is very little chance of a 5950x fully loaded drawing less than a 5900x unless they won or lost the CPU lottery.
TechSpot and TPU also show that, and it has been explained before that the 5950X gets the premium dies and runs at a lower core voltage than the 5900X, thus it pulls less power despite having more cores.
"ArsTechnica gave a bit more info in that regard, and Alder Lake looks terrible on performance/watt."
I was suspicious that this is the reason Intel finally went hybrid on mainstream. Golden Cove can have horrible perf/watt since Gracemont exists for low power scenarios.
Well, you at least have to appreciate that Maxiking saved significant time and effort by typing "2k21" instead of "2021". Attention to efficiency is something we can all respect and admire in MMXXI.
[Intel 12th gen consumes less power in gaming across the board vs Ryzen 5000 series](https://www.reddit.com/r/intel/comments/qmw9fl/why... [Even the Multi threaded perf per watt is also better for 12900K compared to 5900X](https://twitter.com/capframex/status/1456244849477... It is only in the specific cases where the 12900K needs to beat the 5950X in multithreaded loads that it has to crank up the power. But for typical users Intel is both the perf/watt and perf/dollar champion.
Until you look at the gaming power consumption and realize Intel is beating AMD in efficiency in games and general use. Check igorslab's review. It's only in highly threaded workstation applications like Blender, or synthetics that use 100% of load, that Intel starts using quite a bit of power. But 99% of users will never do those, all they care about is gaming, browsing, video, etc.
So if 99% of users don’t need multiple cores, I guess intel made a huge mistake in including them. They could have just made a dual core processor and “99%” of users would have been just fine.
I think it’s HILARIOUS that people are arguing that the efficiency of this thing is just fine so long as you don’t actually fully utilize it.
You mean like how I can drive my Civic in a sane manner and get 40mpg or hammer it and get 20mpg? Push the CPU (or automobile) out of its efficient zone and it becomes less efficient. You can do the same thing with Zen 3 CPUs. They get a little faster and use a lot more power. Same as Intel CPUs.
Having lots of potential power and high power consumption is exactly what mobile phone and laptop CPUs do. That Intel does that in desktops too is not surprising.
99% of users don't need a 12900K. Presumably the people who do are likely to use it for these tasks where it will actually show a performance improvement over a cheaper CPU (accepting that some people overspend for e-peen reasons and will buy one for gaming where a 12600K would do just as well).
99.9999999999% of users don't need a 12900K peak performance constantly, even if they will use the peak performance sometimes, including times when it definitely counts.
I won't lie and say I have the best of the best, but Zen 2 vs Zen 1 cut down my build times noticeably. That helps keep me in flow, even if it's only saving me a few minutes per day. For people like me with ADHD or other attention-related issues, this can be a massive boon.
Does efficiency really matter for top end desktop SKUs? Intel/AMD tend to clock these near their voltage walls, WAY outside the "sweet spot" of a given architecture, and you can get a boatload of efficiency back just dropping boost clocks by 10% yourself.
Now, if the laptop SKUs end up being power hungry, that's a different story.
Same core design, same process. So.... I'm sure Intel will lower clocks for mobile and servers to get power usage down, but once they lower the clocks, how will the performance compare?
For now, efficiency doesn't matter for desktops, but in a few years time, we are very likely to see laws passed that will mandate high efficiency in high end desktops.
There is already some legislation in the works that calls for exactly this, but it has not been passed yet.
And how, pray tell, are they going to legislate that? Max power usage for a CPU? We've already seen how california tried it, and predictably they made a mess of it.
INB4 Intel just refuses to sell anything but a Celeron to Californians and mysteriously tech resellers in Arizona get a bunch of Cali orders. Hmmmm.....
don't ask me, IDK how law makers will do it. Just be aware that there are some really dumb laws that are already in existence, and the world is going to be entering an age of power shortages, along with carbon neutral incentives.
Considering how things are going currently, I think it'll just be a 100% tax on desktop CPUs that can't hit some efficiency metric that Apple has designed.
Doubtful given how poorly the existing law works. All they do is measure computer idle wattage. The lawmakers aren't techies. And they're busy handling the blowback from carbon neutrality bills that the public believes are related to power shortages and cost spikes.
Mobile parts will have cores and clocks slashed to hit mobile power levels; 7W-45W with 2p2e - 6p8e
However, given that a single P core in the desktop variant can burn 78W in POV Ray, and they want 6 of them in a mobile part under 45W, that means a lot of restrictions apply.
Even 8 E cores, per this review, clock in at 48W!
That suggests a 6p8e part can't be anywhere near the desktop part's 5.2GHz/3.9GHz Turbo clocks. If there is a linear power-clock relationship (no change in voltage) then 8 E cores at 3GHz will be the norm. 6 P cores on POV-Ray burn 197W, so hitting 45W would mean throttling all 6 cores to 1.2GHz.
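For reference, the straight-line estimate in this comment can be reproduced in a few lines of Python. This is only a sketch of the commenter's assumption (power proportional to clock at a fixed voltage). It takes the 5.2GHz and 3.9GHz turbo figures as the clocks at which the 197W and 48W readings were taken, which is an assumption, and the function names are made up for illustration.

```python
def clock_for_budget(base_clock_ghz, base_power_w, budget_w):
    """Clock that fits a power budget if power scales linearly with clock."""
    return base_clock_ghz * budget_w / base_power_w


def power_at_clock(base_clock_ghz, base_power_w, clock_ghz):
    """Power drawn at a new clock under the same linear assumption."""
    return base_power_w * clock_ghz / base_clock_ghz


# Figures quoted in the comment; pairing them with these clocks is an assumption.
p_clock_at_45w = clock_for_budget(5.2, 197.0, 45.0)
e_power_at_3ghz = power_at_clock(3.9, 48.0, 3.0)

print(f"6 P-cores squeezed into 45W: {p_clock_at_45w:.2f} GHz")
print(f"8 E-cores at 3.0GHz: {e_power_at_3ghz:.1f} W")
```

Since real parts usually drop voltage along with clock, a linear model like this tends to be pessimistic about the clocks reachable at a given power budget.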
Except that we know that the power-clock ratio is not linear and never has been. You can drop a few hundred MHz off of any Intel chip for the past 5 generations and get a much better performance per watt ratio. This is why mobile chips don't lose a lot of MHz compared to desktop chips.
Suggests that a 4pNe part might be similar while the 6p8e part would probably be a 2.3GHz part that could turbo a single core up to 4GHz or all cores to 3.6GHz.
Yes, once it gets in the way of performance, and Intel's horrible efficiency means you need high end water cooling to keep it running, whereas AMD does not. Intel's inefficiency is going to be an issue for those who like air cooling, which is a lot of the market.
Trouble is I'm not seeing "horrible efficiency" in these benchmarks. The 12900k is merely pushed far up the curve in some of these benches - if the Zen3 parts could be pushed that far up, efficiency would likewise drop quite a bit faster than performance goes up. Some people already do that. PBO on the 5900x does up to about 220W (varies on the cooler).
"if the Zen3 parts could be pushed that far up" But you wouldn't, because you'd get barely any more performance for increased power draw. This is a decision Intel made for the default shipping configuration and it needs to be acknowledged as such.
As a typical purchaser of K chips the default shipping configuration holds rather little weight. A single BIOS switch (PBO on AMD, MTP on Intel), or one slight change to Windows power settings, is pretty much all the efficiency difference between 5950x and 12900k. It pains me every time I see a reviewer or reader fail to realize that. The chips trade blows on the various benches because they're so similar in efficiency, yet each by their design has strong advantages in certain commonplace scenarios.
If the competition are able to offer similar performance and you don't have to shell out the cash and space for a 360mm AIO to get it, that's a relevant advantage. If those things don't bother you then it's fine, though - but we're in a situation where AMD's best is much more power efficient than Intel's at full load, albeit Intel appears to reverse that at lower loads.
Clock/power scales geometrically. The 5900HS retains ~85% of the 5800X's performance while using 35-40W stable power vs 110-120W for the 5800X. That's almost 3x more efficient. Intel is clocking desktop ADL to the moon, it doesn't mean ADL is going to scale down poorly, if anything I expect it to scale down very well since the E-cores are very performant while using a fraction of the power and according to Intel can operate at lower voltages than the P-cores can, so they can scale down even lower than big cores like ADL P-cores and zen 3. ADL mobile should be way more interesting than ADL desktop.
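To put a number on "almost 3x more efficient", here is a rough perf-per-watt calculation using only the figures quoted in the comment above (about 85% of the performance at 35-40W versus 110-120W). The helper name is hypothetical, and the result is only as good as those quoted ranges.

```python
def efficiency_gain(perf_fraction, low_power_w, high_power_w):
    """How many times more work per watt the low-power part delivers."""
    return perf_fraction * high_power_w / low_power_w


# Least and most favorable pairings of the quoted power ranges.
conservative = efficiency_gain(0.85, low_power_w=40.0, high_power_w=110.0)
optimistic = efficiency_gain(0.85, low_power_w=35.0, high_power_w=120.0)

print(f"5900HS vs 5800X perf/watt: {conservative:.2f}x to {optimistic:.2f}x")
```

With those inputs the ratio lands between roughly 2.3x and 2.9x, so "almost 3x" corresponds to the favorable end of the range.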
Clock/power scales linearly. It's only Voltage/power that scales geometrically
If you have to increase voltage to increase clock then you can say clock/power is geometric.
So if at a fixed voltage you can go from 2GHz -> 2.5GHz the power usage only goes up by 25%
If you also bump voltage up, however, from 1.2V -> 1.4V, power usage might go up 36%, so combined (1.25 x 1.36) you would see roughly a 70% increase in power usage to hit 2.5GHz.
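The arithmetic in this exchange follows the usual first-order approximation for dynamic power, P proportional to f·V². The sketch below assumes that model (it ignores leakage and anything else a real chip does) and uses the 2GHz/2.5GHz and 1.2V/1.4V figures from the comment. Note that the frequency and voltage factors multiply rather than add.

```python
def relative_power(f_old_ghz, f_new_ghz, v_old, v_new):
    """Power ratio new/old, assuming dynamic power ~ frequency * voltage^2."""
    return (f_new_ghz / f_old_ghz) * (v_new / v_old) ** 2


clock_only = relative_power(2.0, 2.5, 1.2, 1.2)     # fixed voltage
voltage_only = relative_power(2.0, 2.0, 1.2, 1.4)   # fixed clock
combined = relative_power(2.0, 2.5, 1.2, 1.4)       # both raised together

print(f"clock only:   +{(clock_only - 1) * 100:.0f}%")
print(f"voltage only: +{(voltage_only - 1) * 100:.0f}%")
print(f"combined:     +{(combined - 1) * 100:.0f}%")
```

Under this model the clock bump alone costs 25%, the voltage bump alone about 36%, and the two together about 70%.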
Alder is actually more efficient than 5950X in most real world scenarios. PC World did a proper real world power consumption comparison and Alder Lake was superior in most cases. Unless AMD cuts prices dramatically, it makes zero sense to buy Ryzen at this moment ... unless you are a mindless fanboi that is!!
"Unless AMD cuts prices dramatically, it makes zero sense to buy Ryzen at this moment ... unless you are a mindless fanboi that is!!" Or if, you know, you pay attention to how much a whole system costs and make a decision based on that instead of assuming cheap CPU = cheap system?
Mindless??!! Why?? Cause I can buy a Ryzen 5000 cpu to drop into my current motherboard to replace my Zen+ cpu (2700x). Mindless cause I refuse to pay $750 for 12900k, $450+ for Asus mb not even the best middle of the road, $330 for 32gb of ddr5 and this doesn't even include the aio360 cooler needed. You do the math.
What puzzles me is why you haven't already dropped a Zen 3 in that. Zen 3 is basically end of the line for that board. I do not know if "Zen 3+" with vertical cache will even come out, much less be available and affordable for someone who shuns ddr5 costs.
My argument would be anyone looking at performance per watt on a CPU like this is a bit crazy. I've never considered that important for a CPU you're going to run well out of spec with big cooling anyway.
I'm far more interested in perf per watt on the mobile version. That's where it's going to matter as you can't just throw more cooling at a laptop. Especially compared to the Ryzen chips that have extremely good perf per watt.
I don't think you have to be crazy, you just have to be one of those few for whom it matters -- i.e. those who execute long-running high-CPU load workloads on a regular basis.
Otherwise, yeah, it's mostly irrelevant, given the performance per watt of more typical workloads -- even gaming -- are pretty much inline with the equivalent Ryzen CPUs.
I'm disappointed by power consumption figures from almost all outlets. Intel usually pushes the i9 parts to the voltage and frequency wall, meaning the power consumption would obviously be bad at max possible clocks.
I'm not saying I necessarily trust this source, but I don't see a reason to think it's fake. You can have ~92% of the performance for ~68% of the power.
That's not what I meant. I wanted to see tests at different power limits to see how it does in performance/watt metrics against zen 3 at the same limits.
Apple rivals desktop CPUs in SPECINT, which clearly loves memory bandwidth and cache. DDR5 alone boosted ADL's score in SPECINT MT by 33% from a DDR4 configuration. In Cinebench and geekbench the m1 pro and max are closer to workstation laptop processors. We'll see what happens with ADL mobile.
The 12600K has basically the same Geekbench score as the M1 Max, and yet its 10 cores consume 3 times more power than the M1. On the 12900K, just using the 8 E-cores consumes more than the M1 Max with its CPU at peak power. So we shouldn’t expect big miracles in mobile, unless Intel starts selling 90W chips. As for Cinebench, it will be difficult for Apple Silicon to come out on top until Apple implements some sort of Hyperthreading, since Cinebench takes good advantage of it.
The H55 segment will offer 8+8 at 45W and H45 will offer 6+8 at 35W, no need to compare the 12600k. We have models for how mobile uses power compared to desktop. They retain 80-90% of the performance at 1/3 to 1/4 the sustained power. 5900HS @ 35W cTDP (35-40W actual power) has around 85% the performance of the 5800X @110-120W in cinebench. The 11980HK at 45W has almost 90% the performance of the 11700k at 130-150W (non-AVX) in geekbench 5.
Closer to 15% drop in Geekbench, and probably at much higher package peak power draw than 45W, considering what Anandtech has measured for the 11980HK in Multithreaded tasks (around 75W).
The 11980HK respects the configured PL/cTDP for the most part. It only hits 75W during the initial cold start. It uses 65W sustained power when configured to PL 65 and 45W when configured to 45W https://www.anandtech.com/show/16680/tiger-lake-h-... I screwed up using tom's results for geekbench, apparently it is at PL 65 unlike Anand's for the TGL test system. But it also scores 9254 vs anandtech's 11700k scoring 9853, so within around 94% performance of its desktop counterpart. I've seen some higher scores on GB itself but using "official" sources that's pretty close to 2x more efficient. I can't seem to find any real PL 45 results for GB5. Point is, scaling down isn't a problem, and ADL will no doubt scale down better thanks to E-cores and just overall better efficiency based on what we've already seen, like gaming efficiency according to igorslab and PL 150 making barely any difference in performance compared to PL 220. I think Intel is in a unique position since AMD doesn't have small cores anymore.
What you are failing to realize is that Geekbench, due to its short tests nature, ends up spending a lot of time at peak performance and not at sustained performance. And no, the 11700k doesn’t score 9853 - you are looking at averages on the Geekbench site which are not reliable to make this sort of comparison. Notebookcheck geekbench score is close to 11300, while the 11980HK scores closer to 9700.
Geekbench runs for a few minutes afaik. The peak you're describing only lasts for a split second and quickly falls down to the sustained power over a few seconds to 30 seconds. And no, I'm not looking at averages from geekbench, I literally told you I'm using anand's score for the 11700k and tom's score for TGL mobile. https://www.anandtech.com/bench/CPU-2020/2972
geoxile, Geekbench is a bunch of discrete tests with pauses in between. The value that you used is almost exactly the average in the Geekbench database, and we know that the 11700 gets much higher than that. You can also check that Anandtech never showed Geekbench results with that CPU in any of its reviews of the 11700. Don’t know why that value is there.
Describing a context switch to load the next bench as "pauses" is borderline gaslighting. It's a memory workload, not idling. PL2 on the 11980HK lasts for seconds from cold start at PL1 45.
It's almost or it's exact. Anandtech lists those scores and I have no reason to doubt they copied them or made them up. Tom's has slightly higher scores at 10253 @ stock. That's a 4% variance, probably due to tom's using DDR4 3600 with tuned timings while anandtech used DDR4 3200. It's only with a 5Ghz OC toms can even break through 11000, let alone score 11300. https://www.tomshardware.com/reviews/intel-core-i7...
It’s not a context switch; Geekbench deliberately pauses between tests to avoid throttling. Read about it. Notebookcheck didn’t make them up either and you can see higher scores inside the Geekbench database.
11980HK to 11700K isn't a useful analogue - it's 10SF vs. 14++++ and the caches on TGL-H45 are larger than on RKL.
I'd be comfortable predicting that ADL mobile will be ~15% faster than TGL all-round, with better gains in power-limited multithreading scenarios where the E cores are properly utilised.
That's a valid point. But we can still look to the zen 3 APUs vs the desktop 5800X and see similar or better perf/W scaling. Based on what we've seen so far ADL is very comparable to zen 3 in efficiency in heavy synthetic loads when set to optimal PL (e.g 150W) and far, far more efficient in mixed loads like gaming, where a 12900k with PL 241 uses 60-70% the power of a stock 5950X. These are good signs.
"Apples to Apples", like 8 TGL cores vs 8 mixed ADL cores I'd agree. But the leaked configurations are 8+8 for 45W (up to 55W cTDP), or 6+8 at 35-45W. I think e-cores will make a huge difference.
*raises hand* You can restrict the TDP of any Intel/AMD consumer processor. Or you can raise it, subject to physical/overclocking limits. It's user choice. I never complain when they're giving us choice.
Process node advancement is in the right direction. In terms of efficiency, Intel is one full node behind the leading edge, which Apple basically has exclusively. No other high-volume chip is comparable to N5, even the Snapdragon 888 (though Samsung calls it 5nm).
Huge improvements for Intel, beats Zen 3 soundly in performance almost across the board. The 12900k is a beast. However, it's the 12600k that is the real champ: half the price of a 5800x and it still beats it.
No surprise that 12th gen is sold out everywhere online, it seems like the Zen 3/AMD era is dead and Intel is back.
I find the argument for disabling AVX512 really not convincing. If a process is running on an E core and reaches a nonexistent instruction, it traps into the OS. The OS can determine that it's an instruction that can be executed on a P core and reschedule it there, keeping note to not move that process/thread back to an E core. It shouldn't have been too hard.
It basically comes down to a context-switch. And those take a couple microseconds (i.e. many thousands of CPU cycles), last I checked. And that assumes there's a P-core available to run the thread. If not, you're potentially going to have to wait a few timeslices (often 1-10 ms).
Now, consider the case of some software that assumes all cores are AVX-512 capable. This would be basically all AVX-512 software written to date, because we've never had a hybrid one, or even the suggestion from Intel that we might need to worry about such a thing. So, the software spawns 1 thread per hyperthread (i.e. 24 threads on the i9-12900K) but can only run 16 of them at any time. That's going to result in a performance slowdown, especially when you account for all the fault-handling and context-switching that happens whenever any of these threads tries to run on an E-core. You'd basically end up thrashing the E-cores, burning a lot of power and getting no real work done on them.
Forgot to address the case where the OS blocks the thread from running on the E-core, again.
So, if we think about how worker threads are used to split up bigger tasks, you really want to have no more worker threads than actual CPU resources that can execute them. You don't want a bunch of worker threads all fighting to run on a smaller number of cores.
So, even the solution of having the OS block those threads from running on the E-cores would yield lower performance than if the app knew how many AVX-512 capable cores there were and spawned only that many worker threads. However, you have to keep in mind that whether some function uses AVX-512 is not apparent to a software developer. It might even do this dynamically, based on whether AVX-512 is detected, but this detection often happens at startup and then the hardware support is presumed to be invariant. So, it's problematic to dump the problem in the application developer's lap.
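The oversubscription described above is easy to see with a toy model of the i9-12900K. The topology below is written out by hand (8 P-cores with two hardware threads each, 8 single-threaded E-cores, AVX-512 only on the P-cores, as discussed in this thread). The `CoreGroup` type is hypothetical and does not correspond to any real OS or library API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreGroup:
    name: str
    cores: int
    threads_per_core: int
    has_avx512: bool


# Hand-written description of a 12900K with AVX-512 left enabled (assumption).
I9_12900K = (
    CoreGroup("P", cores=8, threads_per_core=2, has_avx512=True),
    CoreGroup("E", cores=8, threads_per_core=1, has_avx512=False),
)


def hardware_threads(groups):
    """What a naive app sees: every logical CPU counts as a worker slot."""
    return sum(g.cores * g.threads_per_core for g in groups)


def avx512_threads(groups):
    """Logical CPUs that can actually execute AVX-512 code."""
    return sum(g.cores * g.threads_per_core for g in groups if g.has_avx512)


naive_workers = hardware_threads(I9_12900K)
usable_workers = avx512_threads(I9_12900K)
print(f"naive pool: {naive_workers}, AVX-512 capable: {usable_workers}, "
      f"oversubscribed by {naive_workers - usable_workers}")
```

A pool sized from the first number leaves 8 workers competing for slots they cannot use, which is the thrashing scenario the comment describes. Sizing from the second number would need the application to know about the split in the first place.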
Plus, enabling AVX-512 on the big Cores would have meant having it on the E (Gracemont) cores also, or switching workloads from P to E cores on the fly won't "fly". And having AVX-512 in Gracemont would have interfered with the whole idea of Gracemonts being low-power and small footprint on the die. I actually find what Ian and Andrei did here quite interesting: if AVX-512 can really speed up whatever you want to do, disable the Gracemonts and run AL in Cove only. If that could be a supported option with a quick restart, it might be worthwhile under the right circumstances.
There is no relevant AVX-512 state before the first AVX-512 instruction is executed. So trapping and switching to a P-core is entirely doable. Switching back would probably be a bigger problem, but one probably does not want to do that anyway.
Possible problem: how would you account for a scenario where the gain from AVX-512 is smaller than the gain from running additional threads on E cores? Especially when some processors have a greater proportion of E cores to P cores than others. That could get quite complicated.
If you look at the Intel's prerelease presentation about Thread Director carefully, you see they are indeed talking about moving the integer (likely control) sections of AVX threads to E-cores and back as needed.
I'll reply to my comment because it seems the original one was not understood.
When you have an AVX512-using thread on a P core, it might happen that it needs to be suspended, say, because the CPU is overloaded. Then the whole CPU state is saved to memory so the execution can later be resumed as if nothing has happened. In particular, it may be rescheduled on another core when it's time for it to run again. If that new core is a P core, then we're safe. But if it's an E core, it might happen that we hit an AVX512 instruction. Obviously, the core cannot execute it so it traps into the OS. The OS can check what the offending instruction was and determine that the problem is not the instruction, but the core. So it moves it back to a P core, stores a flag that this thread should not be rescheduled on an E-core and keeps chugging.
Now, someone suggested that there might be a problem with the CPU state. And, indeed, you can not restore the AVX512 part of the state on an E core. But it cannot get changed by an E core either, because at the first attempt to do it it will trap. So the AVX512 part of the state that was saved on a P core is still correct.
Since this isn't being done, there might be (but not "must be" - intel, like AMD, will only do what is good for them, not what is good for us) some problem. One being that an AVX512 thread will never be rescheduled on an E core even if it executes a single AVX512 instruction. But it's still better than the current situation which postpones the wider adoption of AVX512 yet again. I mean, the transistors are already there!
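The trap-and-pin idea argued for in this comment can be modelled in a few lines. This is a toy simulation, not how any real OS scheduler works: the core numbering, the `Thread` record and the `IllegalInstruction` exception (standing in for the invalid-opcode fault) are all invented for illustration.

```python
from dataclasses import dataclass

P_CORES = frozenset(range(0, 8))    # assumed numbering: cores 0-7 have AVX-512
E_CORES = frozenset(range(8, 16))   # assumed numbering: cores 8-15 do not


class IllegalInstruction(Exception):
    """Stands in for the fault an E-core raises on an AVX-512 instruction."""


@dataclass
class Thread:
    name: str
    allowed: frozenset = P_CORES | E_CORES   # may run anywhere at first
    traps: int = 0


def execute(core, instr):
    if instr == "avx512" and core in E_CORES:
        raise IllegalInstruction(instr)


def pick_core(thread, preferred):
    # Honor the scheduler's preference only if the thread may still run there.
    return preferred if preferred in thread.allowed else min(thread.allowed)


def run(thread, steps):
    """steps: (preferred_core, instruction) pairs; returns the cores used."""
    used = []
    for preferred, instr in steps:
        core = pick_core(thread, preferred)
        try:
            execute(core, instr)
        except IllegalInstruction:
            # The fault is the core's, not the program's: pin to P-cores, retry.
            thread.traps += 1
            thread.allowed = P_CORES
            core = pick_core(thread, preferred)
            execute(core, instr)
        used.append(core)
    return used


worker = Thread("worker")
cores_used = run(worker, [(8, "int"), (8, "avx512"), (9, "int")])
print(cores_used, worker.traps)
```

The model also shows the drawback conceded above: after one AVX-512 instruction the thread never returns to an E-core, even for plain integer work.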
Let's be real, no one expected anything else, the first time Intel can use a different node that still has its problems and AMD is embarrassed and slower again.
Lol, once Intel starts making their GPUs using TSMC, AMD back to being slowest there too.
Why disparage AMD !? the harder these companies compete, the better for consumers! Stop being a mindless fanboi of any company and start thinking rationally and become a fan of your own pocketbook !
"But but but it wasn't a CONSUMER (read: cheap enough) CPU!!!!!!"
- THE AMDrones who forget that AMD sold a 6 core phenom that lost game benchmarks to a dual core i3 and then spent the next 5 years selling fake octo core chips.
"But but but it wasn't a CONSUMER (read: cheap enough) CPU!!!!!!" Correct, there's a difference between a CPU that needs a consumer-level dual-channel memory platform and a workstation-grade triple-channel (or greater) platform. It doesn't sound so absurd if you don't strawman it with emotive nonsense and/or pretending that Used prices are the same as New.
I don't think anybody's forgotten how much better Intel's CPUs were from Core 2 all the way through to Skylake. The fact remains that when AMD returned to competition, they did so by opening up entirely new performance categories in desktop and notebook chips that Intel had previously restricted to HEDT hardware. I don't really understand why Intel fanbots like Maxiking feel so pressed about it, but they do, and it's kind of amusing to see them externalise it.
too bad that was HEDT, not mainstream, that's where your beloved Intel stagnated the CPU market, maxipadking, maybe you should cry more. Intel's non-HEDT lineup was so confusing I gave up trying to decide which to get, so I just grabbed a 5830k and an Asus X99 Deluxe, and was done with it.
The current street prices of $600+ for high-end desktop CPUs are comparable to HEDT prices. Let's face it, the HEDT market is underserved right now as a cost-saving measure (make them spend more for bandwidth) and not because the segmentation was bad.
In sum, my opinion is that ignoring the HEDT line doesn't make much sense because segmenting the market was a good thing for consumers. Most people didn't end up needing the advantages that the platform provided before that era got excised with Windows 11 (unrelated rant not included). That provided a cost savings.
" The current street prices of $600+ for high-end desktop CPUs are comparable to HEDT prices " i didnt say current, as you saw, i mentioned X99, which was what 2014, that came out ?
" Let's face it, the HEDT market is undeserved right now as a cost-saving measure " more like intel cant compete in that space due to ( possibly ) not being able to make a chip that big with acceptable yields so they just havent bothered. maybe that is also why their desktop cpus maxed out at 8 cores before gen 12, as they couldn't make anything bigger. " doesn't make much sense because segmenting the market was a good thing for consumers " oh but it is bad, intel has pretty much always had way to many cpus for sale, i spent i think 2 days trying to decide which cpu to get back then, and it gave me a head ache. i would of been fine with z97 i think it was with the I/O and pce lanes, but trying to figure out which cpu, is where the headache came from, the prices were so close between each that i gave up, spent a little more, and picked up x99 and a 5930k( i said 5830k above, but meant 5930k) and that system lasted me until early last year.
What a pathetic attempt at trolling. Not sure if you noticed but Ryzen CPUs actually win lots of the game benchmarks, ties lots more; and many of the ADL wins are only with the very top CPU with DDR5. In several games even the 5800X beats ADL (even against DDR5). Zen3 is now a full year old, no v-cache yet, the next refresh which is coming soon will probably beat ADL across the board (still without DDR5). Granted, Intel still dominates anything that makes heavy use of AVX-512, which is... almost nothing, you can count'em on one hand's fingers.
Considering the current price of DDR5, even for a brand-new system where you have to buy everything including the RAM, a top-end ADL system is a pretty bad value right now. But thanks to this release the price of Zen3 CPUs is going further down, I can now find a 5900X for $480 on stockx, that's a good discount below MSRP (thanks Intel! since I've been waiting for that to upgrade from my 5600X). That's also the same street price I find today for the 12700K; the 12900K is through the roof, it's all out of stock in places like newegg, or $1.5K where I found stock although the KF is much less bad.
Also thanks to all the Intel fans that will burn cash in the first generation of DDR5 (overpriced and also with poor timings) so when Zen4 ships, 1y from today, DDR5 should be affordable and more mature, idem for PCIE5, so we Ryzen users can upgrade smoothly.
Don't waste your time responding, you can't account for abject stupidity. This is the absolute best CPU Intel could possibly build. Ok, it beats AMD by a couple percent in single threaded, but loses by a higher margin in multithreaded while consuming twice the power. Shortly, AMD will easily regain the performance crown with v-cache, while we wait for Zen 4. Sadly another poor review by www.IntelTech.com. Nobody wants a room heater for a CPU.
Last I looked, the vast majority of Anandtech readers don't run long-lasting 100% CPU multithreaded workloads, which is the only scenario where this one CPU falls a long way behind in power consumption.
Competition is good, and Intel has a competitive CPU on its hands, after a long time (for them) without one, and the reviews reflect that fact.
> the vast majority of Anandtech readers don't run > long-lasting 100% CPU multithreaded workloads
How many of us are software developers? My project currently takes about 90 minutes to build on a Skylake i7, and the build is fully multithreaded. I'm looking forward to an upgrade!
I'll point out that the Anand review uses JEDEC standard RAM timings. For DDR5 that's not terrible today, but for DDR4 it is. I mean, DDR4-3200 CL20?? A sister site (Toms) used commonplace DDR4 latencies (3200 CL14) and found it superior to DDR5 (using JEDEC B latencies) for gaming and most tasks, as well as putting ADL comfortably ahead of Zen3 in games. A further BIOS setting they made sure of was to allow ADL to sustain turbo power. Not sure how much that affected results. To be fair I did not hear them enabling PBO on Zen 3, which would be the comparable feature.
But for now I wouldn't be assuming that Ryzen CPUs win even the majority of games, and I absolutely wouldn't assume ADL needs DDR5 to reach its game potential. Most of these reviews out are preliminary, given a short window of time between product sample and NDA lifting.
You mention that you were able to enable AVX-512. Did you have to disable the E-Cores for that option to appear, or was that option available regardless of enabling/disabling the E-Cores? If it is the latter, was the system stable?
>Why use windows 10 for gaming performance ?? >The very same reason they use a 2080Ti.
Because, you pair of dinguses, they want to be able to compare apples-with-apples.
So they benchmark with a defined OS and peripherals to minimise the number of things which change from run to run which means you can directly compare using their benchmarking results CPUs from previous generations.
If you update the GPU as well then all you are doing is benchmarking different combos of GPUs and CPUs and you never end up with stuff which is directly comparable.
If you want simple gaming benchmarks there are any number of websites which will give you that. But for those of us who care about proper benchmarking and directly comparable results, Anandtech always delivers.
And far from being incompetent, doing benchmarking properly requires a lot of work over many hours and requires maintaining the suite and making sure everything keeps ticking along properly.
In short; naff off back to your little troll holes and stop complaining about things that you know nothing of, k?
Regarding the AVX-512 situation, I think it would be best to assume that the units will be fused off, as possibly some chips made it out with the ability to enable it, hence the obfuscation in the UEFI firmware, unless already known.
A couple of things that might be interesting to look at / verify: 1 - is it only on review samples vs. retail? 2 - is it on i9s vs. i7s, assuming the i5s would not have it anyway?
Not that power consumption matters for high-end desktop CPUs, but the Alder Lake platform is more efficient than the Zen 3 platform in most real-world scenarios (read PC World's analysis). So, if you are worried about power consumption that much you should steer away from Ryzen!! I guess your only choice is paying the premium for M1 Max's closed system!
Honestly, fanbois should stop making ignorant comments about power consumption and "TDP" (lol) just to find an excuse to attack a good product because it's not from the company they mindlessly worship ... SMH!
"Honestly, fanbois should stop making ignorant comments about power consumption and "TDP" (lol) just to find an excuse to attack a good product because it's not from the company they mindlessly worship ... SMH!"
This applies equally to everybody saying TDP doesn't matter and to people pretending that the full-load power of a 12900K is representative of all ADL chips under common loads.
Apple's CPU performance and performance/watt are impressive, but it's going to take a lot more than that to make Intel/AMD start quaking in their boots, and that's not going to happen as long as Apple remains solely a vertical integrator of premium priced computers.
If anything, Apple's recent advances will only galvanize AMD and Intel's CPU designers now they can see what can be achieved, and how.
As long as Apple monopolizes TSMC’s leading edge nodes it really doesn’t matter how much Intel tries until they can get I4 online.
Right now Intel can’t beat TSMC’s N5 or N5P process and AMD can’t afford either. On the flip side that means AMD can’t afford to design a better CPU because they’re also stuck on N7 and N7P.
People are focusing too much on nodes! Apple's node advantage over AMD isn't that big in terms of what efficiency you get out of it. AMD is already using the N7+ node in some of its processors, and that puts it just around 10% behind the N5 node used by the M1 Max in performance per watt.
Go from 8P2E to 16P4E and power only goes up to 120W, and the M1 scores could double to 106 SPECint2017_r and 162 SPECfp2017_r, barring complexity due to memory bus overhead, CPU bus/fabric communication overhead, etc., since it's clear that the rate-n test performs far better when paired with DDR5 vs DDR4.
Actually the M1 Max is a 43W part at CPU peak power, not 60W (60 was for the whole machine). So when Apple doubles the cores it would be closer to 85W, and 170W when using 4 times the cores, which will almost certainly happen. That would mean that Apple could easily have more than double the performance at almost half the power consumption.
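As a back-of-envelope check on the scaling in the comment above, here is a minimal sketch. It assumes CPU power grows linearly with core count, which is optimistic (it ignores fabric and memory overhead); the 43W figure comes from the comment, and the function name is made up for illustration.

```python
# CPU-only peak power for the M1 Max, as quoted in the comment above.
M1_MAX_CPU_PEAK_W = 43.0


def scaled_cpu_power(core_multiplier: int, base_w: float = M1_MAX_CPU_PEAK_W) -> float:
    """Naive linear estimate: power grows in proportion to core count."""
    return base_w * core_multiplier


for multiplier in (1, 2, 4):
    watts = scaled_cpu_power(multiplier)
    print(f"{multiplier}x cores -> ~{watts:.0f} W")
```

Under that assumption the doubled and quadrupled parts land at 86W and 172W, which the comment rounds to 85W and 170W.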
It's not true that the 12900K must use 300W; in fact, it can get over 90% of the performance at 150W. If you set voltage manually, you can get the P-cores @ 3.2GHz + E-cores @ 2.4GHz within 35W (Source: GeekerWan). Its Cinebench R23 score is ST 1350, MT 14k. What about M1 Max? ST 1500, MT 12k. In addition, TSMC N5P is 30% better than 10nm ESF. Consider again if a 60W part is competitive at all with a 300W part?
The thing with Cinebench is that it takes a lot of advantage from hyperthreading, which is good of course when you have it, something that the M1 doesn’t have. The problem is, because of this and many other differences between CPUs, Cinebench is only a good benchmark to compare to the M1 in a small set of tasks. Not exactly a general definition of competition. As for power consumption, consider that the M1 Max CPU has a peak power of 43W, while other high end Laptop CPUs have a typical peak power at around 75-80W, even if they say 45W TDP.
I'm literally saying peak power during the test. 6*P-core @0.75v, not the BS TDP, my friend. I totally agree that Cinebench cannot tell everything. But consider the enormous gap between N5P and 10nm ESF. The result is reasonable and, for Intel fans, is good enough to call inspiring.
Everyone's going on about the performance and arguing about the power consumption; meanwhile almost all I'm thinking about is how good Gracemont Pentiums and Celerons are going to be for affordable systems.
I'm actually interested to see 6 and 8 core models, but that probably won't happen.
For the spec test on page 7 you write: "For Alder Lake, we start off with a comparison of the Golden Cove cores, both in DDR5 as well as DDR4 variants. We’re pitting them as direct comparison against Rocket Lake’s Willow Cove cores, as well as AMD’s Zen3."
Shouldn't Willow Cove read as Cypress Cove instead?
"There was a thought that if Intel were to release a version of Alder Lake with P-cores only, or if a system had all the P-cores disabled, there might be an option to have AVX-512. Intel shot down that concept almost immediately, saying very succinctly that no Alder Lake CPU would support AVX-512."
P-core/E-core scheduling is not an easy problem, and it has no currently-known general-purpose satisfactory solutions: see https://dl.acm.org/doi/abs/10.1145/3387110 . P/E "works" on phones and tablets because the issues are largely masked by having a single app open at a time. You can't do that in a desktop environment. Hitching the performance of your CPU to unproven scheduler algorithms is not a smart move by Intel. I can see why they've done it, but that doesn't excuse it. You can throw a lot of money at some problems, but that doesn't mean that you're going to get to a good solution. Some of the nastier ones in Computer Science have had billions poured into them and we're no closer to a solution.
My prediction is that in the absence of a very significant breakthrough, hybrid CPUs will continue to be dogged, for the foreseeable future, by weird difficult-to-reproduce performance/power glitches that no amount of firmware/OS updates are going to fix.
Well, if Apple continues to succeed with their hybrid CPUs, it stands to reason that others (Microsoft, Intel, and AMD specifically) will at least model their approach on Apple's: https://www.extremetech.com/computing/322917-cleve...
All operations with a QoS of 9 (background) were run exclusively on the four Efficiency (Icestorm) cores, even when that resulted in their being fully loaded and the Performance cores remaining idle. Operations with any higher QoS, from 17 to 33, were run on all eight cores.
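The behaviour described in that quote can be written down as a toy placement rule. This is only a sketch of the observation as quoted (QoS 9 stays on the efficiency cores, anything higher may use all cores); it is not Apple's scheduler, and every name in it is hypothetical.

```python
from enum import Enum
from typing import FrozenSet


class CoreType(Enum):
    EFFICIENCY = "E"   # Icestorm cores in the quoted test
    PERFORMANCE = "P"


# Background QoS level named in the quote; higher levels ran from 17 to 33.
BACKGROUND_QOS = 9


def eligible_cores(qos: int) -> FrozenSet[CoreType]:
    """Return which core types a task of the given QoS may run on."""
    if qos <= BACKGROUND_QOS:
        # Background work is confined to E cores, even if P cores sit idle.
        return frozenset({CoreType.EFFICIENCY})
    # Everything else may be placed on any of the eight cores.
    return frozenset({CoreType.EFFICIENCY, CoreType.PERFORMANCE})


for level in (9, 17, 33):
    names = sorted(core.value for core in eligible_cores(level))
    print(f"QoS {level}: {names}")
```

The point of the rule is that the application, not the hardware, declares what is background work, which is the part others would have to copy.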
So much energy is put into designing an efficient package. And the result is a package that churns out even more energy than the designing took. Even the biggest air coolers will struggle to keep this cool enough under load. Don't even start about power price and carbon footprint. The worst part for Intel: they can't use this design to gain back some territory in the HEDT and server market. AMD can double the power output for an Epyc or TR. That is not an option for Intel. Let's wait for the tock.
Great article! I especially appreciate the AVX-512 related bits. Honestly it would be great if Intel recovers from this chaos and enables AVX-512 in their ADL-derived Xeons (E-xxxx).
Doesn't the efficiency of the product just go out the window because Intel is effectively setting the sustained performance power limit at 240W? Obviously this has an impact for consumers, as it's the power draw they will see, but for the purposes of analyzing the architecture or the process node, it doesn't seem like a great way to draw conclusions. There was a test floating around pre-NDA where the PL1 was fixed to 125/150/180/240W, and it looked like the last +60% power draw only gained +8% performance.
To be sure, I'm sure Intel did it because they need that last 8% to beat the 5950X semi-consistently, and most desktop users won't worry about it too much. But if you want to analyze how efficient the P-core architecture + Intel 7 is, wouldn't it make sense to lower the sustained power limit? It's just like clocking anything else at the top of its DVFS curve - I'm sure if it was possible to set Zen 3 to a 240W power limit, we would find it wasn't nearly as efficient as Zen 3 at 141W.
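To put a number on that last step of the power curve, here is a minimal sketch using only the two figures from the comment (+60% power for +8% performance). The function name is hypothetical and the result says nothing about other points on the curve.

```python
def relative_perf_per_watt(perf_gain: float, power_gain: float) -> float:
    """Perf/W at the higher limit, relative to perf/W at the lower limit."""
    return (1.0 + perf_gain) / (1.0 + power_gain)


# Figures from the comment: the last +60% of power bought +8% performance.
ratio = relative_perf_per_watt(perf_gain=0.08, power_gain=0.60)
# How many percent of extra power each extra percent of performance cost.
power_cost_per_perf_point = 0.60 / 0.08

print(f"perf/W at the higher limit: {ratio:.1%} of the lower limit")
print(f"power cost per +1% performance: +{power_cost_per_perf_point:.1f}%")
```

So at the top of the curve efficiency drops to roughly two thirds of what it was one step lower, which is why the sustained limit matters so much when judging the architecture rather than the product.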
I have to agree with Charlie Demerjian: the sane choices for desktop parts would have been 10 P-cores on one chip and 40 E-cores on another for an edge server part.
Then again for gaming clearly another set of P-cores could have been had from the iGPU, which even at only 32EU is just wasted silicon on a gaming desktop. Intel evidently doesn't mind nearly as much as AMD doing physically different dies so why not use that? (Distinct server SKUs come to mind)
Someone at Intel was clearly desperate enough to aim for some new USP, and they even sacrificed AVX-512 (or just plain sanity) for that.
Good to see that Intel was able to reap additional instructions from every clock in a P-core (that's the true miracle!).
But since I am still running quite a few Skylake and even Haswell CPUs (in fact typing on a Haswell Xeon server doing double duty as always-on workstation), I am quite convinced that 4GHz Skylake is "good enough" for a huge amount of compute time and would really rather use all that useless E-core silicon to replace my army of Pentium J5005 Atoms, which are quite far from delivering that type of computing power on perhaps more of an electricity budget.
I think most analyses (though not Charlie's) are missing Intel's primary concern here. For THESE parts, Intel is not especially interested in energy saving. That may come with the laptop parts, but for these parts the issue is that Intel wants OPTIONALITY: they want a single part that does well on single threaded (and the real-world equivalent now of 2-3 threaded once you have UI and OS stuff on separate threads) code, AND on high throughput (ie extremely threaded) code. In the past SMT has provided this optionality; now, ADL supposedly extends it -- if you can get 4*.5x throughput for a cluster vs 2*.7x throughput for SMT on a large core, you're area-ahead with a cluster -- but you still need large cores for large core jobs. That's the balance INTC is trying to straddle.
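The cluster-versus-SMT arithmetic in that paragraph works out as follows. This is a minimal sketch that assumes, as the comment implies, that a four-core E cluster and one SMT-enabled large core occupy about the same area; the 0.5x and 0.7x per-thread figures are the comment's own illustrative numbers.

```python
# Throughput in units of "one large core running one thread".
E_CORES_PER_CLUSTER = 4
E_CORE_RELATIVE_PERF = 0.5      # each E core, per the comment
SMT_THREADS = 2
SMT_THREAD_RELATIVE_PERF = 0.7  # each SMT thread on a large core, per the comment

cluster_throughput = E_CORES_PER_CLUSTER * E_CORE_RELATIVE_PERF
smt_throughput = SMT_THREADS * SMT_THREAD_RELATIVE_PERF
cluster_advantage = cluster_throughput / smt_throughput

print(f"E cluster: {cluster_throughput:.1f}x, SMT large core: {smt_throughput:.1f}x")
print(f"cluster advantage in the same area: {cluster_advantage:.2f}x")
```

With those inputs the cluster delivers 2.0x against 1.4x, but only for work that actually splits into four threads, which is why the large cores still have to be there.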
Now compare with the other two strategies.
- For AMD, with chiplets and a high-yielding process, performance/unit area has not been a pressing concern. They're able to deliver a large number of P cores, giving you both latency (use a few P cores) and throughput (use a lot of P cores). This works fine for server or desktop, but
+ it's not a *great* server solution. It uses too much area for server problems that are serious throughput problems (ie the area that ARM even in its weak Graviton2 and Ampere versions covers so well)
+ it does nothing for laptops. Presumably Intel's solution for laptops is not to drive the E-cores nearly as hard, which is not a great solution (compare to Apple's energy or power numbers) but better than nothing -- and certainly better than AMD.
- For Apple there is the obvious point that their P cores are great as latency cores (and "low enough" area if they feel it appropriate to add more) and their E cores great as energy cores. More interesting is the throughput question. Here I think, in true Apple fashion, they have zigged where others have zagged. Apple's solution to throughput is to go all-in on GPGPU!
For their use cases this is not at all a crazy idea. There's a reason GPGPU was considered the hot new thing a while ago, and of course nV are trying equally hard to push this idea. GPGPU works very well for most image-type stuff, for most AI/ML stuff, even for a reasonable fraction of physics/engineering problems. If you have decent tools (and Apple's combination of Xcode and Metal Compute seems to be decent enough -- if you don't go in determined to hate them because they're not what you're used to whether that's CUDA or OpenCL) then GPGPU works for a lot of code.
What GPGPU *doesn't work for* is server-type throughput; no-one's saying "I really need to port HHVM, or nginx, to a GPU and then, man, things will sing". But of course why should Apple care? They're not selling into that market (yet...)
So ultimately
- Intel are doing this because it gives them
+ better optionality on the desktop
+ lower (not great, but lower) energy usage in the laptop
+ MANUFACTURING optionality at the server. They can announce essentially a Xeon-P line with only P cores, and a Xeon-E line with ?4? cores (to run the OS) and 4x as many E cores as the same-sized Xeon-P, and compete a little better with the ARM servers in the extreme throughput space.
- AMD are behind conceptually. They have their one hammer of a chiplet with 16 cores on it, and it was a good hammer for a few years. But either they missed the value of also owning a small-area core, or just didn't have the resources. Either way they're behind right now.
- Apple remain in the best position -- both for their needs, but also to grow in any direction. They can keep going downward and sideways (watch, phone, glasses, airpods...) with E cores. They can maintain their current strengths with P cores. They can make (either for specialized macs, or for their own internal use) the equivalent of an M1 Max only with a sea of E cores rather than a sea of GPU that would be a very nice target for a lot of server work. And they likely have in mind, long-term, essentially democratizing CUDA, moving massive GPGPU off crazy expensive nV rigs existing at the department level down to something usable by any individual -- basically a replay of the VAX to SparcStation sort of transition.
I think INTC are kinda, in their massive dinosaur way, aware of this hence all the talking up of Xe, PV and OneAPI; but they just will not be able to execute rapidly enough. They have the advantage that Apple started years behind, but Apple moves at 3x their speed. nV are well aware of the issue, but without control of a CPU and a SoC, what can they do? They can keep trying to make CUDA and their GPU's better, but they're crippled by the incompetence of the CPU vendors. AMD? Oh poor AMD. Always in this horrible place where they do what they can do pretty well -- but simply do not have the resources to grow along all the directions they need to grow simultaneously. Just like their E-core response will be later than ideal (and probably a sub-optimal scramble), so too their response to this brave new world of extreme-throughput via extreme GPGPU...
According to the latest rumors from MLID (which have proved true most of the time), AMD's reply to E-cores is Zen 4D (D for Dense, not dimensions).
The Zen 4D core is a stripped-down Zen 4 core, with less cache and lower IPC, but smaller die area and lower power consumption, leading to a 16-core CCD.
Also, the Zen 4 core is expected to have a lot higher IPC than Alder Lake and Zen 3/3D, so it seems more than capable of competing with Raptor Lake (Intel's next architecture) this time next year.
AMD is not going to lose the performance crown in the next few years.
AMD don't *need* E cores, though? You said "it does nothing for laptops", but they're doing fine there - Zen 3 with 8 cores at 15W gives a great balance of ST and MT performance that Intel can't really touch (yet), and the die area is pretty good too. Intel need E cores to compete, AMD don't (yet).
Zen 4D is reportedly going to be their answer to "E" cores, and it's probably going to cause fewer issues than having fully heterogeneous cores.
I get the E/P core story in general, especially where a battery or “the edge” constrain power envelopes. I don’t quite get the benefit of a top-of-the-line desktop SoC serving as a demonstration piece for the concept.
What Intel seems to be aiming for is a modular approach where you can have “lots” or “apartments” of die area dedicated to GPU, P-cores and E-cores. Judging from the ADL die shots very roughly you can get 4 E-cores or 32 iGPU EUs from the silicon real-estate of a P-core (or 16EU + media engine). And with Intel dies using square process yields to go rectangular, you just get a variable number of lots per die variant.
Now, unfortunately, these lots can’t be reconfigured (like an FPGA) between these usage types, so you have to order and pick the combination that suits your use case and for the favorite Anandtech gamer desktop that comes out as 12 P-cores, zilch E-core, iGPU or media engine. I could see myself buying that, if I didn’t already have a 5950x sitting in that slot. I can also very much see myself buying a 2 P-core + 40 E-core variant at top bin for the same €600, using the very same silicon real-estate as the ADL i9.
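The "lots" bookkeeping above can be sketched in a few lines. The exchange rates are the rough die-shot estimates from the comment (one P-core is about four E-cores or about 32 iGPU EUs); everything else, including the function name, is hypothetical, and the media engine is left out.

```python
from fractions import Fraction

# Area of each block in units of one P-core "lot", per the comment's estimate.
LOT_COST = {
    "P": Fraction(1),       # one P-core
    "E": Fraction(1, 4),    # four E-cores fit in one lot
    "EU": Fraction(1, 32),  # 32 iGPU EUs fit in one lot
}


def lots_used(p_cores: int = 0, e_cores: int = 0, igpu_eus: int = 0) -> Fraction:
    """Total die area of a configuration, measured in P-core lots."""
    return (p_cores * LOT_COST["P"]
            + e_cores * LOT_COST["E"]
            + igpu_eus * LOT_COST["EU"])


gamer_variant = lots_used(p_cores=12)
throughput_variant = lots_used(p_cores=2, e_cores=40)
shipping_i9 = lots_used(p_cores=8, e_cores=8, igpu_eus=32)

print(f"12P: {gamer_variant} lots")
print(f"2P+40E: {throughput_variant} lots")
print(f"8P+8E+32EU: {shipping_i9} lots")
```

Both wished-for variants come to 12 lots, while the shipping 8P+8E+32EU layout comes to 11 by these ratios, so "the very same silicon real-estate" should be read as approximate.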
Intel could (and should) enable this type of modularity in manufacturing. With a bit of tuning to their fabs they should be able to mix lot allocations across the individual dies in a wafer. The wafer as such obviously needs to be set up for a specific lot size, but what you use to fill the lots is a choice of masks.
You then go into Apple’s silicon and I see Apple trying their best to fill a specific niche with three major options, the M1, the M1Pro and the M1Max. None of these cater to gamers or cloud service providers. When it comes to HPC, max ML training performance or most efficient ML inference, their design neither targets nor serves those workloads. Apple targets media people working on the move. For that audience I can see them offer an optimal fit, that no x86-dGPU combo can match (on battery). I admire their smart choice of using 8 32-bit DRAM channels to obtain 400GB/s of bandwidth on DDR4 economy in the M1Max, but it’s not generalizable across their niche workloads. Their ARM core seems an amazing piece of kit, but when the very people who designed it thought to branch out into edge servers, they were nabbed to do better mobile cores instead… The RISC-V people will tell you about the usefulness of a general-purpose CPU focus.
I’ve been waiting for an Excel refactoring ever since Kaveri introduced heterogeneous computing, with function-level GPU/CPU paradigm changes and pointer compatibility. I stopped holding my breath for this end-user GPGPU processing.
On the machine learning side, neuromorphic hardware and wafer-scale whatnot will dominate training, for inference dedicated IP “lots” also seem the better solution. I simply don’t see Apple having any answer or relevance outside their niche, especially since they won’t sell their silicon to the open market.
I’m pretty sure you could build a nice gaming console from M1Max silicon, if they offered it at X-Box prices, but GPT-4++ won’t be trained on Apple silicon and inference needs to run on mobile, 5 Watts max and only for milliseconds.
Well..I mean... it's almost winter so... having problems with central heating? or thought: I'd warm up the garage? There is your answer (has been for a while) ... all you need to do - just give it a full load.
Just decided to let Microsoft's latest Flight Simulator update itself overnight. I mean it's a Steam title, right? But it won't use Steam mechanisms for the updates, which would obviously make it much faster and more efficient, so instead you have to run the updates in-game and they just a) linearize everything on a single core, b) thus take forever even if you have a fast line, c) leave the GPU running full throttle on the main menu, eating 220 Watts on my 2080ti for a static display... (if there ever has been an overhyped and underperforming title the last decade or two, that one gets my nomination every time I want to fly instead of updating).
Small wonder I woke up with a dry throat this morning and the machine felt like a comfy coal oven from way back then, when there was no such thing as central heating and that's how you got through winter (no spring chicken, me).
Oh no! Set carry flag and clear carry flag are gone! How will we manage without them?
Having to deal with the loss of my favorite instruction (AAA, which always stood out first) was already hard, but there it was easier to accept that BCD (~CoBoL style large number arithmetic) wasn't going to remain popular on a 64-bit CPU with the ability to do some type of integer arithmetic with 512-bit vectors.
But this looks like the flag register, a dear foe of RISC types, who loved killing it, is under attack!
It's heartening to see that not all CISC is lost since they are improving the x86 gene pool by improving wonderful things like REP cmpsb and REP scasb, which I remember hand-coding via ASM statements on Turbo-C for my 80286: Ah, the things microcode could do!
It even inspired this kid from Finland to think of writing a Unix-like kernel, because a task switch could be coded by doing a jump on a task state segment... He soon learned that while the instruction was short on paper, the three page variant from BSD and L4 guys was much faster than Intel's microcode!
He learned to accept others can actually code better than him and that lesson may have been fundamental to the fate of everything Linux has effected.
Great initial review! So many good nuggets in here. Love that AVX512 bit as well as breaking down the cores individually. Alder Lake mobile is going to be awesome! DDR5 upgrade sure is tempting, showing tangible benefit, most applications neutral; something to chew on while I figure out if I am going to get an i5 or i9 to replace my i9-10850k.
Request: Will anyone please do a review of iGPU Rocket Lake vs Alder Lake? I know it is "identical design" (Intel Xe 32EU) but different node: 14nm vs 10nm/7 (Still not a fan of the rename...) UHD 770 vs UHD 750 showdown! Biggest benefit to Intel right now during these still inflated GPU times.
Eventually someone might actually do that. But don't expect any real difference, nor attractive performance. There have been extensive tests on the 96EU variant in Tiger Lake and 32EU are going to be 1/3 of that rather modest gaming performance.
The power allocation to the iGPUs tends to be "prioritized", but very flat after that. So, if both iGPU and CPU performance are requested, the iGPU tends to win, but then very quickly stops benefiting after it has received its full share, while the CPU cores fight for everything left in the TDP pie.
That's why iGPU performance say on a Tiger Lake H 45Watt SoC won't be noticeably better than on a 15Watt Tiger Lake U.
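A toy model of that "iGPU first, then flat" allocation, purely to illustrate the argument. The 12W iGPU ceiling is an invented number chosen for the example, not a measured or documented figure, and the function is hypothetical.

```python
from typing import Tuple

# Hypothetical ceiling beyond which the iGPU gains nothing (illustrative only).
ASSUMED_IGPU_CAP_W = 12.0


def split_package_power(tdp_w: float, igpu_demand_w: float,
                        igpu_cap_w: float = ASSUMED_IGPU_CAP_W) -> Tuple[float, float]:
    """Give the iGPU its share first (up to its cap), leave the rest to the CPU."""
    igpu_w = min(igpu_demand_w, igpu_cap_w, tdp_w)
    cpu_w = tdp_w - igpu_w
    return igpu_w, cpu_w


for tdp in (15.0, 45.0):
    igpu_w, cpu_w = split_package_power(tdp_w=tdp, igpu_demand_w=20.0)
    print(f"{tdp:.0f} W package: iGPU {igpu_w:.0f} W, CPU cores {cpu_w:.0f} W")
```

In this model the iGPU receives the same budget in both packages and only the CPU cores see the extra headroom, which is the shape of the behaviour described above.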
The eeny-meeny 32EU Xe will double your UHD 650-750 performance and reach perhaps the equivalent of an Iris 550-650 with 48EU and eDRAM, but that's it.
I'd have put another two P-cores in that slice of silicon...
Not too impressive for a next gen CPU on a supposedly better process node and a newly designed architecture. TechSpot's ten game average for 1080p high quality only put the i9-12900K ahead of the R9 5950X by just 2.6% (7.4% for 1% lows). If AMD's claim of V-cache adding 15% to average gaming results is true, that would give AMD an average lead of 12.4% (7.6% for 1% lows) for the same year-old design with V-cache and still using last generation DDR4 memory - now that is what I would call impressive.
If you have to have top-line FPS and 1% lows, it seems that it would be prudent to just wait a little while longer for what AMD is currently cooking in its ovens.
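The projected lead above is obtained by subtracting percentages; strictly speaking, relative leads compose as ratios. A minimal sketch of the ratio version follows, assuming (as the comment does) that the claimed 15% V-cache uplift applies equally to averages and 1% lows. The function name is hypothetical.

```python
def projected_lead_pct(rival_lead_pct: float, uplift_pct: float) -> float:
    """Lead (in %) after an uplift, when the rival currently leads by rival_lead_pct."""
    rival = 1.0 + rival_lead_pct / 100.0   # rival performance, baseline = 1.0
    boosted = 1.0 + uplift_pct / 100.0     # baseline after the uplift
    return (boosted / rival - 1.0) * 100.0


# TechSpot figures quoted in the comment: 2.6% (average) and 7.4% (1% lows).
avg_lead = projected_lead_pct(rival_lead_pct=2.6, uplift_pct=15.0)
lows_lead = projected_lead_pct(rival_lead_pct=7.4, uplift_pct=15.0)

print(f"average FPS lead: {avg_lead:.1f}%")
print(f"1% lows lead: {lows_lead:.1f}%")
```

The ratio form gives about 12.1% and 7.1% rather than 12.4% and 7.6%; the difference is small and does not change the argument.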
"supposedly better process node" I mean, considering 12900K is nearly doubling 11900K's multicore performance while consuming 20% less power, I'd say Intel 7 is certainly a large perf/watt improvement over Intel 14nm
But also, efficiency comes from more than just process node. Core architecture plays a role.
I agree that the performance increase over the 11900K is great but the real competition is not Intel's previous generation but AMD's current offerings. eddmann said in an earlier post about i9-12900K's PL1 performance: You can have ~92% of the performance for ~68% of the power - see link below.
In essence, for the benchmark in question, if Intel had set the max PL1 limit to just 150W, that processor would have amazing power efficiency. But guess what, Intel needed that 8% of additional performance to give the i9 daylight in the benchmarks, so roughly 47% more power (the step from ~68% back to 100%) was needed. It would be interesting to see if Intel decides to offer E variants of its top-line CPUs as they would prove to be super efficient.
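The trade-off quoted above (~92% of the performance for ~68% of the power) can be worked through directly. This is a minimal sketch using only those two figures; the variable names are illustrative.

```python
# Figures quoted above, relative to the stock power limit (= 1.0).
CAPPED_PERF = 0.92
CAPPED_POWER = 0.68

# Efficiency of the capped setting relative to stock.
efficiency_gain = CAPPED_PERF / CAPPED_POWER

# Going the other way: what stock costs and gains relative to the capped setting.
extra_power = 1.0 / CAPPED_POWER - 1.0
extra_perf = 1.0 / CAPPED_PERF - 1.0

print(f"capped perf/W vs stock: {efficiency_gain:.2f}x")
print(f"stock vs capped: +{extra_power:.0%} power for +{extra_perf:.1%} performance")
```

Seen from the capped setting, the stock limit spends about 47% more power to buy just under 9% more performance, and the capped chip is roughly 1.35x as efficient.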
Of the Alder Lake CPUs launched, it would seem that the i5-12600K hits AMD the hardest; it will be interesting to see how AMD responds.
Expect that to come in Q2 next year, where we will also be updating to NVMe testing
You know, I didn't even notice your testbeds all still had SATA drives in them. I just assumed they'd be using whatever was contemporary for the system. This does make me wonder how often a modern benchmark has results that aren't quite what they'll be for users who actually use them. Guess we'll find out in Q2!
Thank you for the in-depth review. If I may give a suggestion: investigate in more depth the average power consumption (or energy consumption) in some common tasks, like gaming, etc. There are reviews out there that found that the i9-12900 is significantly more frugal than a Ryzen 9 5950X, exactly the opposite of the impression one would get by reading this review, which focuses on the vastly less insightful peak power metric (you only need this figure to gauge which PSU to buy).
My big question is this... Will waiting 1 more year for the 13th Gen bring anything else I need? Presently I am sitting on an i7-3770, and I can do everything on it I want to, other than Windows. Sadly, for my job I do need to upgrade to 11. I can run the beta for another year, or just dip into this unit. Thoughts?
If all you need is Windows 11 I would just go and buy something like an Intel 11400 or whichever Zen 2 or Zen 3 CPU is available for a competitive price.
Intel 13th Gen will probably iron out all the irks: maybe change AVX512 to be supported by default; probably DDR5-4800 with 4 memory banks instead of 2.
And 14th Gen will probably change to a new manufacturing node.
Great review, thank you very much! If you can do it, I would really like to see those P and E cores in a performance/joules graph together with data points from other relevant CPU cores (say also M1 Max).
Quick correction on your second page: "There was a thought that if Intel were to release a version of Alder Lake with P-cores only, or if a system had all the P-cores disabled, there might be an option to have AVX-512." I guess the E-cores should be disabled for AVX to function?
P vs. E, iGPU vs. cores, there is only one conclusion: I want choice!
While the ability to press more instructions out of a clock and higher clocks out of a slab of silicon is impressive, little impresses, given the cost of those gains, as much as the performance of the E-cores vs. their silicon real-estate.
A 25-50% performance disadvantage at 25% transistor budget just shows that it's time to refactor your code, turn loops into threads. That won't ever change again, except perhaps with quantum code, so better get started last millennium.
I'd just want to be able to order a bespoke chip, where I can allocate silicon slices and TDP to E-cores, P-cores and iGPU EUs as per use-case on a given die, paying for the number of transistors and a binning premium, not some Intel market segmentation tax.
Intel doesn't seem to have issues manufacturing the various chip sizes and in fact, switching tile allocations for a die even on a single wafer shouldn't be an issue, once you've upgraded the software: The die size wouldn't change, only the allocation to P/E/GPU which means different masks, but those I guess need to be managed separately to account for mask defects/cleaning anyway.
> it's time to refactor your code, turn loops into threads.
We still need lower-overhead hardware support, for this to be viable. Also, the OS should be aware of thread groups that should be scheduled in tandem. Otherwise, your thread communication overhead is going to kill any benefit you gain by this, other than for the sorts of heavy computations that people are already multi-threading.
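To make the overhead point concrete, here is a minimal sketch of "turning a loop into threads" where the work is handed out in chunks, so the per-task scheduling cost is paid once per chunk rather than once per element. It uses Python's standard `concurrent.futures`; the workload and chunk size are arbitrary illustrations, and in CPython the GIL means this particular CPU-bound example shows the structure rather than a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List, Sequence


def chunked(items: Sequence[int], chunk_size: int) -> Iterator[Sequence[int]]:
    """Split the loop range into coarse pieces to amortize per-task overhead."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def sum_of_squares(chunk: Sequence[int]) -> int:
    """The original loop body, applied to one chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items: Sequence[int], workers: int = 4,
                            chunk_size: int = 10_000) -> int:
    """Run the chunks on a thread pool and combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, chunked(items, chunk_size)))


data: List[int] = list(range(100_000))
total = parallel_sum_of_squares(data)
print(total == sum(x * x for x in data))
```

Shrinking `chunk_size` towards 1 is the "one thread per iteration" extreme, where hand-off and communication cost swamp the useful work; that is the regime the reply above is warning about.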
Very fine-tuned review and a good one, but I think you guys should have compared all the Intel CPUs from SKL to ADL in the SPEC scores and the ST performance, and added the E cores to the mix; that would have been a solid opportunity to see how Intel really scaled over all these years.
Also your gaming suite got some small face lift with 2080Ti which shows how the CPUs scale. Looking at the performance and gaming keeping aside the Synthetics. The 12th gen is an improvement but it's not a revolutionary design that changes a lot. It does in terms of Z690 chipset and DDR5 era kickstart but not in terms of relative performance vs Intel 10th gen and Ryzen 5000 series processors. IPC might be double digit but overall picture is not double digit to me. It's a single digit in Gaming, maybe in productivity it's a bit higher over 10th gen but not a massive substantial amount.
The worst is the nightmare of the Win10/11 Scheduler BS, and AMD processors got massive L3 tanking problems. And this is a bad showcase for AMD to be honest. Also Win11 is anti PC, anti DIY, anti everything: it has VBS issues, L3 cache AMD problems, Secure Boot and TPM soft locks, and the whole UX is centered around mobile rather than power user desktop compute. Better stick with Windows 10, but unfortunately we do not know how ADL will keep up on that OS if Intel issues any microcode updates to the Intel Thread Director.
Then the HW adoption rate: people buying these are going to be guinea pigs. I would avoid them personally and go with 10th gen (Ryzen has WHEA, USB and other bugs, I'm not going there; if B2 and 3D fix them it's great, else nope), wait 2-3 years for DDR5 and PCIe 5.0 to become relevant, and then buy another rig in 2024-2025 that will net a solid boost in I/O, rather than this generation-one product whose future we do not know. Intel releases every CPU with 2 chipset options if we follow the Tick Tock, where the Tock also gets a new chipset; this has been the case for a long time, and we do not know what Raptor Lake will do on DDR4. Many say DDR5 is not worth it, but if you get DDR4 now, Z790, which will inevitably launch because Intel loves money, will make that a huge dead end.
Finally the temperatures and power: why do I not see the temperature graphs anywhere? Did you guys miss that or what? The temperature is very HOT for the 12900K; almost all of them recommend a 360mm AIO cooler minimum, and the OC ceiling is gone as these run pretty hot even with that MSI 360mm AIO (ofc gaming is low, it has been like that since 7th gen). Looking at your power consumption piece, the P cores and E cores share the same power plane; this is what I wanted to see, and also the cache due to the new ring bus. So that also confirms some OCN coverage: these E cores bog down the P cores' clock speeds. But running without them = lost performance.
Overall a very complex, messy design. I feel Intel did it because of the BGA trashware market, where a ton of money is, and since the ST performance, despite the huge IMC downgrade and latency, is fast for gaming and at parity with AMD's full-fat Zen 3, and AMD needs time for Zen 4 on TSMC N5 EUV plus even more time for their BGA Zen 4. Intel also doesn't want to add more heat since 10nm ESF is already capped at this core design; 48W is the max for those small cores, so there's no way they can fit a 2C4T Golden Cove clocked at 5GHz on this node. And they are going to stick with this for at least 2-3 generations, I guess. Sad. HEDT will not have this kind of rubbish, at least.
I forgot the most important thing. After going through everything. Intel sabotaged LGA1200 on purpose. Looking at the Cinebench R20 score of the 12900K vs 10900K It's clear cut that 2.5K lead of the 10K marks is coming from the SKL cores you showed. And the P cores are 8K, up from 6K of 10th gen 10900K. They sandbagged the socket with 11th gen on 14nm+++ instead of 10nm and gave a hot, power hungry poor IMC poor SMT processor. Now they show this as a massive boost because it works wonders in charts looking at the big bars of the 12th gen over 10th gen.
Intel is real damn scum. Now they will milk DDR5, OEM deals and all that PCIe 5.0 etc. BS BGA trash; since ST performance is so high they will easily get all those laptop and use-and-throw machine sales. And nobody knows how long LGA1700 will even last. Maybe up to the successor of Raptor Lake. But going DDR4 is going to bite people in the nuts once Raptor Lake launches; I bet they will launch Z790 with only DDR5 and more DMI or something.
I hope AMD Zen 4 wages a storm on these pesky BS Small cores and give a full powerful big beast with longevity on AM5 LGA. And ofc they will fix the WHEA and USB things because they know now with experience.
I'm not too optimistic that W11 was actually designed to handle optimized/efficiency cores, because W11 was designed before Intel released that beast. Unfortunately that probably means that Windows will continue to be an "every other release is good" OS. W11.1 or W12 (whatever they call it) will be the best continuance of XP -> 7 -> 10 -> 11.1/12.
> W11 was designed before Intel released that beast.
The companies do collaborate, extensively. During their Architecture Day, there was discussion of this. And Windows has a much longer history of hybrid CPU scheduling, since they also support ARM CPUs like Qualcomm's 8cx and even a previous hybrid CPU from Intel (Lakefield).
Also, Windows is not a static target. MS is continually releasing updates, some of which are sure to fine-tune scheduling even more.
[Intel 12th gen consumes less power in gaming across the board vs Ryzen 5000 series](https://www.reddit.com/r/intel/comments/qmw9fl/why... [Even the Multi threaded perf per watt is also better for 12900K compared to 5900X](https://twitter.com/capframex/status/1456244849477... It is only in specific cases, where the 12900K needs to beat the 5950X in multi-threaded loads, that it has to crank up the power. But for typical users Intel is both the perf/watt and perf/dollar champion.
Though the review did not exceed my expectations, due to the lack of in-depth power consumption testing, the charts show me what is really going on. Intel has roughly matched AMD in CPU performance, then gained some more with DDR5. Some CPU and gaming benchmarks show that they are limited by memory performance. Now AMD's V-Cache makes even more sense, and a 15% average uplift in games is more plausible. AMD still has the edge (due to cost, cooling requirements, power consumption), while Intel gains more value if one is going to use the iGPU.
Personally I don't find all this impressive. Intel went from -5 percent in gaming to +5 percent in gaming. Then went from -10 percent in productivity to beating the 5900x but still losing to the 5950x. Power consumption is terrible. Honestly all AMD has to do is drop their 5600X from $400 to $300 CAD, their 5800X from $500 to $400, their 5900X from $650 CAD to $500, and the 5950x from $900 CAD to $700 and I wouldn't even consider Intel.
Power consumption is a curve, bro. Unlock PBO and the 5950X can also eat 300+W. OMG, "Power consumption is terrible." With the P-cores set to 4.4GHz, where power consumption is less than 120W, its Cinebench score is 1730 single and 25000 multi, way better than the 5900X (1500/21000). I can give it 1000W; can the 5900X hit 2000/27000? If you set the voltage offset manually, 6 P-cores @ 3.2GHz + E-cores @ 2.4GHz draw 35W. At that point, its Cinebench score is 1300 single and 14000 multi, which is comparable to a 30W M1 Max at 1550/12000. Not to mention that TSMC N5P is 30% better than Intel 10nm ESF.
The science doesn't always align with practicality. The crux of it, really: it's unlikely you'll be able to cool a 12900K with even a top-end air cooler rather than an expensive and elaborate AIO setup (maybe the best of Noctua / DeepCool will suffice?), but for the Ryzen 5000 series a good-quality air cooler will do.
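For anyone who wants to compare those operating points directly, here is a rough Python sketch that turns the quoted Cinebench multi-thread scores into points per watt. It uses only the figures from the comment above, which are that commenter's claims and not measurements from the review; the dictionary layout and the `points_per_watt` helper are just illustrative names.

```python
# Cinebench multi-thread points per watt, using only the figures quoted above.
# These are the commenter's numbers, not measurements from the review.
configs = {
    # name: (multi-thread score, package watts)
    "12900K, P-cores @ 4.4 GHz": (25000, 120),  # quoted as "less than 120W", so 120 is an upper bound
    "12900K, 6P @ 3.2 GHz + E @ 2.4 GHz": (14000, 35),
    "M1 Max": (12000, 30),
}

def points_per_watt(score, watts):
    """Simple efficiency metric: benchmark points per watt of package power."""
    return score / watts

efficiency = {name: points_per_watt(s, w) for name, (s, w) in configs.items()}

for name, ppw in efficiency.items():
    print(f"{name}: {ppw:.0f} points/W")
```

If those numbers hold, the down-clocked 12900K and the M1 Max land at about the same points per watt, while the 4.4GHz setting gives up roughly half of that efficiency for the extra throughput. That is the "curve" being described.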
Did you actually read this review? PBO does not do what you said, and anyways it is already faster with PBO off. Then you just reduce the entire review to cinebench... maybe try reading more of it?
Oh, man, honestly, "already faster with PBO off"? Did you actually read this review? Anyway, if you watch some more reviews/videos you will find the 12900K produces higher fps with way less power consumption (like 80W vs 120W or 100W vs 140W). Now the fact is the 5950X is $739 from Newegg while the 12900KF is only $589. Don't tell me Intel motherboards are expensive; do you really pair a 5950X with a cheap motherboard?
Quick question: what if Intel were to replace the E cores with all P cores? The math:
P cores would have around a 60% IPC increase (15% on 14++++, 19% Ice Lake, 19% Alder Lake), plus a 1GHz frequency increase, which would result in 2 P cores equaling 4.03 Skylake cores. And as we see in benchmarks, 8 E cores give slightly more performance than 4 Skylake cores. That poses a big question: why not just do 10 P CORES on DESKTOP and remove all the complications this introduces?
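The arithmetic in that estimate can be sketched in a few lines of Python. The three IPC gains are the ones quoted in the comment; the 4.0GHz Skylake baseline clock is purely my assumption (the post does not give one), so the result lands near the quoted 4.03 figure without matching it exactly.

```python
# Compound the per-generation IPC gains quoted above, then apply a clock uplift.
ipc_gains = [0.15, 0.19, 0.19]  # quoted: 14nm refreshes, Ice Lake, Alder Lake

ipc_ratio = 1.0
for gain in ipc_gains:
    ipc_ratio *= 1.0 + gain     # generational gains multiply, they do not add

base_ghz = 4.0                  # ASSUMPTION: Skylake-era clock, not stated in the post
boost_ghz = base_ghz + 1.0      # the "+1GHz" from the post

# Per-core throughput relative to one Skylake core: IPC ratio times clock ratio.
per_core = ipc_ratio * boost_ghz / base_ghz

print(f"IPC ratio vs Skylake: {ipc_ratio:.2f}")
print(f"One P-core  ~ {per_core:.2f} Skylake cores")
print(f"Two P-cores ~ {2 * per_core:.2f} Skylake cores")
```

Changing `base_ghz` moves the result: a higher baseline clock makes the same +1GHz worth less, which is probably where the small difference from 4.03 comes from.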
The reason to keep P-cores is simple. Single-thread performance still matters.
In the server market is where things could get interesting. Intel already has Atom CPUs with >= 16 cores, aimed at mostly embedded server applications that prioritize power efficiency (e.g. 5G basestations). I wonder if we'll see them expand their E-core server CPUs to encompass more traditional server markets. Future E-cores could offer their best chance at countering the perf/W advantage of ARM.
The reviewers didn't ignore anything, including the fact that when you're not pushing the CPU to 100% on all cores, the power draw is much more reasonable. For gaming, it's very similar to the equivalent Ryzen CPUs.
All the reviews I've seen have mentioned the insane power draw for the rendering benchmarks, but they would have trashed the CPUs if it had been the same for all workloads.
Also, the Total Cost of Ownership is about $1800 without the GPU, which is NOT cheaper than going AM4 with a 5950X.
Don't fool yourself: you need a great AIO liquid cooler, a great power supply, DDR5 and really good VRMs on your motherboard to achieve this... otherwise, the 12900K will throttle.
Not sure if you know what TCO is. It includes electricity and some $ allotment for software support of the hardware. Most end users don't directly deal with the latter (how many $ is a compatibility headache?) and don't own a watt-meter to calculate the former.
That said, what's wrong with reusing DDR4, adapting an AM4 cooler with an LGA 1700 bracket, and reusing the typically oversized PSU? AT shows DDR4 lagging, but most DDR4 out there is way faster than 3200 CL20. That's why other review sites say DDR4 wins most benches. No reasonable user buys a 12900k or 12600k to pair with JEDEC RAM timings!
Really the only cost differentiator is the CPU + mobo. ADL and Zen3 are on very similar process nodes. One is not markedly more efficient than the other unless the throttle is pushed very differently, or in edge cases where their architectural differences matter.
Intel always gives you 2 CPU generations on any new socket. The only exception to that we've seen was Haswell, due to their cancellation of desktop Broadwell.
And besides, they had to change the socket for PCIe 5.0 and DDR5, if not also other reasons.
This wasn't like the "fake" change between Kaby Lake and Coffee Lake (IIRC, some industrial mobo maker actually produced a board that could support CPUs all the way from Skylake to Coffee Lake R).
I think we will. They devoted a whole page to it, and benchmarked it anyway.
And Ian's 3DPM benchmark is still a black box. No one knows precisely what it measures, or that it casts AVX2 performance in a fair light. I will call for him to opensource it, for as long as he continues using it.
For those proclaiming the funeral of AMD, remember, this is like a board game. Intel is now ahead. When AMD moves, they'll be ahead. Ad infinitum.
As for Intel, well done. Golden Cove is solid, and I'm glad to see a return to form. I expected Alder Lake to be a disaster, but this was well executed. Glad, too, it's beating AMD. For the sake of pricing and IPC, we need the two to give each other a good hiding, alternately. Next, may Zen 3+ and 4 send Alder and Raptor back to the dinosaur age! And may Intel then thrash Zen 4 with whatever they're baking in their labs! Perhaps this is the sort of tick-tock we need.
Let's see how it performs within the same power envelope as AMD. That'll tell us if they're truly ahead, or if they're still dependent on burning more Watts for their performance lead.
Oh yes. Under the true metric of performance per watt, AMD is well ahead, Alder Lake taking something to the effect of 60-95% more power to achieve 10-20% more performance than Ryzen. And under that light, one would argue it's not a success. Still, seeing all those blue bars, I give it the credit and feel it won the day.
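To put a number on that, here is a small Python sketch of relative performance per watt, taking the 60-95% power and 10-20% performance ranges above at face value. The function name is just for illustration, and the two cases pair the extremes of the ranges, which is an assumption about how they combine.

```python
def relative_perf_per_watt(perf_gain, power_gain):
    """Perf/W of chip A relative to chip B, given A's fractional gains over B."""
    return (1.0 + perf_gain) / (1.0 + power_gain)

# Pair the extremes of the quoted ranges (assumed, the comment gives ranges only).
worst_case = relative_perf_per_watt(0.10, 0.95)  # +10% perf for +95% power
best_case = relative_perf_per_watt(0.20, 0.60)   # +20% perf for +60% power

print(f"Alder Lake perf/W relative to Ryzen: {worst_case:.2f}x to {best_case:.2f}x")
```

So under those figures Alder Lake would deliver somewhere between a bit over half and three quarters of Ryzen's performance per watt at full load.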
Unfortunately for Intel and fans, this is not the end of AMD. Indeed, Lisa Su and her team should celebrate: Golden Cove shows just how efficient Zen is. A CPU with 4-wide decode and 256-entry ROB, among other things, is on the heels of one with 6-wide decode and a 512-entry ROB. That ought to be troubling to Pat and the team. Really, the only way I think they can remedy this is by designing a new core from scratch or scaling Gracemont to target Zen 4 or 5.
To be expected: the fabric on the 5950X is a big consumer, and it's lit up all the time even when only a few cores have work, which makes its performance/power ratio worse under low load.
How exciting that one can pay a lot for a powerful CPU in order to celebrate how it performs doing the tasks a far cheaper CPU would be more suitable for.
This is really droll, this new marketing angle for Alder Lake.
They don't have to redesign Golden Cove. On lightly threaded stuff the 6-wide core is clearly ahead. That's a big plus for many consumers over Zen 3. The smaller competing core is expectedly more efficient and easier to pack for multicore but doesn't have the oomph. That Intel can pack both bigger snappy cores and smaller efficient cores is what should keep Su wide awake.
Notice the ease in manufacturing, too. ADL is a simple monolithic slab. Ryzen is using two CCDs and one IOD on interposer. That's one reason Zen3 was in short supply a good 6-8 months after release. It wasn't because TSMC had limited capacity for 88mm2 chips on N7. Intel can spam the market with ADL, the main limit being factory yields of the 208 mm2 chip on Intel 7.
> On lightly threaded stuff the 6-wide core is clearly ahead.
Why do people keep calling it 6-wide? It's not. The decoder is 3 + 3. It can't decode 6 instructions per cycle from the same branch target.
From the article covering the Architecture Day presentation:
"the allocation stage feeding into the reservation stations can only process five instructions per cycle. On the return path, each core can retire eight instructions per cycle."
> That's one reason Zen3 was in short supply a good 6-8 months after release. > It wasn't because TSMC had limited capacity for 88mm2 chips on N7.
Source?
Shortage of Ryzens was due *in part* to the fact that Epyc and Threadrippers draw from the same chiplet supply as the non-APU desktop chips. And if you tried to buy a Milan Epyc, you'd know those were even harder to find than desktop Ryzen 5000's.
AMD seems to be moving towards a monolithic approach, in Zen 4. Reportedly, all of their desktop CPUs will then be APUs.
Oh we can definitely agree to disagree on how wide to call Golden Cove, but it's objectively bigger than Zen 3 and performs like a bigger core on just about every lightly threaded benchmark.
The theory that TSMC was simply running that short on Zen3 CCDs never made much sense to me. Covid didn't stop any of TSMC's fabs, almost all of which run fully automated. For over a year they'd been churning out Zen2's on N7 for desktop/laptop and then server, so yields on 85 mm2 of the newer Zen3 on the same N7 should have been fantastic, and they weren't going to server, not till much more recently.
But Covid impacts on the other fabs that make IOD/interposer, and the technical packaging steps, and transporting the various parts in time? Far, far more likely.
> Oh we can definitely agree to disagree on how wide to call Golden Cove
Sorry, I thought you were talking about Gracemont. The arch day article indeed says it's 6-wide and not much else about the decode stage.
> Covid didn't stop any of TSMC's fabs, almost all of which run fully automated.
It triggered a demand spike, as all the kids and many office workers needed computers for school/work from home. Plus, people needing to do more recreation at home seems to have triggered an increased demand for gaming PCs.
It's well known that TSMC is way backlogged. So, it's not as if AMD could simply order up more wafers to address the demand spikes.
> they weren't going to server, not till much more recently.
Not true. We know Intel and AMD ship CPUs to special customers, even before their public release. By the time Ice Lake SP launched, Intel reported having already shipped a couple hundred thousand of them. Also, AMD needs to build up inventory before they can do a public release. So, the chiplet supply will be getting tapped for server CPUs long before the public launch date.
> the only way I think they can remedy this is by designing a new core from scratch
I'm not sure I buy this narrative. In the interview with AMD's Mike Clark, he said AMD takes a fresh view of each new generation of Zen and then only reuses what old parts still fit. As Intel is much bigger and better-resourced, I don't see why their approach would fundamentally differ.
> or scaling Gracemont to target Zen 4 or 5.
I don't understand this. The E-cores are efficiency-oriented (and also minimize area, I'd expect). If you tried to optimize them for performance, they'd just end up looking & behaving like the P-cores.
I stand by my view that designing a CPU from scratch will bring benefit, while setting them back temporarily. Of course, I'm no expert, but it's reasonable to guess that, no matter how much they change things, they're still being restricted by choices made in the Pentium Pro era. In the large, sweeping points of the design, it's similar, and that is exerting an effect. Start from scratch, and when you reach Golden Cove IPC, it'll be at lower power I think. Had AMD gone on with K10, I do not doubt it would never have achieved Zen's perf/watt. Sometimes it's best to demolish the edifice and raise it again, not going to the opposite extreme of a radical departure.
As for the E-cores, if I'm not mistaken, they're at greater perf/watt than Skylake, reaching the same IPC more frugally. If that's the case, why not scale it up a bit more, and by the time it reaches GC/Zen 3 IPC, it may well end up doing so with less power. Remember the Pentium M.
What I'm trying to say is, you've got a destination: IPC. These three architectures are taking different routes of power and area to get there. GC has taken a road with heavy toll fees. Zen 3, much cheaper. Gracemont appears to be on an even more economical road. The toll, even on this path, will go up but it'll still be lower than GC's. Zen, in general, is proof of that, surpassing Intel's IPC at a lower point of power.
Anyhow, this is just a generic comment by a layman who's got a passion for these things, and doesn't mean to talk as if he knows better than the engineers who built it.
It's not trivial to design a core from scratch without defining an instruction set from scratch, i.e., breaking all backward compatibility. x86 has a tremendous amount of legacy. ARM has quite a bit as well, and growing each year.
Can they redo Golden Cove or Gracemont for more efficiency at same perf/more perf at same efficiency? Absolutely, nothing is perfect and there's no defined tradeoff between performance and efficiency that constitutes perfect. But simply enlarging Gracemont to near Golden Cove IPC (a la Pentium M to Conroe) is not it. By doing so you gradually sacrifice the efficiency advantage in Gracemont, and might get something worse than Golden Cove if not optimized well.
The big.LITTLE concept has proven advantages in mobile and definitely has merit with tweaks/support on desktop/server. What you may be missing is that Golden Cove isn't an inherently inefficient core like Prescott (P4) or Bulldozer. It's just sometimes driven at high turbo/high power, making it look inefficient when that's really more a process capability than a liability.
Putting together a new core doesn't necessarily mean a new ISA. It could still be x86.
Certainly, Golden Cove isn't of Prescott's or Bulldozer's nature and the deplorable efficiency that results from that; but I think it's pretty clear that it's below Zen 3's perf/watt. Now, Gracemont is seemingly of Zen's calibre but at an earlier point of its history. So, if they were to scale this up slowly, while scrupulously maintaining its Atom philosophy, it would reach Zen 3 at similar or less power. (If that statement seems laughable, remember that Skylake > Zen 1, and Gracemont is roughly equal to Skylake.) Zen 3 is right on Golden Cove's tail. So why couldn't Gracemont's descendant reach this class using less power? Its design is sufficiently different from Core to suggest this isn't entirely fantasy.
And the fashionable big/little does have advantages; but question is, do those outweigh the added complexity? I would venture to say, no.
> they're still being restricted by choices made in the Pentium Pro era.
No way. There's no possible way they're still beholden to any decisions made that far back. For one thing, their toolchain has probably changed at least a couple times, since then. But there's also no way they're going to carry baggage that's either not pulling its weight or is otherwise a bottleneck for *that* long. Anything that's an impediment is going to get dropped, sooner or later.
> As for the E-cores, if I'm not mistaken, they're at greater perf/watt than Skylake
Gracemont is made on a different node than Skylake. If you backported it to the original 14 nm node that was Skylake's design target, they wouldn't be as fast or efficient.
> why not scale it up a bit more, and by the time it reaches GC/Zen 3 IPC, > it may well end up doing so with less power.
Okay, so even if you make everything bigger and it can even reach Golden Cove's IPC without requiring major parts being redesigned, it's not going to clock as high. Plus, you're going to lose some efficiency, because things like OoO structures scale nonlinearly in perf/W. And once you pipeline it and do the other things needed for it to reach Golden Cove's clock speeds, it's going to lose yet more efficiency, probably converging on Golden Cove's perf/W.
There are ways you design for power-efficiency that are fundamentally different from designing for outright performance. You don't get a high-performance core by just scaling up an efficiency-optimized core.
Well, you've stumped me on most points. Nonetheless, old choices can survive pretty long. I've got two examples. Can't find any more at present. The instruction fetch bandwidth of 16 bytes, finally doubled in Golden Cove, goes all the way back to Pentium Pro. That could've been more related to the limitations of x86 decoding, though. Then, register reads were limited to two or three per clock cycle, going back to Pentium Pro, and only fixed in Sandy Bridge. Those are small ones but it goes to show.
I would say, Gracemont is different enough for it to diverge from Golden Cove in terms of perf/watt. One basic difference is that it's using a distributed scheduler design (following in the footsteps of the Athlon, Zen, and I believe the Pentium 4), compared to Pentium Pro-Golden Cove's unified scheduler. Then, it's got 17 execution ports, more than Zen 3's 14 and GC's 12. Its ROB is 256 entries, equal to Zen 3. Instruction boundaries are being marked, etc., etc. Its clock speed is lower? Well, that's all right if its IPC is higher than frequency-obsessed peers. I think descendants of this core could baffle both their elder brothers and the AMD competition.
That's for simplicity, not by necessity. Most CPUs map multiple different sorts of operations per port, but Gracemont is probably designed in some way that made it cheaper for them just to have dedicated ports for each. I believe its issue bandwidth is 5 ops/cycle.
> Its clock speed is lower? Well, that's all right if its IPC is higher than frequency-obsessed peers.
It would have to be waaay higher, in order to compensate. It's not clear if that's feasible or the most efficient route to deliver that level of performance.
> I think descendants of this core could baffle both their elder brothers and the AMD competition.
In server CPUs? Quite possibly. Performance per Watt and per mm^2 (which directly correlates with perf/$) could be extremely competitive. Just don't expect it to outperform anyone's P-cores.
I'm out of answers. I suppose we'll have to wait and see how the battle goes. In any case, what is needed is some new paradigm that changes how CPUs operate. Clearly, they're reaching the end of the road. Perhaps the answer will come from new physics. But I wouldn't be surprised there's some fundamental limit to computation. That's a thought.
"Really, the only way I think they can remedy this is by designing a new core from scratch" Intel is designing an entirely new architecture from scratch, according to Moore's Law is Dead leaks. This new architecture design, which started under Jim Keller, is called the "Royal Core Project" and is aimed for a 2025 release. This year also aligns with Gelsinger's recent claims of "We expect Intel to definitively retake the performance crown across the board in 2025"
Whether that actually happens is to be seen, but Pat's specific year target combined with the Moore's Law is Dead leak seem to suggest a whole new architecture is very likely.
If he had any hand in Alder Lake, it was probably at the margins. He arrived too late to have much involvement in the CPU and his role at Intel seems to have been more in the area of new technology development.
Looks like a 100% clickbait article that's all weakly-informed speculation and zero substance. Go read the two recent interviews with him on this site. He talks about his position within Intel, how far removed he was from any actual work, and how his focus was more on evangelizing within the company.
And, ignoring all of that, it takes 4-5 years to design a CPU and get it shipped. The timelines simply don't match up. Lastly, he was there for under 2 years, which isn't enough time to really learn how a company works and build a strong team.
Even at AMD, where he was for 3+ years, he even refused credit as the father of Zen. And it's a smaller company where he had prior history.
The fact that Alder Lake consumes way less power than Zen 3 during gaming is amazing. This is where it matters the most, because nobody will be playing Cinebench 12 hours a day. The 5950X wastes a lot of power for no good reason during gaming. I'm surprised Intel didn't advertise this heavily.
Ian, please publish the source to your 3D Particle Movement benchmark. Let us see what the benchmark is doing. Also, it's not only AMD that can optimize the AVX2 path. Please let the community have a go at it.
> The core also supports dual AVX-512 ports, as we’re detecting > a throughput of 2 per cycle on 512-bit add/subtracts.
I thought that was true of all Intel's AVX-512 capable CPUs? What Intel has traditionally restricted is the number of FMAs. And if you look at the AVX-512 performance of 3DPM on Rocket Lake and Alder Lake, the relative improvement is only 6%. That doesn't support the idea that Golden Cove's AVX-512 is any wider than that of Cypress Cove, which I thought was established to be single-FMA.
Cascade Lake-X and Skylake-X/XE Core i9s, and Xeons with more than 12 cores (I think), have two AVX-512-capable FMA ports (port 0 and port 5), while all other AVX-512-capable CPUs have one (port 0 fused).
The performance gap could be down to coding. You need to vectorize your code in such a way that you feed both ports at maximum bandwidth.
However, in practice it turns out that the bottleneck is seldom the AVX-512 FMA ports but the memory bandwidth, i.e. it is very hard to keep up with the FMAs, each capable of retiring many of the high-end vector operations in 4 clock cycles, e.g. multiplying two vectors of 16 32-bit floats and adding to a 3rd vector in 4 clock cycles. Engaging both FMAs => you retire one FMA vector op every 2 cycles. Trying to avoid getting too technical here, but with a bit of math you see that the total bandwidth capability of the FMAs easily outstrips the cache, even if most vectors are kept in the ZMM registers: the registers can only absorb so much, and at steady state the cache/memory hierarchy becomes the bottleneck, depending on the problem size.
Some clever coding can work around that and hide some of the memory reads (using prefetching etc) but again there is only so much you can do. In other words two AVX-512 FMAs are beasts!
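Here is a back-of-envelope Python sketch of that "bit of math", in case it helps. It is a simplified model under loudly stated assumptions, not a description of any specific core: it assumes each FMA port can start one 512-bit FMA per cycle, that the kernel is a streaming one like `y[i] += a * x[i]` (two vector loads per FMA), and that L1 can deliver two 64-byte loads per cycle. All of the constants are illustrative and worth checking against the optimization manual for the chip you care about.

```python
VECTOR_BYTES = 64             # one 512-bit register holds 64 bytes (16 x 32-bit floats)
FMA_PORTS = 2                 # ports 0 and 5, per the comment above
LOADS_PER_FMA = 2             # ASSUMPTION: y[i] += a * x[i] loads x and y, a stays in a register
L1_LOAD_BYTES_PER_CYCLE = 128 # ASSUMPTION: two 64-byte loads per cycle from L1

# Bytes per cycle the FMA ports would consume if both were kept busy every cycle.
demand_bytes_per_cycle = FMA_PORTS * LOADS_PER_FMA * VECTOR_BYTES

# Fraction of peak FMA throughput that the assumed L1 load bandwidth can sustain.
sustainable_fraction = min(1.0, L1_LOAD_BYTES_PER_CYCLE / demand_bytes_per_cycle)

print(f"Operand demand: {demand_bytes_per_cycle} B/cycle")
print(f"L1 supply:      {L1_LOAD_BYTES_PER_CYCLE} B/cycle")
print(f"FMA ports kept busy: {sustainable_fraction:.0%}")
```

Under these assumptions even L1 can only feed half of what two FMA ports could consume for a streaming kernel, and L2 or DRAM would do worse. Kernels that reuse operands held in registers (blocked matrix multiply, for example) lower `LOADS_PER_FMA` and are where the second port pays off.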
This doesn't make sense. Their P-cores were never suitable for phones or tablets. Still aren't.
I think the one thing we can say is *not* behind Alder Lake is the desire to make a phone/tablet chip. It would be way too expensive and the P-core would burn too much power at even the lowest clockspeeds.
It appears the mixing is more trouble than it is worth for pure mid- to high-range desktop use. Intel should have split the desktop CPUs from the mobile CPUs: put P-cores in the new mid- to high-range desktops, and put the E-cores in mobiles or cheap desktops/NUCs.
The mixing helps with a very sought-after trait of high-end desktops. Fast single/lightly threaded performance AND high multithreaded capacity. Meaning very snappy and can handle a lot of multitasking. It is true they can pump out more P cores and get rid of E cores, but that would balloon the die size and cut yields, spiking the cost.
Yes. This is supported with a very simple experiment. Look at the performance delta between 8 P-cores and the full 8 + 8 configuration, on highly-threaded benchmarks. All the 8 + 8 configuration has to do is beat the P-core-only config by 25%, in order to prove it's a win.
The reason is simple. Area-wise, the 8 E-cores are equivalent to just 2 more P-cores. The way I see it is as an easy/cheap way for Intel to boost their CPU on highly-threaded workloads. That's what sold me on it. Before I saw that, I only thought Big.Little was good for power-savings in mobile.
Forgot to add that page 9 shows it meets this bar (I get 25.9%), but the reason it doesn't scale even better is due to the usual reasons for sub-linear scaling. Suffice it to say that a 10 P-core wouldn't scale linearly either, meaning the net effect is almost certainly better performance in the 8+8 config (for integer, at least).
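The break-even reasoning above can be written out as a short Python sketch. The area ratio (8 E-cores taking about the area of 2 P-cores) and the 25.9% measured gain are the figures from these comments; treating a hypothetical 10 P-core part as scaling perfectly linearly is an optimistic assumption in favor of the all-P design.

```python
P_CORES = 8
E_CORES = 8
E_CORES_PER_P_AREA = 4   # from the comment: 8 E-cores ~ the die area of 2 P-cores

# How many P-cores would fit in the same area if the E-cores were swapped out.
equivalent_p_cores = P_CORES + E_CORES / E_CORES_PER_P_AREA

# ASSUMPTION: the all-P part scales perfectly linearly with core count (best case for it).
break_even_gain = equivalent_p_cores / P_CORES - 1.0

measured_gain = 0.259    # 8P+8E over 8P alone, as computed by the commenter from page 9

hybrid_wins = measured_gain > break_even_gain

print(f"Same-area all-P design: {equivalent_p_cores:.0f} P-cores")
print(f"Gain the hybrid must beat: {break_even_gain:.1%}")
print(f"Measured hybrid gain:      {measured_gain:.1%}")
print(f"Hybrid wins on area: {hybrid_wins}")
```

Since a real 10 P-core chip would scale sub-linearly, the true bar is lower than 25%, which only strengthens the conclusion.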
"In the aggregate scores, an E-core is roughly 54-64% of a P-core, however this percentage can go as high as 65-73%."
It isn't clear what you mean here. A P-core second thread on the same core would be expected to add around 30%.
A more understandable test would be something like what Intel presented for Gracemont 4C4T vs Skylake 2C4T, although it would also be interesting to see performance and power of 8C8T vs 2C4T of Golden Cove, since they reportedly occupy a similar layout space.
Really happy to see AVX-512 is available with a simple BIOS switch! This looks to me like how AVX-512 should have been implemented in Skylake, Cascade Lake and Rocket Lake, and now they are finally getting it right. Alder Lake seems to have: - both AVX-512 ports enabled (port 0 and 5)! - the ability to run at negative offset = 0 for both AVX2 and AVX-512! - AVX-512 power consumption that seems to be in line with AVX2! Excellent, in other words! Since the silicon is there, if they can get the scheduler to manage heterogeneous (P/E) cores, there is now no downside to enabling AVX-512.
I guess you missed the sentence about how the MSI boards don’t have the switch, the sentence about how it’s actually not supposed to be there, and the sentence about how it could be eliminated in the future.
Additionally, what high-end motherboards offer in BIOS may be more than what is offered in more affordable models. Vendors might restrict this unofficial ‘support’ to top models.
The entire situation is completely incompetent. It’s patently absurd.
> It raises a very serious question about Gelsinger’s leadership.
I'm sure this decision never crossed his desk. It would be made probably 2+ levels below him, in the management hierarchy.
Moreover, he's been in charge for only about 8 months or so. Do you have any idea how long it takes to steer a big ship like Intel? This decision wasn't made yesterday. It would require OS support, which means they'd have had to get buy-in from Microsoft for it, many months ago.
And that's just if you're talking about the decision not to allow partial AVX-512 enabling. The decision to exclude it from Gracemont was made many years ago. Its exclusion was possibly considered a necessity for Gracemont's success, due to the perf/area -> perf/$ impact.
If Gelsinger wasn’t aware of the lie about fusing off and all of the other critically-important aspects involved he’s either a charlatan or Intel is structurally incompetent.
Why all the fuss about a technically unsupported feature? The only consumer chips officially to have AVX-512 contain Rocket Lake cores. Not Zen 3, or 2, or Comet Lake, or Alder Lake. If you find your Alder Lake has hidden AVX-512 abilities, how's that any different from finding out you can enable 6 cores on your 4-core Celeron?
On what basis do you reach that verdict? Based on your posts, I wouldn't trust you to run a corner shop, much less one of the biggest and most advanced tech & manufacturing companies on the planet.
And where did they say it's "fused off"? AFAIK, all they said is that it's not available and this will not change. And we've seen no evidence of that being untrue.
Also, I think you're getting a bit too worked up over the messaging. In the grand scheme, that's not the decision that really matters.
No, I did not miss that. I'm just happy that ASUS found a way to enable it. Intel screwed up, of course (a battle between different departments and managers, marketing etc., I'm sure). That's a given, and I did not think it was necessary to repeat it. And yes, it is absurd, even incompetent. Still, I'm happy ASUS found it and exposed it because, as I said, they actually seem to have gotten AVX-512 right in Golden Cove. Intel should of course work with Microsoft to get the scheduler to work with any E/P mix, make the support official, enable it in the base BIOS, have the base BIOS sent over to all OEMs, and lastly fire/reassign the idiot that took AVX-512 off the POR for Alder Lake. In any case, it gives me something to look forward to with Sapphire Rapids, which should come with more Golden Cove P cores. I only buy ASUS boards so
Can't edit post, so continuing: I only buy and use ASUS boards, so for me it's fine, but it sucks for others. I also doubt that Pat was involved. Decisions were likely made before his arrival. I'm thinking about the Microsoft dependency: they would have needed to lock the POR with Microsoft a while back to give MS enough time to get the scheduler and other stuff right...
The product was released on his watch, on these incompetent terms. Gelsinger is absolutely responsible. He now has a serious black mark on his leadership card. A competent CEO wouldn’t have allowed this situation to occur.
This is an outstanding example of how the claim that only engineers make good CEOs for tech companies is suspect.
I totally agree that it goes against putting engineers in charge. For me, the whole AVX-512 POR decision and the "AVX-512 is fused off" message came out of an incompetent marketing department, when they were still in charge.
I think under the old "regime" marketing did outrank engineering, so that the old CEO listened more to marketing than engineering (of course the CEO makes the decision, but he takes input from various camps, and that is what I mean by marketing "were still in charge"). A CEO without an engineering education is particularly easily influenced by marketing (especially if he/she has an MBA with a marketing specialization, like the old CEOs). Hence the messaging decision to "fuse it off" was likely heavily influenced by marketing, who I think finally won over Engineering. Pat had to inherit this decision but could not change it for the Windows 11 launch; it was too late.
> so that the old CEO listened more to marketing than engineering
In this case, the issue wouldn't be who the CEO listens to, but who gets to define the products. Again, the issue of AVX-512 in Alder Lake is something that would probably never rise to the attention of the CEO, in a company with $75B annual revenue, tens of thousands of employees at hundreds of sites, and many thousands of products in hundreds of different markets. OG apparently has no concept of what these CEOs deal with, on a day to day basis.
> Gelsinger wasn’t the CEO when Alder Lake was released?
So what was he supposed to do? Do you think they run all PR material by the CEO? How would he have any time to make the important decisions, like about running the company and stuff?
It seems to me like you're just playing agent provocateur. I haven't even seen you make a good case for why this matters so much.
> A competent CEO wouldn’t have allowed this situation to occur.
And you'd know this because... ?
> This is an outstanding example of how the claim that only engineers make good CEOs for tech companies is suspect.
You're making way too much of this.
I don't know who says "only engineers make good CEOs for tech companies". That's an absolutist statement I doubt anyone reasonable and with suitable expertise to make such proclamations ever would. There are plenty of examples where engineers in the CEO's chair have functioned poorly, but also many where they've done well. The balance of that particular assessment doesn't hang on Gelsinger, and especially not on this one issue.
Also, your liberal arts degree is showing. I'm not casting any aspersions on liberal arts, but when you jump to attack engineers for stepping outside their box, it does look like you've got a big chip on your shoulder.
> Your assumption about my degrees is due to the fact that I understand leadership and integrity?
Yes, exactly. It's exactly your grasp of leadership and integrity that I credit for your attack on engineers stepping outside their box. Such a keen observation. /s
I'm not sure how familiar you are with CPU design, but Alder Lake was taped out before Gelsinger took over. The design was finalized, and there was no changing it without massive delays. For the minuscule part of the market that insists on AVX-512 for the consumer line, it can be enabled after disabling the E-cores. AVX-512 just doesn't work on Gracemont, so you can't have both Gracemont and AVX-512 simultaneously. CPU designs take 4 years. You'll see the true impact of Gelsinger's leadership in a few years.
MS and Intel tried to sync their plans to launch Windows 11 and Alder Lake at (roughly) the same time. Intel might have been rushed to lock their POR to hit the Windows 11 launch. There may even be a contractual relationship between Intel and Microsoft to make sure Windows 11 runs best on Intel's Alder Lake - Intel pays MS to optimize the scheduler for Alder Lake, and in return Intel has to lock the Alder Lake POR, maybe even up to a year ago... because MS was not going to move the Windows 11 launch date.
Speculation from my side of course, but I don't think I am too far off...
Yes, it is inexcusable, BUT Pat might not have had a choice, because he does not control Microsoft. Satya N. would just tell Pat: we have a contract - fulfill it! We are not going to delay Windows 11, it's shipping October 2021, so we will stick with the POR you gave us in 2020! Satya is running a $2.52 trillion market cap company, currently #1 in the world; Pat is running a $206.58 billion market cap company, so guess who's calling the shots. Pat says "ok... but maybe we can enable it for the 22H1 version of Win 11, please Satya, help me out here..." In the end I think MS will do the right thing and get it to work, but it might get delayed a bit. Again, my speculation. And again, I don't think I am far off...
The solution was not to create this incompetent partial ‘have faith’ AVX-512 situation. Faith is for religion, not tech products.
The solution was to be clear and consistent. For instance, if Windows is the problem then that should have been made clear. Gelsinger should have said MS doesn’t yet have its software at full compatibility with Alder Lake. He should have said it will be officially supported when Windows is ready for it.
He should have had a software utility for power users to disable the small cores in order to have AVX-512 support, or at least a BIOS option — mandated for all Alder Lake boards — that disables them as an option for those who need AVX-512.
The current situation cannot be blamed on Microsoft. Intel has the ability to be clear, consistent, and competent about its products.
Claiming that Intel isn’t a large enough entity to tell the truth also doesn’t pass muster. Even if it’s inconvenient for Microsoft to be exposed for releasing Windows 11 prematurely and even if it’s inconvenient for Intel to be exposed for releasing Alder Lake prematurely — saving face isn’t an adequate excuse for creating a situation this untenable.
Consumers deserve non-broken products that aren’t sold via smoke and mirrors tactics.
A couple of points: - Yes, it would have been better to communicate to the market that AVX-512 will be enabled with 22H1 (or whatever - speculating) of Windows 11, but what about making it work with Windows 10, and when... I mean, the whole situation is a cluster. I do agree that the marketing decisions under Pat about what and how to communicate to the market regarding Alder Lake, AVX-512 and Windows 10/11 could have been handled much, much better. The way they have done it is a disaster. It's like, is it in or out? I mean, wtf. Is it strategic or not? This market communication, the related decisions, and whatever new agreements they need to strike with Microsoft to make the whole thing make sense are on Pat - firmly! - I am not blaming Microsoft at all. I am mostly blaming the old marketing and the old CEO - pure incompetence - for getting Intel into this situation in the first place. I don't have all the insights into Intel's internals, but from an outside perspective it looks like that to me.
Gelsinger’s responsibility is to lead, not blame previous leadership.
Alder Lake came out on his watch. The AVX-512 debacle, the communications and the lack of mandated minimum specs (official partial support for the lifetime of AL in 100% of AL boards via a BIOS switch to disable the small cores) all happened while he was CEO.
The lie about fusing off happened under his leadership.
We have been lied to and spacetime can’t be warped to erase the stain on his tenure.
I don't think Pat's blaming previous leadership. I am. I also blame Pat too, per above. He can still fix it by being super clear about what, when and why in his communication. He needs to bring marketing messaging under control. I can tell you one thing, though: I'm not buying Alder Lake CPUs. I'm probably going for Sapphire Rapids next year, when the whole thing has hopefully settled a bit more.
> Posting drivel like that exposes your character for all to see.
What I hope they see is that I can keep a measure of proportion about these things and not get overwrought. Not all "righteous indignation" is so righteous.
So far, they have. No AVX-512. That ASUS figured out it was still present and could be enabled isn't Intel's fault. It's like back in the days when some motherboards would let you enable dark cores in your CPU.
> Gelsinger should have said
It's not his job. Your issue is with someone at Intel several levels below him.
> Lying to the public (‘fused off’) is a decision that rests on his shoulders.
Exactly where did they say it's "fused off", and when has it ever been inexcusable that hardware shipped to customers actually contains features that can secretly be enabled? This sort of thing happens all the time.
I have been suspicious that you’re some sort of IBM AI. Posts like that go a long way toward supporting that suspicion.
You were the poster who claimed it’s of little consequence. I was the poster who said it’s inexcusable. Either you’re AI that needs work or your mind is rife with confusion in your quest to impress the community via attempts at domination.
Not a good look, again. Posting your own claims as if they’re mine and using my claims to create a false incompetence situation is a bit better than your pathetic schoolyard taunts. So, perhaps I should praise you for improving the quality of your posts via being merely incompetent — like Intel’s handling of this situation you’re trying to downplay. I shouldn’t make that equivalence, though, as lying to the community in terms of a retail product is worse than any of your parlor tricks.
> I have been suspicious that you’re some sort of IBM AI.
No way. Their artificial intelligence is no match for my natural stupidity. :D
> You were the poster who claimed it’s of little consequence.
No, I asked *you* why it's so consequential.
> I was the poster who said it’s inexcusable.
Which sort of implies that it's very consequential. If it's not, then why would it be inexcusable?
> Either you’re AI that needs work or your mind is rife with confusion in your quest to impress the community via attempts at domination.
If you wouldn't waste so much energy posturing and just answer the question, maybe we could actually get somewhere.
I don't honestly care what the community thinks of me. That's the beauty of pseudonymity! I don't even need people to believe I'm somehow affiliated with a prestigious university. Either my points make sense and are well-founded or they aren't. Similarly, I don't care if you're "just" the Oxford garbage collector. If you contribute useful information, then we all win. If you're just trolling, flaming, or pulling the thread into irrelevant tangents, then we all lose.
The main reason I post on here is to share information and to learn. I asked what should be a simple question which you dismissed as meritless, and without explaining why. As usual, only drama ensues, when I try to press the issue. I always want to give people the opportunity to justify their stance, but so often you just look for some way to throw it back in my face.
This kind of crap is extremely low value. I hope you agree.
> Since the silicon is there, if they can get the scheduler to manage heterogeneous (P/E) cores there is now no downside with enabling AVX-512.
This will not happen. The OS scheduler cannot compensate for lack of app awareness of the heterogeneous support for AVX-512. I'm sure that was fiercely debated, at Intel, but the performance downsides for naive code (i.e. 99%+ of the AVX-512 code in the wild) would generate too many complaints and negative publicity from the apps where enabling it results in performance & power regressions.
So, Alder Lake is a turkey as a high-end CPU, one that should have never been released? This is because each program has to include Alder Lake AVX-512 support and those that don’t will cause performance regressions?
So, Intel designed and released a CPU that it knew wouldn’t be properly supported by Windows 11 — yet the public was sold Windows 11 primarily on the basis of how its nifty new scheduler will support this CPU?
‘The OS scheduler cannot compensate for lack of app awareness of the heterogeneous support for AVX-512’
Is Windows 11 able to support a software utility to disable the low-power cores once booted into Windows or are we restricted to disabling them via BIOS? If the latter is the case then Intel had the responsibility for mandating such a switch for all Alder Lake boards, as part of the basic specification.
> So, Alder Lake is a turkey as a high-end CPU, one that should have never been released?
How do you reach that conclusion, after it blew away its predecessor and (arguably) its main competitor, even without AVX-512?
> This is because each program has to include Alder Lake AVX-512 support and those that don’t will cause performance regressions?
No, my point was that relying on the OS to trap AVX-512 instructions executed on E-cores and then context-switch the thread to a P-core is likely to be problematic, from a power & performance perspective. Another issue is code which autodetects AVX-512 won't see it, while running on an E-core. This can result in more than performance issues - it could result in software malfunctions if some threads are using AVX-512 datastructures while other threads in the same process aren't. Those are only a couple of the issues with enabling heterogeneous support of AVX-512, like what some people seem to be advocating for.
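To make the autodetection hazard concrete, here is a toy model (not real CPUID code). The core names and feature sets are made up for illustration; the only assumption is that each thread does the usual "check once, pick a width" test on whatever core it happens to be running on.

```python
from typing import Dict, Set

# Hypothetical per-core feature sets on a hybrid CPU with AVX-512 left
# enabled on the P-cores only. Names are illustrative, not real topology.
CORE_FEATURES: Dict[str, Set[str]] = {
    "P0": {"avx2", "avx512f"},
    "E0": {"avx2"},
}


def pick_vector_width(core: str) -> int:
    """Vector width (bits) a naive runtime check would choose on `core`."""
    return 512 if "avx512f" in CORE_FEATURES[core] else 256


def widths_chosen(thread_to_core: Dict[str, str]) -> Dict[str, int]:
    """Width each thread of one process picks, given where it first ran."""
    return {t: pick_vector_width(c) for t, c in thread_to_core.items()}


def layout_is_consistent(widths: Dict[str, int]) -> bool:
    """True only if every thread agreed on the same vector width."""
    return len(set(widths.values())) <= 1


# Two threads of the same process land on different core types.
widths = widths_chosen({"worker-a": "P0", "worker-b": "E0"})
print(widths, "consistent:", layout_is_consistent(widths))
```

If the two workers share buffers padded or aligned for their own width, that disagreement is where the malfunction would come from, not just a slowdown.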
> Is Windows 11 able to support a software utility to disable the low-power cores once booted into Windows or are we restricted to disabling them via BIOS?
That's not the proposal to which I was responding, which you can see by the quote at the top of my post.
> The question about the software utility is one you’re unable to answer, it seems.
That's not something I was trying to address. I was only responding to @SystemsBuilder's idea that Windows should be able to manage having some cores with AVX-512 and some cores without.
If you'd like to know what I think about "the software utility", that's a fair thing to ask, but it's outside the scope of what I was discussing and therefore not a relevant counterpoint.
"So, Intel designed and released a CPU that it knew wouldn’t be properly supported by Windows 11"
Oxford Guy, there's a difference between the concerns of the scheduler and that of AVX512. Alder Lake runs even on Windows 10. Only, there's a bit of suboptimal scheduling there, where the P and E cores are concerned.
If AVX512 weren't disabled, it would've been something of a nightmare keeping track of which cores support it and which don't. Usually, code checks at runtime whether a certain set of instructions---SSE3, AVX, etc---are available, using the CPUID instruction or intrinsic. Stir this complex yeast into the soup of performance and efficiency cores, and there will be trouble in the kitchen.
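For anyone who hasn't seen what that runtime check looks like, here is a rough sketch. Real code would use the CPUID instruction or a compiler intrinsic; this version instead assumes a Linux-style /proc/cpuinfo "flags" line, and the path names it returns are just labels invented for the example.

```python
from typing import Set


def parse_flags(cpuinfo_text: str) -> Set[str]:
    """Collect the feature flags from the first 'flags' line, if any."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def best_path(flags: Set[str]) -> str:
    """Pick the widest code path the reported flags allow (labels are made up)."""
    if "avx512f" in flags:
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    return "scalar"


# Hand-written sample text standing in for a real /proc/cpuinfo.
sample = "model name\t: Example CPU\nflags\t\t: fpu sse sse2 avx avx2\n"
print(best_path(parse_flags(sample)))
```

On a Linux box you could feed it `open("/proc/cpuinfo").read()`. Note that the answer is computed once for "the CPU", which is exactly the assumption that breaks if cores differ.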
Under this new, messy state of affairs, the only feasible option mum had, or should I say Intel, was bringing the cores onto an equal footing by locking AVX512 in the attic, and saying, no, that fellow doesn't live here.
Thinking a bit about what you wrote: "This will not happen". And it is not easy but possible… it’s a bit technical but here we go… sorry for the wall of text.
When you optimize code today (for pre-Alder Lake CPUs) to take advantage of AVX-512, you need to write two paths (at least). The application program (custom code) would first check if the CPU is capable of AVX-512 and at what level. There are many levels of AVX-512 support, and effectively you need to write customized code for each specific CPUID (class of CPUs, e.g. Ice Lake, Skylake-X, etc.), since for whatever CPU you end up running this particular program on, you would want to utilize the most favorable/relevant AVX-512 instructions. So with the custom code today (pre-Alder Lake), the scheduler would just assign a thread to an underutilized core (loosely speaking), and the custom code would check what the core is capable of and then choose the best path in real time (AVX2 and various levels of AVX-512). The problem is that with Alder Lake not all cores are equal! BUT the custom code should have various paths already, so it is capable! The issue that I see is that the custom code's CPU check needs to be adjusted to check core-specific capability, not CPUID-specific (one more level of granularity), AND the scheduler should schedule code with AVX-512 paths on AVX-512-capable cores by preference... What's needed is a code change in the AVX-512 path selection logic (on the application developer - not a big deal) and compiler support that embeds scheduler-specific information about whether the specific piece of code prefers AVX-512 or not. The scheduler would then use this information to schedule in real time, and the custom code would be able to choose the right path at execution time. It is absolutely possible and it will come with time. I think this is not just applicable to AVX-512. I think in the future P and E cores might have more than just AVX-512 that is different (they might diverge much more than that), so the scheduler needs to be made aware of what a thread prefers and what each core is capable of before it schedules each thread.
It is the responsibility of the custom code to have multiple paths (if they want to utilize AVX-512 or not).
Old .exe files which are not adjusted and are not recompiled for Alder Lake (code does not recognize Alder Lake) would simply automatically regress to AVX2, and the scheduler would not care which core to schedule them on. Basically, that is what's happening today if you do not enable AVX-512 in the ASUS BIOS.
> old .exe which are not adjusted and are not recompiled for Alder Lake (code does not recognize Alder Lake) would simply automatically regress to AVX2
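The idea above can be sketched as a toy scheduler. Everything here is hypothetical: the `prefers` hint stands in for the compiler-embedded information being proposed, and no real OS scheduler or API is being described.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass
class Core:
    name: str
    features: FrozenSet[str]
    busy: bool = False


@dataclass
class Thread:
    name: str
    # Hypothetical hint the compiler would embed; empty for legacy binaries.
    prefers: FrozenSet[str] = frozenset()


def schedule(thread: Thread, cores: List[Core]) -> Optional[Core]:
    """Place `thread` on an idle core, favoring cores that cover its hint."""
    idle = [c for c in cores if not c.busy]
    if not idle:
        return None
    capable = [c for c in idle if thread.prefers <= c.features]
    # If no capable core is idle, fall back to any idle core; the thread
    # would then have to take its AVX2 path.
    pool = capable or idle
    # Among candidates, take the least-featured core so threads without a
    # hint do not occupy AVX-512 cores unnecessarily.
    chosen = min(pool, key=lambda c: len(c.features))
    chosen.busy = True
    return chosen


cores = [Core("P0", frozenset({"avx2", "avx512f"})), Core("E0", frozenset({"avx2"}))]
print(schedule(Thread("legacy.exe"), cores).name)
print(schedule(Thread("vectorized", frozenset({"avx512f"})), cores).name)
```

The fallback branch is the awkward part: a thread that wanted AVX-512 but landed on an E-core has to notice and switch paths, which is the application-side change the proposal depends on.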
So, like 98% of shipping AVX-512 code, by the time Raptor Lake is introduced?
What you're proposing is a lot of work for Microsoft, only to benefit a very small number of applications. I think Intel would rather that people who need those apps simply buy a CPU which officially supports AVX-512 (or maybe switch off their E-cores and enable AVX-512 in BIOS).
‘or maybe switch off their E-cores and enable AVX-512 in BIOS’
This from exactly the same person who posted, just a few hours ago, that it’s correct to note that that option can disappear and/or be rendered non-functional.
I am reminded of your contradictory posts about ECC where you mocked advocacy for it (‘advocacy’ being merely its mention) and proceeded to claim you ‘wish’ for more ECC support.
Once again, it’s helpful to have a grasp of what one actually believes prior to posting. Allocating less effort to posting puerile insults and more toward substance is advised.
> This from exactly the same person who posted, just a few hours ago, that it’s correct to note that that option can disappear and/or be rendered non-functional.
You need to learn to distinguish between what Intel has actually stated vs. the facts as we wish them to be. In the previous post you reference, I affirmed your acknowledgement that the capability disappearing would be consistent with what Intel has actually said, to date.
In the post above, I was leaving open the possibility that *maybe* Intel is actually "cool" with there being a BIOS option to trade AVX-512 for E-cores. We simply don't know how Intel feels about that, because (to my knowledge) they haven't said.
When I clarify the facts as they stand, don't confuse that with my position on the facts as I wish them to be. I can simultaneously acknowledge one reality, while maintaining my own personal preference for a different reality.
This is exactly what happened with the ECC situation: I was clarifying Intel's practice, because your post indicated uncertainty about that fact. It was not meant to convey my personal preference, which I later added with a follow-on post.
Having to clarify this to an "Oxford Guy" seems a bit surprising, unless you meant like Oxford Mississippi.
> you mocked advocacy
It wasn't mocking. It was clarification. And your post seemed more to express befuddlement than expressive of advocacy. It's now clear that your post was a poorly-executed attempt at sarcasm.
Once again, it's helpful not to have your ego so wrapped up in your posts that you overreact when someone tries to offer a factual clarification.
> If I see more of the same preening and posing, I spare myself the rest of the nonsense.
Then I suggest you don't read your own posts.
I can see that you're highly resistant to reason and logic. Whenever I make a reasoned reply, you always hit back with some kind of vague meta-critique. If that's all you've got, it can be seen as nothing less than a concession.
I don't see AVX-512 as a good solution. Current x64 chips put so much complexity into the CPU, at irrational clock speeds, that migrating further to Intel 4 and beyond would be a nightmare once again.
I believe most companies with in-house developers expect that the end of the Xeon era is quite near, as most heavy computational tasks are fully optimized for GPUs and you don't want coal-burning CPUs.
Even if it doesn't come within a 5-year time frame, it's a real threat and you have to stay ahead of it. After all, x86 has already extended its life by 10+ years when it could have been discontinued. Now it's really a dinosaur. If so, non-server applications would follow the same route as well.
We want a simpler, more solid and robust base with scalability. Not an unreliable boost button that sometimes does the trick.
I don't see AVX-512 that negatively. It is just the same as AVX2, but with double the vector size and a richer instruction set. I find it pretty cool to work with, especially when you've written some libraries that can take advantage of it. As I wrote before, it looks like Golden Cove got AVX-512 right, based on what Ian and Andrei uncovered: 0 negative offset (i.e. running at full speed), power consumption not much more than AVX2, and it supports both FP16 and BF16 vectors! I think that's pretty darn good! I can work with that! Now I want my Sapphire Rapids with 32 or 48 Golden Cove P cores! No, not fall 2022, I want it now! lol
> When you optimize code today (for pre Alder lake CPUs) to take advantage of AVX-512 you need to write two paths (at least).
Ah, so your solution depends on application software changes, specifically requiring them to do more work. That's not viable for the timeframe of concern. And especially not if its successor is just going to add AVX-512 to the E-cores, within a year or so.
> There are many levels of AVX-512 support and effectively you need to write customized code for each specific CPUID
But you don't expect the capabilities to change as a function of which thread is running, or within a program's lifetime! What you're proposing is very different. You're proposing to change the ABI. That's a big deal!
> It is absolutely possible and it will come with time.
Or not. ARM's SVE is a much better solution.
> I think in the future P and E cores might have more than just AVX-512 that is different
On Linux, using AMX will require a thread to "enable" it. This is a little like what you're talking about. AMX is a big feature, though, and unlike anything else. I don't expect to start having to enable every new ISA extension I want to use, or query how many hyperthreads actually support it - this becomes a mess when you start dealing with different libraries that have these requirements and limitations.
Intel's solution isn't great, but it's understandable and it works. And, in spite of it, they still delivered a really nice-performing CPU. I think it's great if technically astute users have/retain the option to trade E-cores for AVX-512 (via BIOS), but I think it's kicking a hornets nest to go down the path of having a CPU with asymmetrical capabilities among its cores.
Hopefully, Raptor Lake just adds AVX-512 to the E-cores and we can just let this issue fade into the mists of time, like other missteps Intel & others have made.
I too believe the exclusion of AVX-512 from the E cores is transitory. Next-gen E cores may include it, and the issue goes away, for AVX-512 at least (Raptor Lake?). Still, there will be other features that P cores have but E cores won't, so the scheduler needs to be adjusted for that. This will continue to evolve with every generation of E and P cores - because they are here to stay.
I read somewhere a few months ago but right now i do not remember where (maybe on Anandtech not sure) that the AVX-512 transistor budget is quite small (someone measured it on the die) so not really a big issue in terms of area.
AMX is interesting because, where AVX-512 has 512-bit vectors, AMX turns that into two-dimensional matrices, or tiles as Intel calls them (going by the spec, each tile is up to 16 rows of 512 bits, rather than a full 512x512). Reading the spec on AMX, you have BF16 tiles, which is awesome if you're into neural nets. Of course GPUs will still perform better with matrix calculations (multiplications), but the benefit with AMX is that you can keep both the general CPU code and the matrix-specific code inside the CPU and can mix the code seamlessly, and that's gonna be very cool - you cut out the latency between GPU and CPU (and no special GPU APIs are needed). But of course you can still use the GPU when needed (sometimes it may be faster to just do a matrix-matrix add, for instance, inside the CPU with the AMX tiles) - more flexibility.
Anyway, I do think we will run into a similar issue with AMX as we have with AVX-512 on Alder Lake, and therefore again the scheduler needs to become aware of each core's capabilities, and each piece of code needs to state what type of core it prefers to run on: AVX2, AVX-512, AMX-capable core, etc. (the compiler's job). This way the scheduler can do the best job possible with every thread. There will be some teething for a while, but I think this is the direction it is going.
The difference is that AMX is new. It's also much more specialized, as you point out. But that means that they can place new hoops for code to jump through, in order to use it.
It's very hard to put a cat like AVX-512 back in the bag.
To be clear, I also want to add how code is written today (in my organization) in a pre-Alder Lake code base. Every time we write a code path for AVX-512, we need to write a fallback code path in case the CPU is not AVX-512 capable. This is standard (unless you can control the execution H/W 100% - i.e. the servers). That does not mean all code has to be duplicated, but the inner loops where the 80%/20% rule comes into play (i.e. 20% of the code consumes 80% of the time, which in my experience often becomes more like a 99%/1% rule) are where you write two code paths: one for AVX-512 in case the CPU is capable, and one with just AVX2 in case it is not. Mostly this ends up being, as I said, just the innermost loops, and there are excellent, broadly available templates to use for this. Just from a pure comp sci perspective it is quite interesting to vectorize code and see the benefits - pretty cool actually.
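The two-path pattern described above looks roughly like this. It is only a stand-in: plain Python cannot issue AVX instructions, so the "lanes" parameter just mimics processing 16 or 8 single-precision floats per step, and the function names are invented for the example.

```python
from typing import Callable, Sequence

Kernel = Callable[[Sequence[float], Sequence[float]], float]


def _dot_chunked(a: Sequence[float], b: Sequence[float], lanes: int) -> float:
    """Dot product accumulated `lanes` elements at a time."""
    total = 0.0
    for i in range(0, len(a), lanes):
        total += sum(x * y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
    return total


def dot_avx512_like(a: Sequence[float], b: Sequence[float]) -> float:
    # Stand-in for the wide path: 512 bits / 32-bit floats = 16 lanes.
    return _dot_chunked(a, b, 16)


def dot_avx2_like(a: Sequence[float], b: Sequence[float]) -> float:
    # Stand-in for the fallback path: 256 bits / 32-bit floats = 8 lanes.
    return _dot_chunked(a, b, 8)


def select_kernel(has_avx512: bool) -> Kernel:
    """Choose the inner-loop implementation once, based on a capability check."""
    return dot_avx512_like if has_avx512 else dot_avx2_like


a = [float(i) for i in range(40)]
b = [2.0] * 40
print(select_kernel(True)(a, b), select_kernel(False)(a, b))
```

Only the hot inner loop is duplicated; everything around it calls whichever kernel was selected and never needs to know which one it got.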
I'm not even going to say this is a bad idea. The problem is that it's a big change and Intel normally prepares the software developer community for big new ISA extensions a year+ in advance!
Again, what you're talking about is an ABI change, which is a big deal. Not only that, but to require code to handle dynamically switching between AVX2 and AVX-512 paths means that it can't use different datastructures for each codepath. It even breaks the task pre-emption model, since there need to be some limitations on where the code needs to have all its 512-bit registers flushed so it can handle switching to the AVX2 codepath (or vice versa).
This adds a lot of complexity to the software, and places a greater testing burden on software developers. All for (so far) one CPU. It just seems a bit much, and I'm sure a lot of software companies would just decide not to touch AVX-512 until things settle down.
My view on this topic is that Intel made a sound decision disabling AVX512. Some of the comments are framing it as if they made a mistake, because the tech community discovered it was still there, but I don't see any problem. Only, the wording was at fault, this controversial "fused off" statement. And actually, the board makers are at fault, too, enabling a hidden feature and causing more confusion.
On the question of whether it's desirable, allowing one core with the instructions and another without, would've been a recipe for disaster---and that, too, for heaven knows what gain. The simplest approach was bringing both cores onto the same footing. Indeed, I think this whole P/E paradigm is worthless, adding complexity for minimal gain.
Really? Our tech guys tried out Xeon Phi but couldn't make use of it. Years later, Xeon Phi was abruptly discontinued due to lack of demand. GPGPUs are much easier to handle.
Yeah, coding cost and risks aside, it's interesting to see complex work of art in the modern CPU. But I'd rather wish for expansion of GPU support (like shared memory and higher band-width).
My understanding is that Raptor Lake's change is replacing Golden Cove P cores with Raptor Cove P cores, doubling Gracemont E-Cores per SKU, and using the same Intel 7 process. Granted, it's all leaks at this point, but with Gracemont being reused for Raptor Lake, I don't expect AVX-512 next year either.
> Raptor Lake's change is ... doubling Gracemont E-Cores ... using the same Intel 7 process.
I was merely speculating that this *might* just be a transient problem. If they're using the same process node for Raptor Lake, which seems very plausible, then it's understandable if they don't want to increase the size or complexity of their E-cores.
However, there's some precedent, in the form of Knights Landing, where Intel bolted on dual AVX-512 pipelines + SMT4 to a Silvermont Atom core. And with a more mature Intel 7 node, perhaps the yield will support the additional area needed for just a single pipe + 512-bit registers. And let's not forget how Intel increased the width of Goldmont, yet simply referred to it as Goldmont+.
So, maybe Raptor Lake will use Gracemont+ cores that are augmented with AVX-512. We can hope.
A great comparison I would love to see, just out of curiosity, would be P-core-only benchmarks and then E-core-only benchmarks! We could gain a much better understanding of the capabilities and performance of both. This would bring a little bit of familiarity back to benchmarking.
The only info provided was that it's on Intel's new "Intel 7" process node. What does that mean? Are they using TSMC at 7nm? Or did they finally crack 7nm at Intel?
"Intel 7" is the process node formerly known as "10 nm ESF" (Enhanced SuperFin), which is the 4th generation 10 nm process, counting by the revisions they've introduced between the different products based on it. They like to pretend that Cannon Lake didn't happen, but that's why Ice Lake was actually 10 nm+ (2nd gen).
They rebranded 10 nm ESF as "Intel 7" for marketing reasons, as explained here:
While AL is an interesting CPU (regardless of what one's preference is), I still think the star of AL is the Gracemont core (E cores), and I did some very simple-minded, back-of-a-napkin calculations. The top AL has 8 P cores with multithreading = 16 threads, + 8 E-core threads (no multithreading here), for a total of 24 threads. According to the first die shots, one P core requires the same die area as 4 E cores. That leaves me wanting an all-E-core CPU with the same die size as the i9 AL, because that could fit 8x4 = 32 plus the existing 8 Gracemonts, for a total of 40. And the old problem of "Atoms can't do AVX and AVX2" is solved - because now they can! Yes, single-thread performance would be significantly lower, but any workload that can take advantage of many threads should be at least as fast as on the i9. Does anyone here know if Intel is considering that? It wouldn't be the choice for gaming, but for productivity, it might give both the i9 and, possibly, the 5950X a run for the money.
Yes, those Tremont-based CPUs are intended/sold for 5G cell stations; I hope that Intel doesn't just refresh those with Gracemont, but makes a 32-40 Gracemont core CPU available for workstations and servers. The one thing that might prevent that is fear (Intel's) of cannibalizing their Sapphire Rapids sales. However, if I were in their shoes, I'd worry more about upcoming AMD and multi-core ARM server chips, and sell all the CPUs they can.
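The napkin math above, written out so the assumptions are visible. The numbers are the ones from the post (8 P cores with 2 threads each, 8 E cores, and the die-shot estimate of roughly 4 E cores per P core of area); nothing here is measured.

```python
# Figures taken from the post; the area ratio is a die-shot estimate.
P_CORES = 8
THREADS_PER_P_CORE = 2   # P cores have multithreading
E_CORES = 8              # one thread each
E_CORES_PER_P_AREA = 4   # ~4 E cores fit in the area of one P core

# Thread count of the shipping top SKU.
current_threads = P_CORES * THREADS_PER_P_CORE + E_CORES

# Swap every P core for the E cores that fit in its area, keep the existing 8.
all_e_cores = P_CORES * E_CORES_PER_P_AREA + E_CORES

print("current threads:", current_threads)
print("hypothetical all-E cores:", all_e_cores)
```

That is 24 threads today versus 40 single-threaded cores in the hypothetical part; whether 40 E cores actually out-run the i9 depends on per-core throughput, which this arithmetic says nothing about.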
Well, it's a start that Intel is already using these cores in *some* kind of server CPU, no? That suggests they already should have some server-grade RAS features built-in. So, it should be a fairly small step to use them in a high core count CPU to counter the Gravitons and Altras. I think they will, since it should be more competitive in terms of perf/W.
As for workstations, I think you'll need to find a workstation board with a server CPU socket. I doubt they'll be pushing massive E-core -only CPUs specifically for workstations, since workstation users also tend to care about single-thread performance.
Sorry, but performance isn't everything; +/- a few percent in the real world will not restore confidence. Critical flaws, disabled functionality (DX12 on Haswell, for example), unstable instruction features, etc. I cannot afford to trust such a company.
I just wanted to add a big Kudos for this article. AnandTech's coverage of the 12900K was by a wide margin the best of any I read or watched, with regards to coverage of the various variables involved, and with the breadth and depth of testing. Thanks for keeping it up!
I don't know about "Power bi", but Tensorflow should run best on GPUs. Which CPU to get then depends on how many GPUs you're going to use. If >= 3, then Threadripper. Otherwise, go for Alder Lake or Ryzen 5000 series.
You'll probably find the best advice among user communities for those specific apps.
This is an amazingly deep, properly Anandtech review, even ignoring time constraints and the unusual difficulty of this particular launch. I bet Ian and Andrei will be catching up on sleep for weeks.
It’s disappointing that Anandtech continues to use suboptimal compilers for their platforms. Intel’s Compiler classic demonstrated 41% better performance than Clang 12.0.0 in the SPECrate 2017 Floating Point suite.
I think it's fair, though. Most workloads people run aren't built with vendor-supplied compilers, they use industry standards of gcc, clang, or msvc. And the point of benchmarks is to give you an idea of what the typical user experience would be.
But are they not compiling the code for the M1 series chips with a vendor supplied compiler?
Second, almost all benchmarks in SPECrate 2017 Floating Point are scientific codes, half of which are in Fortran. That’s exactly the target domain of the Intel compiler. I admit, I am out of date with the HPC developments, but back when I was still in the game icc was the most commonly used compiler.
> are they not compiling the code for the M1 series chips with a vendor supplied compiler?
It's just a slightly newer version of LLVM than what you'd get on Linux.
> almost all benchmarks in SPECrate 2017 Floating Point are scientific codes,
3 are rendering, animation, and image processing. Some of the others could fall more in the category of engineering than scientific, but whatever.
> half of which are in Fortran.
Only 3 are pure fortran. Another 4 are some mixture, but we don't know the relative amounts. They could literally link in BLAS or some FFT code for some trivial setup computation, and that would count as including fortran.
Without knowing what the Fortran code in the mixed code represents I would not discard it as irrelevant: those tests could very well spend a majority of their time executing Fortran.
As for the int tests, the advantage of the Intel compiler was even more pronounced: almost 50% over Clang. IMO this is too significant to ignore.
If I ran these tests, I would provide results from multiple compilers. I would also consult with the CPU vendors regarding the recommended compiler settings. Anandtech refuses to compile code with AVX512 support for non Alder Lake Intel chips, whereas Intel’s runs of SPECrate2017 enable that switch?
> At Intel’s Innovation event last week, we learned that the operating system will de-emphasise any workload that is not in user focus.
I see this as performance critical for audio applications, which need near-real-time performance. It's already a pain to find well-behaved drivers that do not hog a CPU core for too long and block processes with near-real-time demands. For performance tuning we already use the Windows option to prioritize background processes, which gives the process scheduler a longer, fixed time quantum so it can work more efficiently on processes and reduce the number of context switches. And now we get this hybrid design where everything becomes out of control and you can only hope and pray that the process scheduling will not be too bad. I am not amused about that and very skeptical that this will work out well.
Do you know, for a fact, that the new scheduling policies override the priority-boost you mentioned? I wouldn't assume so, but I'm not saying they don't.
Maybe I'm optimistic, but I think MS is smart enough to know there are realtime services that don't necessarily have focus and wouldn't break that usage model.
Windows 11 scheduler fails to allocate workloads... I noticed that the scheduler parks the cores if the application isn't full screen. I did a test on a 12700k with Handbrake: as long as the program window remains in the foreground, all the Pcore and Ecore are allocated at 100%. If I open a browser and use it while the movie is being compressed, the kernel takes the load off the Pcore and runs the video compression only on the Ecores. Absurd behavior, absolutely useless!
I have my 12900K for a little less than a month now and here's what I've found from the testing that I've done with the CPU:
(Hardware notes/specs: Asus Z690 Prime-P D4 motherboard, 4x Crucial 32 GB DDR4-3200 unbuffered, non-ECC RAM (128 GB total), running CentOS 7.7.1908 with the 5.14.15 kernel)
IF your workload CAN be multithreaded and it can run on BOTH the P cores AND the E cores simultaneously, then there is a potential that you can have better performance than the 5950X. BUT if you CAN'T run your application on both the P cores and the E cores at the same time (which is the case for a number of distributed parallel applications that rely on MPI), then you WON'T be able to realise the performance advantages that having both said P cores and E cores would give you (based on what the benchmark results show).
And if your program, further, cannot use HyperThreading (which some HPC/CAE programs will actually lock you out of), then you can be anywhere between 63-81% SLOWER than the 5950X (because on the 5950X, even with SMT disabled, you can still run the programme on all 16 physical cores, vs. the 8 P cores on the 12900K).
474 Comments
5j3rul3 - Thursday, November 4, 2021 - link
Great step for intel
Bobbyjones - Thursday, November 4, 2021 - link
Indeed. Biggest improvements since sandybridge. If you look at the timeline, this wouldve been the first CPU designed since they saw Zen 1. This is their Zen 1 moment and they already took the performance crown back basically across the board and at a lower price. AMD is now on the back foot, and it will be another whole year before Zen 4, and the thing is, Zen 4 isnt even competing with Alder Lake, Raptor Lake is rumored to be out before Zen 4. AMD has really screwed up with their launch cycle and given Intel so much room that they not only caught back up but beat them. Intel is truly back.
Netmsm - Thursday, November 4, 2021 - link
For now Threadripper has the performance crown.
With this performance per watt, Intel can just win the market for PCs.
Enterprise will never accept this performance per watt! So, AMD wins the high profitable enterprise market.
12900k guzzles up to 241 W, whereas the 5950x consumes half!
Considering power consumption, it's like a Pyrrhic victory for Intel.
fazalmajid - Thursday, November 4, 2021 - link
The HEDT market in Enterprise is workstations, which run certified apps like AutoCAD and has a lot of inertia. The first real Zen workstation is the Lenovo P620 and it only recently came out, so AMD hasn't conquered that market yet. Most actual Enterprise desktops are compact models that typically run on laptop CPUs.
DominionSeraph - Friday, November 5, 2021 - link
And Intel has AMD beat for miles in system validation.
My 3950X on a x570 Phantom Gaming X has major issues with disk access across one NVMe, one SATA SSD, and two HDDs. Some things will start up fine, but some things will just HANG. Deus Ex loading screens take like 10 seconds. I just tried to play a video off my NVMe and it took ~15 seconds for it to launch MPC-HC. (further launches are fine.) MeGUI takes 15 seconds to launch.
This thing is just frustratingly slow in general desktop tasks compared to my old i7 4790.
Does it beat the pants off the 4790 in heavily multithreaded crunching? Yes. But AMD does not put out a quality product.
Gothmoth - Friday, November 5, 2021 - link
anecdotal evidence? ....YOU have issues with your system.
well we have 16 core ryzen and threadripper 32 & 64 core systems at work and we can´t complain.
it´s not as if intel is issue free (and i am not talking about security flaws).
when you have such grave issues.. YOUR system has issues.
probably a bad setup. i did not hear that starting MPC needs 15 seconds when i read about AMD systems.
dotjaz - Sunday, November 7, 2021 - link
What about USB issues that are publicly acknowledged AND multiple BIOSes claim to have fixed it, yet here we are.
Netmsm - Friday, November 5, 2021 - link
It is your problem not AMD nor Intel!
This is why we always refer to the QVL of the MB before buying RAM, SSD, etc. to avoid such problems. This is not specific to AMD; it applies to all platforms.
For now you had better update the MB BIOS as soon as a new version is released. To solve the problem completely you need to reassemble the system according to the MB's QVL.
DominionSeraph - Friday, November 5, 2021 - link
It is an AMD issue. I've put together hundreds of Intel systems and none of them have any issues.
Netmsm - Friday, November 5, 2021 - link
When you face abnormality just put your cards on the table and ask a pro.
ajollylife - Sunday, November 7, 2021 - link
I agree. I've got a 3995wx everything on qvl, even with an optane drive. Got too annoyed with the bugs and found a 5950x worked better for a high performance desktop. Going to swap to a 12900k once i can find parts.
TheJian - Sunday, November 7, 2021 - link
If you know how to use mem timings, you idiots that depend on SPD's wouldn't have these problems (that covers about 90% of this crap, and knowing other bios settings solves almost anything else besides REAL failures). I've been building systems for decades (and owned a PC biz for 8yrs myself) and a MB's QVL list was barely used by anyone I know (perhaps to look up some ODD part but otherwise...Just not enough covered at launch etc). If I waited for my fav stuff to be included in each list I'd never build. Just buy top parts and you don't worry much about this crap.That said, if my job was on the line, I'd check the list, but not because I was worried about ever being wrong...LOL. I just don't have a liars face. I'd be laughing about how stupid I think it is after so many builds and seeing so many "incompatible memory" fixed in seconds in the hands of someone not afraid to disable the SPD and get to work (or hook up with a strap before blowing gigs of modules, nics repeatedly etc). Even mixing modules means nothing then (again, maybe if I was pitching servers...DUH....1 error can be millions) after just trying to make issues exists with mixing/matching but with timings CORRECT. No, they will work, if set correct barring some REAL electrical issue (like a PSU model from brand X frying a particular model mboard - say dozens in a weekend, a few myself!).
Too many DIY people out that that really have no business building a PC. No idea what ESD is (no just because it took a hit and still works doesn't mean it isn't damaged), A+ what?? Training? Pfft, it's just some screws and slots...Whatever...Said the guy with machine after machine that have never quite worked right...LOL. If you live in SF or some wet joint OK (leo leporte etc? still around), otherwise, just buy a dell/hp and call it a day. They exist because most of you are incapable of doing the job correctly, or god forbid troubleshooting ANYTHING that doesn't just WORK OOB.
Qasar - Sunday, November 7, 2021 - link
blah blah blah blah blah
Midland_Dog - Saturday, November 27, 2021 - link
people like you cost amd sales
silly amdumb
cyberpunx_r_ded - Friday, November 5, 2021 - link
sounds like a Mobo problem, not a CPU problem....for someone who has put together "hundreds of systems" you should know that by the symptoms.
That motherboard is known to be dog sh1t btw.
DominionSeraph - Saturday, November 6, 2021 - link
Note Intel doesn't allow "dog sh1t motherboards" to happen, especially at the $300+ price point. That makes it an AMD issue.
I can refurb Dell after Dell after Dell after Dell, all of them on low-end chipsets and still on the release BIOS, and they all work fabulously.
Meanwhile two years into x570 and AMD is still working on getting USB working right.
I think I'll put this thing on the market and see if I can recoup the better part of an i9 12900k build. I may have to drop down to one of the i7 6700's or the i7 4770k system I have until they're in stock, but that's really no issue.
Netmsm - Saturday, November 6, 2021 - link
It's a pleasure to not have p*gheaded amateurs in the AMD zone.
Others are telling you it's not an AMD issue but you keep spamming it's AMD, AMD, AMD... having got the wrong end of the stick.
Wrs - Saturday, November 6, 2021 - link
@Netmsm Regardless of whether the blame lies with ASRock for the above issue, it remains a fact that AMD didn't fix a USB connectivity problem in Zen 3 until 6-7 months after initial availability. Partly that was because the installed base of guinea pigs was constricted by limited product, but it goes to show that quick and widespread product rollouts have a better chance of ironing out the kinks. (Source if you've been under a rock heh https://www.anandtech.com/show/16554/amd-set-to-ro...)
And then recently we had Windows 11 performance regressions with Zen 3 cache and sandboxed security. These user experience hiccups suggest one company perceptibly lags the other in platform support. It's just something I've noticed switching between Intel and AMD. I might think this all to be normal were I loyal to one platform.
Netmsm - Sunday, November 7, 2021 - link
I didn't realize we're here to discuss minor issues/incompatibilities of the Intel's rival. I thought we're here to talk about major inefficiencies besides improvements of Intel's new architecture. Sorry!
Wrs - Sunday, November 7, 2021 - link
@Netmsm That's no minor issue/incompatibility. Maybe for you, but a USB dropout is not trivial! Think missing keystrokes, stuttering audio for USB headsets and capture cards. It didn't affect every user, and was intermittent, which was part of the difficulty. I put off a Ryzen 5000 purchase for 2 months waiting for them to fix it. (I also put it off for 4 months before that because of lack of stock lol.)
Netmsm - Monday, November 8, 2021 - link
@Wrs Assume we give you this, just drop AMD related things.
Let's focus on performance per watt of new Intel Architecture. I asked some questions about this which is a pertinent topic to this article but you dodged. Just talk about 241 watt being sucked by 12900k :))
ajollylife - Sunday, November 7, 2021 - link
To be fair Intel's def not perfect. The i225v lan bug comes to mind
DominionSeraph - Saturday, November 6, 2021 - link
When Intel doesn't have this issue it is AMD.
Intel demands validation and spends hundreds of millions on engineers to help motherboard manufacturers fix their issues before release because they realize that the motherboard is an INSEPARABLE part of the equation. You can fanboi all you want over the CPU, but if there are no good motherboards you have a terrible system.
AMD just dumps their chip on the market, leaving the end-user with a minefield of motherboard issues among every manufacturer.
When Intel has the problem solved but AMD doesn't, it's an AMD issue.
Qasar - Saturday, November 6, 2021 - link
DominionSeraph have you tried this on a different x570 board ? it COULD be just that board, as was already stated. i have been running a 5900x now for about a week, and no issues at all with disk access, its the same as the 3900x i replaced, and im using an asus x570 e-gaming board.
maybe take a step back, realize that its NOT an amd issue, but a BOARD issue. to be fair, intel has had its own issues over the years as well, while you see them as perfect, sorry to break it to you, but intel also isnt perfect
Netmsm - Sunday, November 7, 2021 - link
Oh I got it, the fact that you're encountering unusual issues, despite myriads of people who haven't complained of the same problem you have, means that AMD got many incompatibility issues.
That's a brilliant reasoning which brings us to a pure conclusion: AMD is bad. We're done here.
Dug - Monday, November 15, 2021 - link
@DominionSeraph
"Note Intel doesn't allow "dog sh1t motherboards" to happen, especially at the $300+ price point. That makes it an AMD issue."
You obviously haven't looked at or experienced the issues that plague many motherboards. Maybe just read some reviews or visit manufacturers' forums to get an idea of how bad it is before making such a naïve statement.
It has more to do with the manufacturer of the board than the CPU.
That is like saying nvidia doesn't make crap video cards. Well you need to look at all the failures by each manufacturer, not nvidia itself.
madseven7 - Saturday, November 6, 2021 - link
You put that chip in an ASROCK motherboard. That's the reason for the issues.
xhris4747 - Tuesday, November 9, 2021 - link
Lol 😅
FLORIDAMAN85 - Friday, November 12, 2021 - link
I, too, have made the error of purchasing an ASRock board.
mode_13h - Saturday, November 13, 2021 - link
So far, my ASRock Rack server board is still going strong.
ButIDontWantAUsername - Wednesday, November 10, 2021 - link
How's that validation with Denuvo going? Nothing like upgrading to Intel and having your games suddenly start crashing.
Iketh - Tuesday, November 30, 2021 - link
please, no more comments from you
tuxRoller - Friday, November 5, 2021 - link
Most desktops at enterprise companies could be replaced with terminals given that most of the people are really just performing data entry & retrieval. The network is the bit doing the work.
For people who need old school workstations, then I agree, but that's a damn small (but high margin) market.
blanarahul - Thursday, November 4, 2021 - link
Alder Lake is extremely efficient when gaming - https://www.igorslab.de/en/intel-core-i9-12900kf-c...
Scroll down and you'll find a graph detailing total gaming power consumption (CPU + GPU) and CPU power consumed per fps. In both metrics, Alder Lake is doing better than Zen 3 and much better than Rocket Lake.
PC World's review - https://www.pcworld.com/article/548999/12th-gen-co... - conveys that while 12900K goes volcanic in Cinebench, it sips power in a real world workload.
It seems like Alder Lake for desktop has been clocked way beyond its performance/watt sweet spot. It should be very interesting to compare Alder Lake for laptops v/s Zen 3 for laptops.
blanarahul - Thursday, November 4, 2021 - link
To give a short summary for (only) CPU power consumption v/s FPS when playing Horizon Zero Dawn:
11900K consumes 100 watts for 143 fps
5950X consumes 95 watts for 145 fps
5800X consumes 59 watts for 144 fps
12900K consumes 52 watts for 146 fps
12700K consumes 43 (!) watts for 145 fps
Intel is very, very competitive with AMD. Considering that 12700K has fewer E cores and consumes less power, I am very curious how it would do with all E cores disabled and running only on P cores.
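The figures in that list can be turned into a watts-per-fps ranking with a few lines of Python. This is just a sketch using the numbers quoted above as-is (CPU power only, one game); the dictionary and function names are hypothetical.

```python
# CPU-only power draw and frame rate in Horizon Zero Dawn,
# copied from the comment above: name -> (watts, fps).
samples = {
    "11900K": (100, 143),
    "5950X": (95, 145),
    "5800X": (59, 144),
    "12900K": (52, 146),
    "12700K": (43, 145),
}


def watts_per_fps(watts, fps):
    """Lower is better: CPU watts spent per rendered frame per second."""
    return watts / fps


# Sort from most to least efficient.
ranking = sorted(samples, key=lambda cpu: watts_per_fps(*samples[cpu]))

if __name__ == "__main__":
    for cpu in ranking:
        print(f"{cpu}: {watts_per_fps(*samples[cpu]):.3f} W per fps")
```

Since the frame rates are nearly identical (the game is presumably GPU-bound), the ranking is driven almost entirely by the power column, so it says little about efficiency under full CPU load.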
Netmsm - Thursday, November 4, 2021 - link
Sounds like there is only a gaming world!
On PCs it may not be considered an egregious blunder. You're right that Intel is now competitive, but with AMD's previous-generation parts, and if and only if we wink at Intel's power guzzling.
Some examples from Tom's benches:
y-cruncher
12900k DDR5 consumes 197 watts whereas 5950x consumes 103 watts.
handbrake
12900k DDR5 consumes 224 watts whereas 5950x consumes 124 watts.
blender bmw27
12900k DDR5 consumes 205 watts whereas 5950x consumes 125 watts.
Will you calculate power efficiency, please?
geoxile - Thursday, November 4, 2021 - link
My 5950X uses 130-140W in y-cruncher. And @TweakPC on twitter tested lower PL1 and found the 12900k was only around 5% slower using 150W than 218W. Alderlake being power hungry is only because Intel is pushing 8 P-cores and 8 E-cores (collectively equal to around 4 P-cores according to Intel) to the limit, to compete against 16 Zen 3 cores. You can argue that it's still not as good as the 5950X but efficiency in this case is purely a problem of how much power Intel is allowing by default.
flyingpants265 - Thursday, November 4, 2021 - link
Because they need all that extra power to increase their performance a tiny bit. They're not just doing it for fun.
Netmsm - Saturday, November 6, 2021 - link
Exactly 👍
Netmsm - Thursday, November 4, 2021 - link
Even Ian has "accidentally" forgotten to put nominal TDP for 12900k in results =))
All CPUs in "CPU Benchmark Performance: Intel vs AMD" are mentioned with their nominal TDP except 12900k.
It sounds there's some recommendations! How venal!
xhris4747 - Tuesday, November 9, 2021 - link
They should use PBO, it's fair to
xhris4747 - Tuesday, November 9, 2021 - link
Are you using PBO? Some people aren't using PBO, which I think isn't fair because that i9 is OC'd to snot
EnglishMike - Thursday, November 4, 2021 - link
It's not just the gaming world -- it's the entire world except for long-running CPU intensive tasks. Handbrake and blender are valuable benchmarking tools for seeing what a CPU is capable of when pushed to the limit, but the vast majority of users -- even most power users -- don't do that.
Sure, Intel has more work to do to improve power efficiency in long running CPU intensive workloads, but taking the worst case power usage scenarios distorts the picture as much as you're claiming the reviewers are doing.
Wrs - Thursday, November 4, 2021 - link
Can't calculate efficiency without scores. Also, well known that power scales much faster than performance. The proper way to compare efficiency is really at constant work rate or constant power.
blanarahul - Thursday, November 4, 2021 - link
Sorry sir I can't. You haven't provided me the data for how much time each test took! Would you be so kind as to do that?
Netmsm - Thursday, November 4, 2021 - link
Sorry, this is a direct link to Tom's bench:
https://cdn.mos.cms.futurecdn.net/if3Lox9ZJBRxjbhr...
this is for "blender bmw27" in which both 12900k and 5950x finish the job around 80 seconds BUT 12900k sucks power for about 70 percent more than 5950x.
you can find other benches here:
https://www.tomshardware.com/news/intel-core-i9-12...
I'm wondering why Ian hasn't put the 12900k's nominal TDP in the results just like for all other CPUs! When the 10900k was released with a nominal TDP of 125, Ian put that number in every bench while in reality the 10900k was consuming up to 254 (according to Ian's review)! When I asked him to put real numbers of power consumption for every test he said I can't because of time and because I've too much to do and because I've no money to pay and delegate such work to an assistant!
But now we have the 12900k with a nominal TDP of 241, which Ian seems reluctant to put next to it in the results.
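To put a number on the blender bmw27 example: if both chips finish in roughly 80 seconds, the energy used per render is just power times time. A minimal Python sketch follows, assuming the 205 W and 125 W figures quoted earlier in the thread are average package power over the whole run (that is an assumption, the thread does not say how they were measured); the function name is hypothetical.

```python
def energy_joules(avg_watts, seconds):
    """Energy for one run, assuming constant average power: E = P * t."""
    return avg_watts * seconds


RUN_SECONDS = 80  # "around 80 seconds" for both CPUs, per the comment

e_12900k = energy_joules(205, RUN_SECONDS)  # 12900k with DDR5
e_5950x = energy_joules(125, RUN_SECONDS)   # 5950x

# Extra energy the 12900k spends on the same job, in percent.
extra_pct = (e_12900k / e_5950x - 1) * 100

if __name__ == "__main__":
    print(f"12900k: {e_12900k} J, 5950x: {e_5950x} J")
    print(f"12900k uses about {extra_pct:.0f}% more energy for the same render")
```

With equal run times the energy ratio is the same as the power ratio, so on these inputs the gap works out to about 64 percent rather than 70.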
Zingam - Friday, November 5, 2021 - link
Last gen game. How about glquake?
1 billion computing devices and just a few million game units sold? What does it mean? Gamers are a tiny but vocal minority.
If they bring this performance at 5W on low and 45W on high then its good for majority of people. This is just a space heater.
Gothmoth - Friday, November 5, 2021 - link
so throwing more cores on a game that can´t make use of them is useless thanks for clarifying that.... genius!!
when a 5600x is producing 144 FPS and a 5950x is producing 150 FPS the 5600x is the clear winner when it comes to efficiency.
now try to cool the 12900K in a work environment with an air cooler.
i can cool my threadripper with a noctua aircooler and let it run under full load for hours.
i am really curious to see how the 12900k will handle that.
i am not an amd fanboy. i was using anti-consumer intel for a decade before switching to ryzen.
i would use intel again when it makes sense for me (i need my pc for work not gaming).
but with this power draw it does not make sense.
Wrs - Saturday, November 6, 2021 - link
The 12900k is fine with a Noctua D15 in a work environment. Doesn't matter if you're hammering it at 95C the whole time, the D15 doesn't get louder. But it's no megachip like a Threadripper. For that on the Intel side you'd wait for Sapphire Rapids or put up with an existing Xeon Gold with 8-32 Ice Lake cores at 10nm.
Netmsm - Saturday, November 6, 2021 - link
How would it be justified to buy Xeon Gold in place of Threadripper and Epyc?!
Wrs - Saturday, November 6, 2021 - link
@Netmsm I'll leave that to the market as I don't foresee using any of the 3 that soon lol. It would stand to reason that if one product is both cheaper and better, it would keep gaining share at the expense of the other. If that doesn't happen I would question the premise of cheaper + better. And seeing as it's a major market for Intel, I have little doubt they'll adjust prices if they do find themselves selling an inferior product.
Netmsm - Sunday, November 7, 2021 - link
That's right. We always check performance per watt and per dollar. A product should be reasonable with respect to its price and power consumption, this is a must.
The 12900k can consume up to 241 W, which is much closer to Threadripper's TDP than to the Ryzen 5900's, and yet it is competing with chips that have a 125 W TDP! What a parody this is!
I can't disregard and throw away efficiency factor, that's all.
Spunjji - Friday, November 5, 2021 - link
Seeing this has made me very interested to see the value proposition Alder Lake will be offering in gaming notebooks. I was vaguely planning to switch up to a Zen 3+ offering for my next system, but this might be enough to make me reconsider.
EnglishMike - Thursday, November 4, 2021 - link
> re: Enterprise: Considering power consumption, it's like a Pyrrhic victory for Intel.
Why? This is not an enterprise solution -- that's the upcoming Sapphire Rapids Xeon processors, a completely different CPU platform.
Sure, if all you're doing is pegging desktop CPUs at 100% for video processing or a similar workload, then Alder Lake isn't for you, but the gaming benchmarks clearly show that when it comes to more typical desktop workloads, the i9 12900k is inline with the top of the line AMD processors in terms of power consumption.
Netmsm - Thursday, November 4, 2021 - link
and who in his right mind would believe that upcoming Xeon processors can bring revolutionary breakthrough in power consumption?!
EnglishMike - Friday, November 5, 2021 - link
And that, my friend, is a great example of moving the goalposts.
We'll have to see what Intel offers re: Xeons but one thing is for sure, they're going to offer a completely different power profile to their flagship desktop CPUs, because that's the nature of the datacenter business.
Netmsm - Saturday, November 6, 2021 - link
Of course the nature of enterprise won't accept this power consumption. In PC world customers may not care how ineffective a processor is. Intel will reduce the power consumption but the matter is how its processor will accomplish the job! We see an unacceptable performance to watt in Intel's new architecture that needs something like a miracle for Xeon's to become competitive with Epyc's.
Wrs - Saturday, November 6, 2021 - link
No miracle is needed... just go down the frequency-voltage curve. Existing Ice Lake Xeons already do that. What's new about Sapphire Rapids is not so much the process tech (it's still 10nm) but the much larger silicon area enabled per package due to the EMIB packaging. That's their plan to be competitive with Epyc and its multichip modules.
Netmsm - Sunday, November 7, 2021 - link
And what will happen to performance as frequency-voltage curve goes down?
Just look at facts! With about 100w more power consumption Intel's new architecture gets itself in front of Zen 3 by a slight margin in some cases that lucidly tells us it can never reduce power consumption and yet beat Epyc in performance.
Wrs - Sunday, November 7, 2021 - link
@Netmsm I'm looking at facts. The process nodes are very similar. One side has both a bigger/wider core (Golden Cove) and a really small core (Gracemont). The other side just has the intermediate size core (Zen 3). As a result, on some benchmarks one side wins by a fair bit, and on other benchmarks, the other side takes the cake. Many benches are a tossup.
In this case the side that theoretically wins on efficiency at iso-throughput (MC performance) is the side that devotes more total silicon to the cores & cache. When comparing a 12900k to a 5950x, the latter has slightly more area across the CCDs, about 140 mm2 versus around 120 mm2. The side that's more efficient at iso-latency (ST/lightly threaded) is the one that devotes more silicon to their largest/preferred cores, which obviously here is ADL. In practice companies don't release their designs at iso-performance, and for throughput benchmarks one may encounter memory and other platform bottlenecks. But Intel seems to have aggressively clocked Golden Cove such that it's impossible for AMD to reach iso-latency with Zen 3 no matter the power input (i.e., you'd have to downclock the ADL). That has significant end-user implications as not everything can be split into more threads.
The Epyc Milan SKUs are already downclocked relative to Vermeer, like most server/workstation CPUs. Epyc Milan tops out at 64 Zen3 cores across 8 chiplets. Sapphire Rapids, which isn't out yet, has engineering samples topping out at 80 Golden Cove cores across 4 ~400mm2 chiplets. Given what we know about relative core sizes, which side is devoting more silicon to cores? There's your answer to performance at iso-efficiency. That's not to say it's fair to compare a product a year out vs. one you can obtain now, but also I don't see a Zen4 or N5 AMD server CPU within the next year.
Netmsm - Sunday, November 7, 2021 - link
I believe, we're not talking about ISO-efficiency or manufacturing or engineering details as facts! These are facts but in the appropriate discussion. Here, we have results. These results are produced by all those technological efforts. In fact, those theoretical improvements are getting concluded in these pragmatical information. Therefore, we should NOT wink at performance per watt in RESULTS - not ISO-related matters.
So, the fact, my friend, is Intel new architecture does tend to suck 70-80 percent more power and give 50-60 percent more heat. Just by overclocking 100MHz 12900k jumps from ~80-85 to 100 degrees centigrade while consuming ~300 watts.
Once in past, AMD tried to get ahead of Nvidia by 6990 in performance because they coveted the most powerful graphic card title. AMD made the hottest and the noisiest graphic card in the history and now Intel is mimicking :))
One can argue that it is natural when you cannot stop or catch a rival so try to do some chicaneries. As it is very clear that Anandtech deliberately does not tend to put even the nominal TDP of Intel 12900k in their benches. I loathe this iniquitous practice!
Wrs - Sunday, November 7, 2021 - link
@Netmsm I believe the mistake is construing performance-per-watt (PPW) of a consumer chip as indicative of PPW for a future server chip based on the same core. Consumer chips are typically optimized for performance-per-area (PPA) because consumers want snappiness and they are afraid of high purchase costs while simultaneously caring much less than datacenters about cost of electricity.
Netmsm - Monday, November 8, 2021 - link
@Wrs You cannot totally separate efficiency of consumer and enterprise chips!
As an incontrovertible fact, architecture is what primarily (not completely) determines the efficacy of a processor.
Is Intel going to kit out upcoming server CPUs in an improved architecture?
Wrs - Monday, November 8, 2021 - link
@Netmsm Architecture, process, and configuration all can heavily impact efficiency/PPW. I’m not aware of any architectural reason that Golden Cove would be much less efficient. It’s a mildly larger core, but it doesn’t have outrageous pipelining or execution imbalances. It derives from a lineage of reasonably efficient cores, and they had to be as they remained on aging 14nm. Processwise Intel 7 isn’t much less efficient than TSMC N7, either. (It could even be more efficient, but analysis hasn’t been precise enough to tell.) But clearly ADL in a 12900/12700k is set up to be inefficient yet performant at high load by virtue of high frequency/voltage scaling and thermal density. I could do almost the same on a dual CCD Ryzen, before running into AM4 socket limits. That’s obviously not how either company approaches server chips.
Netmsm - Tuesday, November 9, 2021 - link
When you cannot infer or appraise or guess we should drop it for now and wait for real tests of upcoming server chips to come.
regards ^_^
GamingRiggz - Tuesday, March 15, 2022 - link
Thankfully you are no engineer.
AbRASiON - Thursday, November 4, 2021 - link
AMD would have less of an issue if the 5000 processors weren’t originally price gouged.
Many people held off switching teams due to that. Instead of the processor being an amazing must buy, it was just a decent purchase. So they waited.
If you’re on the back foot in this game, you should be competing hard always to get that stranglehold and mind share.
I’m glad they’re competing though and hopefully they release some very competitive and REASONABLY PRICED products in the near future.
Fataliity - Thursday, November 4, 2021 - link
Their revenue and marketshare #'s beg to disagree.
Spunjji - Friday, November 5, 2021 - link
They've been selling every CPU they can make. There are shortages of every Zen 3 based notebook out there (to the extent that some OEMs have cancelled certain models) and they're selling so many products based on the desktop chiplets that Threadripper 5000 simply isn't a thing. You ought to factor that into your assessment of how they're doing.
BillBear - Thursday, November 4, 2021 - link
Is anyone gullible enough to forget more than a decade of price gouging, low core counts and nearly nonexistent performance increases we got from Intel, vs. the high core counts, increasing performance, and lower prices we got from AMD?
Zzzoom - Thursday, November 4, 2021 - link
You're gullible enough to forget that AMD raised its margins as soon as it got the lead with Zen 3.
lejeczek - Thursday, November 4, 2021 - link
And you are ready! to convince everybody... that whole freaking plandemic & communists mafia had nothing to do with prices gone up across the board. Good man!
Spunjji - Friday, November 5, 2021 - link
"plandemic"🙄
"communists mafia"
🤦♂️
Qasar - Friday, November 5, 2021 - link
zzzoom, so in other words, intel kept raising its prices when they had the lead, but it's NOT ok for amd to raise its prices when they have the lead? so who is gullible?
amd had the right to raise its prices, after all intel did it.
madseven7 - Saturday, November 6, 2021 - link
You're gullible enough to forget that Intel raised prices for every generation of CPUs and chipsets.
karmapop - Thursday, November 4, 2021 - link
This is a market economy. Neither company cares about your emotional attachments or misgivings beyond what is profitable for them. AMD as the market underdog played up that position heavily, gaining significant goodwill with the enthusiast consumer market. However, as Zzzoom mentioned, and just as is expected, as soon as they retook the performance dominant position their aggressive pricing strategy evaporated.
If you're going to criticize Intel's market stagnation via mismanagement for a decade you can't just ignore the fiasco of AMD's awful Bulldozer architecture and the 4.5 year gap between the launch of Piledriver and the launch of Zen 1. It's not unreasonable to make the argument that because Intel absolutely needed AMD to remain around at that time to avoid facing anti-trust issues, the lack of any real competitive alternative is as much a factor in their decision to stagnate as just 'greed'.
yeeeeman - Thursday, November 4, 2021 - link
AMD has been doing the same starting with Zen 3, so spare me with this...
deathBOB - Thursday, November 4, 2021 - link
And they should be punished for correcting those problems?
heickelrrx - Thursday, November 4, 2021 - link
AMD did, since they made the FX series so bad.
Stop blaming Intel alone for market segmentation; AMD not being competitive is also part of it.
Spunjji - Friday, November 5, 2021 - link
FX series was as bad as it was for a couple of reasons - partly because AMD were starved of funding during the entire Athlon 64 era, and partly because Global Foundries utterly failed to develop their fabrication processes to be suitable for high-performance CPUs.
Wrs - Saturday, November 6, 2021 - link
Nah, they just weren't that competitive. Athlon64 was decent (lot of credit to Jim Keller) but didn't let AMD take massive advantage of Intel's weakness during the Pentium 4 era because AMD fabs were capacity limited. Once Conroe came out mid 2006 the margins dried up rapidly and AMD had no good response and suffered a talent exodus. It's true Intel made it worse with exclusivity bonuses, but I think AMD's spiral toward selling their fabs would have happened anyway. No way they were going to catch up with tick-tock and Intel's wallet.
GeoffreyA - Monday, November 8, 2021 - link
I've always felt the K10 wasn't aggressive enough, owing to AMD not having factored Conroe into their equations when K10 was designed. Then, like startled folk, they tried to take back the lead by a drastic departure in the form of Bulldozer; and that, as we know, sank them into the ditch. Nonetheless, I'm glad they went through the pain of Bulldozer: Zen wouldn't have been as good otherwise.
mode_13h - Tuesday, November 9, 2021 - link
> FX series was as bad as it was for a couple of reasons
I thought I also heard they switched from full-custom layout to ASIC flow (maybe for the sake of APUs?). If so, that definitely left some performance on the table.
bunnyfubbles - Thursday, November 4, 2021 - link
3D v-cache will be out before Zen 4 and should help close the gap if not regain the overall lead on the high end. The problem for AMD is the competition below the i9 vs R9 realm, where the E cores really pull more than their weight and help the i9 compete with the R9s in multi, but for the i5s and i7s vs their R5 and R7 counterparts, it's even-Steven with performance cores, then you have the E cores as the trump card.
MDD1963 - Thursday, November 4, 2021 - link
If AMD gains an average of ~10% in gaming FPS with the 3D cache onslaught, that should put them right back near the top...certainly much closer to the 12900K....
geoxile - Thursday, November 4, 2021 - link
15% on average. 25% at the highest. Intel really should have offered a 16 P-core die for desktop smdh, classic intel blunder
Spunjji - Friday, November 5, 2021 - link
That would be a hell of a large die and necessitate a total redesign of the on-chip fabric. I don't think it would really make any sense at all.
RSAUser - Monday, November 8, 2021 - link
12900K is already huge; each performance core is about the size of 4 E cores, so going 16C P-core would probably mean a 70% die size increase, and then you run into core to core communication issues. AMD got around it with infinity fabric, but that's why you have the higher latency access between cores in different core complexes, while Intel gives a more consistent access time on higher end products. Intel's current cores are mostly ringbus, so traffic travels from one core to the next, and getting to 16 doesn't scale well. They used a mesh topology in some Skylake CPUs; that latency was too high and hampered performance badly, and you'd run into that same issue with 16C.
That's without checking into yield: getting 16C on one die that all clock perfectly high is going to be a very, very rare chip. AMD gets around it using the core complexes (CCX) of 4 cores each, put together into a CCD (core chiplet die), and then Zen 3 (5000 series) is supposedly an 8C CCX, which makes rare chips 8C if the full CCX works well; else, if 2 cores can't make it, it turns into a 6C 5600X.
StevoLincolnite - Friday, November 5, 2021 - link
AMD has an answer before Zen 4.
And that is Zen 3 with V-Cache.
Spunjji - Friday, November 5, 2021 - link
"This is their Zen 1 moment"
Indeed!
"at a lower price"
Not really, if you take platform into account (and you have to!)
"Zen 4 isnt even competing with Alder Lake, Raptor Lake is rumored to be out before Zen 4"
Potentially, but Zen 4 is a bigger jump from Zen 3 than Raptor is predicted to be from Alder. Raptor will have more E cores but it's on the same process, so it's likely to offer better perf/watt in multithreading but unlikely to increase overall performance substantially (unless they allow maximum power draw to increase).
"AMD has really screwed up with their launch cycle"
Not really? They're still competitive in both price/performance (accounting for platform cost) and perf/watt. Zen 3D should shore up that position well enough.
"Intel is truly back"
Yup!
adamxpeter - Friday, November 5, 2021 - link
Very poetic post.
bananaforscale - Friday, November 5, 2021 - link
Seems we're actually getting a Zen 3 refresh early next year. Alder Lake's lead also decreases with DDR4, gaming above 1080p (so basically anyone who would buy a 12900K for a gaming rig), it uses more power and with DDR5 you pay extra for memory.
Yeah, Alder Lake has some advantages. Not sure I'd call it a better overall package at the moment.
madseven7 - Saturday, November 6, 2021 - link
Intel is back at the cost of power. AMD at that power will destroy Intel. Intel basically said screw TDP.
Qasar - Saturday, November 6, 2021 - link
intel has been saying that for 2-3 years now, it's the only way their chips can be competitive with zen 2 and 3
Maverick009 - Sunday, November 7, 2021 - link
They really haven't screwed up as you would like to think. I do believe AMD was thrown off some by the unexpected performance in Hybrid design. They still do trade blows between some games, multi-threaded software, and on applications that are just not optimized for Alder Lake.
What I have noticed though, in the days since Alder Lake's NDA lifted and reviews came out, is that leaks about AMD's next gen Zen CPUs have begun to trickle out a little more than usual. Yes we have Zen 4 on the way, which will pave the way for DDR5 and PCIe Gen5 along with an uplift in IPC. However the real secret sauce may be in Zen 4D as the platform to build a heavily multi-threaded core package along with SMT enabled, and then Zen 5. The big picture is that AMD's version of a Hybrid CPU may include a combination of Zen 4D big cores and Zen 5 Bigger cores. The Zen 4D are said to possibly carry as many as 16 cores per chiplet, too, so it would speak to a possible heavily multi-threaded efficient CPU, while sacrificing a little bit of single threaded performance to achieve it. The timeframe would also put the new Hybrid CPU on a collision course to battle Raptor Lake.
For once the CPU market has gotten interesting again, and the consumer ultimately wins here.
NikosD - Monday, November 8, 2021 - link
@reviewers
Since AVX-512 is working on ADL, it would be useful to test the AVX-512 vs AVX2 power consumption of ADL by running POVRAY using P-cores only and compare that maximum AVX2 power consumption to AVX-512 max power consumption using 3DPM.
Because max 272W power consumption of POVRAY as reported, includes 48W from E-cores too.
mode_13h - Tuesday, November 9, 2021 - link
> it would be useful to test the AVX-512 vs AVX2 power consumption of ADL by running POVRAY
I'm not sure if POV-Ray is the best way to stress AVX-512.
NikosD - Wednesday, November 10, 2021 - link
They have already tested max power consumption of AVX-512 using 3DPM.
I just asked to test POVRAY using P-cores only, for max power consumption of AVX2 in order to compare with 3DPM.
usernametaken76 - Monday, November 8, 2021 - link
lol
xhris4747 - Tuesday, November 9, 2021 - link
They did not take the performance crown. Gaming is almost tied overall, and MT is a mixed bag. Hopefully they use PBO, which gives about 27k-30k on C23.
blanarahul - Thursday, November 4, 2021 - link
"Using all the eight E-cores, at 3.9 GHz, brings the package power up to 48 W total."
This sounds amazing for inexpensive (i3 class) laptop processors since Gracemont sips power and doesn't take much die space.
Great_Scott - Thursday, November 4, 2021 - link
I'd actually prefer an all-Gracemont CPU for Laptops. Seems like it would be better for intentionally maximizing battery life. Skylake+ level performance is perfect for most use cases.
TheinsanegamerN - Thursday, November 4, 2021 - link
Indeed, they have it backwards for laptops, it should be 2-6 gracemont cores then 1-2 power cores for a CPU, not the other way around.
karmapop - Thursday, November 4, 2021 - link
I'm guessing you missed the articles describing the two separate mobile dies for Alder Lake? We've got Alder Lake-P (6P + 8E) for performance mobile designs, and Alder Lake-M (2P + 8E) for the ultra mobile low power SKUs.
at_clucks - Saturday, November 6, 2021 - link
I'm very happy with exactly-Skylake-level performance in my desktop :). I'd more than gladly take the same performance and cut the power in half. I'm sure there's quite a big market for that kind of performance in a lower powered package regardless of form factor (mobile, desktop).
Meteor2 - Tuesday, November 9, 2021 - link
There really is. I may well pick up a 2P+8E ADL laptop, but a desktop box would suit me better
mode_13h - Wednesday, November 10, 2021 - link
Keep an eye on ASRock. They sell mini-ITX motherboards with that class of SoC.
https://www.asrock.com/mb/index.us.asp#Intel%20CPU
Spunjji - Friday, November 5, 2021 - link
That's actually not great in power terms compared to what AMD can do with 8 Zen 3 cores on TSMC N7 - but yeah, in the context of die area, something built around (say) 2 P cores and 4 E cores can probably put in a very good showing for inexpensive devices.
Netmsm - Thursday, November 4, 2021 - link
Becomes competitive to previous AMD's.
EnglishMike - Thursday, November 4, 2021 - link
Previous AMDs support DDR5 and PCIe 5.0?
Huh. That one slipped by me...
MDD1963 - Thursday, November 4, 2021 - link
Not sure what statement was actually intended there....
jerrylzy - Friday, November 5, 2021 - link
Even with DDR5 and PCIe 5.0 Intel is only barely competitive with AMD on the high end, and that's pretty sad. Only 12600K is competitive.
Spunjji - Friday, November 5, 2021 - link
PCIe 5.0 is currently useless to consumers and likely to be so for the duration of ADL's life. DDR5 is currently far more expensive and doesn't provide a compelling performance benefit for most users.
So, yeah - just as Comet Lake was a reasonable alternative to Zen 2 and 3 for many users despite being stuck on PCIe 3.0, so ADL doesn't really make a compelling argument for itself just by having PCIe 5.0.
NikosD - Friday, November 5, 2021 - link
@reviewers
Any sign of AMX silicon inside P-cores ?
As another Intel's surprise like AVX-512, I mean.
cc2onouui - Friday, November 5, 2021 - link
Good performance with DDR5, but this review is less than complete (after you started "intel vs AMD", ddr4 was not used for the tests). That's odd, because DRAM price and MB make the pick. It's sad that an important review with so much effort (now more than 17000 words) looks only at the shallow. Is windows 10 used for gaming tests or 11 and why ddr4 is out so the not good enough 2080ti
5j3rul3 - Thursday, November 4, 2021 - link
Is there any advanced analysis between TSMC N7 and Intel 7?grahaman27 - Thursday, November 4, 2021 - link
Both are marketing terms that mean nothing. There.
Wrs - Thursday, November 4, 2021 - link
But they happen to be surprisingly close. Can't go wrong with one over the other.
Spunjji - Friday, November 5, 2021 - link
They are both marketing terms, yes, but they both refer to actual processes that can be compared.
Fataliity - Thursday, November 4, 2021 - link
Der8auer had a video at... Kleindeck i think? Where they analyzed the transistors of Intel 10nm vs AMD 7nm processor. Essentially they are almost equal.
Spunjji - Friday, November 5, 2021 - link
N7 is a little more dense than Intel's 10nm-class process - 15-20% in comparable product lines (e.g. Renoir vs. Ice Lake, Lakefield vs. Zen 3 compute chiplet). There is no indication that Intel 7 is more dense than previous iterations of 10nm. N7 also appears to have better power characteristics.
It's difficult to tell, though, because Intel are pushing much harder on clock speeds than AMD and have a wider core design, both of which would increase power draw even on an identical process.
Blastdoor - Thursday, November 4, 2021 - link
I’m a little surprised by the low level of attention to performance/watt in this review. ArsTechnica gave a bit more info in that regard, and Alder Lake looks terrible on performance/watt.
If Intel had achieved this performance with similar efficiency to AMD I would have bought Intel stock today.
But the efficiency numbers here are truly awful. I can see why this is being released as an enthusiast desktop processor -- that's the market where performance/watt matters least. In the mobile and data center markets (ie, the Big markets), these efficiency numbers are deal breakers. AMD appears to have nothing to fear from Intel in the markets that matter most.
meacupla - Thursday, November 4, 2021 - link
Yeah, the power consumption of 12900K is quite bad.
From other reviews, it's pretty clear that highest end air cooling is not enough for 12900K, and you will need a thick 280mm or 360mm water cooler to keep 12900K cool.
Ian Cutress - Thursday, November 4, 2021 - link
I think there are some issues with temperature readings on ADL. A lot of software showcases 100C with only 3 P-cores loaded, but even with all cores loaded, the CPU doesn't de-clock at that temp. My MSI AIO has a temperature display, and it only showed 75C at load. I've got questions out in a few places - I think Intel switched some of the thermal monitoring stuff inside and people are polling the wrong things. Other press are showing 100C quite easily too. I'm asking MSI how their AIO had 75C at load, but I'm still waiting on an answer. An ASUS rep said that 75-80C should be normal under load. So why everything is saying 100C I have no idea.
Blastdoor - Thursday, November 4, 2021 - link
Note that the ArsTechnica review looks at power draw from the wall, so unaffected by sensor issues.
jamesjones44 - Thursday, November 4, 2021 - link
They also show the 5900x somehow drawing more power than a 5950x at full load. While I'm sure Intel is drawing more power, I question their testing methods given we know there is very little chance of a 5950x fully loaded drawing less than a 5900x unless they won or lost the CPU lottery.
TheinsanegamerN - Thursday, November 4, 2021 - link
techspot and TPU also show that, and it has been explained before that the 5950x gets the premium dies and runs at a lower core voltage than the 5900x, thus it pulls less power despite having more cores.
haukionkannel - Thursday, November 4, 2021 - link
5950x use better chips than 5900x... that is the reason for power usage!
vegemeister - Saturday, November 6, 2021 - link
5950X can hit the current limit when all cores are loaded, so the power consumption folds back.
meacupla - Thursday, November 4, 2021 - link
75C reading from the AIO, presumably a reading from the base plate, is quite hot, I must say.
Spunjji - Friday, November 5, 2021 - link
I thought much the same.
blanarahul - Thursday, November 4, 2021 - link
"ArsTechnica gave a bit more info in that regard, and Alder Lake looks terrible on performance/watt."
I was suspicious that this is the reason Intel finally went hybrid on mainstream. Golden Cove can have horrible perf/watt since Gracemont exists for low power scenarios.
Maxiking - Thursday, November 4, 2021 - link
listening to arse technica in 2k21 lol
michael2k - Thursday, November 4, 2021 - link
It's data. Do you just discount data?
The Garden Variety - Thursday, November 4, 2021 - link
Well, you at least have to appreciate that Maxiking saved significant time and effort by typing "2k21" instead of "2021". Attention to efficiency is something we can all respect and admire in MMXXI.
m53 - Thursday, November 4, 2021 - link
[Intel 12th gen consumes less power in gaming across the board vs Ryzen 5000 series](https://www.reddit.com/r/intel/comments/qmw9fl/why...
[Even the Multi threaded perf per watt is also better for 12900K compared to 5900X](https://twitter.com/capframex/status/1456244849477...
It is only in specific cases, where the 12900k needs to beat the 5950x in multi threaded loads, that it needs to crank up more power. But for typical users Intel is both the perf /watt and perf /dollar champion.
Bobbyjones - Thursday, November 4, 2021 - link
Until you look at the gaming power consumption and realize Intel is beating AMD in efficiency in games and general use. Check igorslab's review. It's only in the highly threaded workstation applications like blender or synthetics that use 100% of load that Intel starts using quite a bit of power. But 99% of users will never do those, all they care about is gaming, browsing, video, etc.
Blastdoor - Thursday, November 4, 2021 - link
So if 99% of users don’t need multiple cores, I guess intel made a huge mistake in including them. They could have just made a dual core processor and “99%” of users would have been just fine.
I think it’s HILARIOUS that people are arguing that the efficiency of this thing is just fine so long as you don’t actually fully utilize it.
Hulk - Thursday, November 4, 2021 - link
You mean like how I can drive my Civic in a sane manner and get 40mpg or hammer it and get 20mpg? Push the CPU (or automobile) out of its efficient zone and it becomes less efficient. You can do the same thing with Zen 3 CPUs. They get a little faster and use a lot more power. Same as Intel CPUs.
jerrylzy - Friday, November 5, 2021 - link
12900K is no Civic. More like a Ferrari. If you never push that Ferrari, why buy it? Buy a Civic then (12600)?
Spunjji - Friday, November 5, 2021 - link
This. People often get halfway through the analogy and then give up when they think it's made their argument for them.
Dribble - Sunday, November 7, 2021 - link
Having lots of potential power and high power consumption is exactly what mobile phone and laptop CPUs do. That Intel do that in desktops too is not surprising.
Spunjji - Friday, November 5, 2021 - link
99% of users don't need a 12900K. Presumably the people who do are likely to use it for these tasks where it will actually show a performance improvement over a cheaper CPU (accepting that some people overspend for e-peen reasons and will buy one for gaming where a 12600K would do just as well).
lmcd - Friday, November 5, 2021 - link
99.9999999999% of users don't need a 12900K's peak performance constantly, even if they will use the peak performance sometimes, including times when it definitely counts.
I won't lie and say I have the best of the best, but Zen 2 vs Zen 1 cut down my build times noticeably. That helps keep me in flow, even if it's only saving me a few minutes per day. For people like me with ADHD or other attention-related issues, this can be a massive boon.
brucethemoose - Thursday, November 4, 2021 - link
Does efficiency really matter for top end desktop SKUs? Intel/AMD tend to clock these near their voltage walls, WAY outside the "sweet spot" of a given architecture, and you can get a boatload of efficiency back just dropping boost clocks by 10% yourself.
Now, if the laptop SKUs end up being power hungry, that's a different story.
Blastdoor - Thursday, November 4, 2021 - link
Same core design, same process. So.... I'm sure Intel will lower clocks for mobile and servers to get power usage down, but once they lower the clocks, how will the performance compare?
meacupla - Thursday, November 4, 2021 - link
For now, efficiency doesn't matter for desktops, but in a few years time, we are very likely to see laws passed that will mandate high efficiency in high end desktops.
There is already some legislation in the works that calls for exactly this, but it has not been passed yet.
TheinsanegamerN - Thursday, November 4, 2021 - link
And how, pray tell, are they going to legislate that? Max power usage for a CPU? We've already seen how california tried it, and predictably they made a mess of it.
INB4 intel just refuses to sell anything but a celeron to californians and mysteriously tech resellers in arizona get a bunch of cali orders. Hmmmm.....
meacupla - Thursday, November 4, 2021 - link
don't ask me, IDK how law makers will do it. Just be aware that there are some really dumb laws that are already in existence, and the world is going to be entering an age of power shortages, along with carbon neutral incentives.
Considering how things are going currently, I think it'll just be a 100% tax on desktop CPUs that can't hit some efficiency metric that Apple has designed.
Wrs - Thursday, November 4, 2021 - link
Doubtful given how poorly the existing law works. All they do is measure computer idle wattage. The lawmakers aren't techies. And they're busy handling the blowback from carbon neutrality bills that the public believes are related to power shortages and cost spikes.
michael2k - Thursday, November 4, 2021 - link
One is a bellwether for the other.
Mobile parts will have cores and clocks slashed to hit mobile power levels; 7W-45W with 2p2e - 6p8e
However, given that a single P core in the desktop variant can burn 78W in POV Ray, and they want 6 of them in a mobile part under 45W, that means a lot of restrictions apply.
Even 8 E cores, per this review, clock in at 48W!
That suggests a 6p8e part can't be anywhere near the desktop part's 5.2GHz/3.9GHz Turbo clocks. If there is a linear power-clock relationship (no change in voltage) then 8 E cores at 3GHz will be the norm. 6 P cores on POV-Ray burn 197W, then to hit 45W would mean throttling all 6 cores to 1.2GHz
https://hothardware.com/news/intel-alder-lake-p-mo...
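The arithmetic behind that estimate can be sketched as below. This is only a back-of-envelope check, assuming power is directly proportional to clock at a fixed voltage (the comment's own stated assumption, which the next reply disputes). The function name is made up for illustration; the wattage and clock figures are the ones quoted in the comment.

```python
# Back-of-envelope check of the linear power-clock estimate above.
# Assumption (hypothetical, taken from the comment): power is directly
# proportional to clock speed when voltage is held constant.

def clock_for_power_budget(base_clock_ghz, base_power_w, budget_w):
    """Clock that fits a power budget if power scales linearly with clock."""
    return base_clock_ghz * budget_w / base_power_w

# 6 P-cores: 197 W in POV-Ray at the quoted 5.2 GHz turbo, squeezed into 45 W.
p_core_clock = clock_for_power_budget(5.2, 197.0, 45.0)

# 8 E-cores: 48 W at 3.9 GHz, scaled down to 3.0 GHz.
e_core_power = 48.0 * 3.0 / 3.9

print(f"6 P-cores in 45 W: ~{p_core_clock:.2f} GHz")
print(f"8 E-cores at 3.0 GHz: ~{e_core_power:.1f} W")
```

Under that assumption the P-cores land at roughly 1.2 GHz, matching the figure in the comment. Real silicon also drops voltage as clocks fall, so the linear model is a pessimistic bound rather than a prediction.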
siuol11 - Thursday, November 4, 2021 - link
Except that we know that the power-clock ratio is not linear and never has been. You can drop a few hundred MHz off of any Intel chip for the past 5 generations and get a much better performance per watt ratio. This is why mobile chips don't lose a lot of MHz compared to desktop chips.
michael2k - Thursday, November 4, 2021 - link
We already know their existing Ice Lake 10nm 4C mobile parts are capped at 1.2GHz to hit 10W:
https://www.anandtech.com/show/15657/intels-new-si...
A 6p8e part might not clock that low, but I'm certain that they will have to for the theoretical 7W parts.
Here's a better 10nm data point showing off their 15W-28W designs:
https://www.anandtech.com/show/14664/testing-intel...
4C 2.3GHz 28W TDP
Suggests that a 4pNe part might be similar while the 6p8e part would probably be a 2.3GHz part that could turbo up to a single core to 4GHz or all cores to 3.6GHz
TheinsanegamerN - Thursday, November 4, 2021 - link
Yes, once it gets in the way of performance, and intel's horrible efficiency means you need high end water cooling to keep it running, whereas AMD does not. Intel's inefficiency is going to be an issue for those who like air cooling, which is a lot of the market.
Wrs - Thursday, November 4, 2021 - link
Trouble is I'm not seeing "horrible efficiency" in these benchmarks. The 12900k is merely pushed far up the curve in some of these benches - if the Zen3 parts could be pushed that far up, efficiency would likewise drop quite a bit faster than performance goes up. Some people already do that. PBO on the 5900x does up to about 220W (varies on the cooler).
jerrylzy - Friday, November 5, 2021 - link
PBO is garbage. You can restrict EDC to 140A, let loose other restrictions and achieve a better performance than setting EDC to 220A.
Spunjji - Friday, November 5, 2021 - link
"if the Zen3 parts could be pushed that far up"
But you wouldn't, because you'd get barely any more performance for increased power draw. This is a decision Intel made for the default shipping configuration and it needs to be acknowledged as such.
Wrs - Saturday, November 6, 2021 - link
As a typical purchaser of K chips the default shipping configuration holds rather little weight. A single BIOS switch (PBO on AMD, MTP on Intel), or one slight change to Windows power settings, is pretty much all the efficiency difference between 5950x and 12900k. It pains me every time I see a reviewer or reader fail to realize that. The chips trade blows on the various benches because they're so similar in efficiency, yet each by their design has strong advantages in certain commonplace scenarios.
Spunjji - Friday, November 5, 2021 - link
If the competition are able to offer similar performance and you don't have to shell out the cash and space for a 360mm AIO to get it, that's a relevant advantage. If those things don't bother you then it's fine, though - but we're in a situation where AMD's best is much more power efficient than Intel's at full load, albeit Intel appears to reverse that at lower loads.
geoxile - Thursday, November 4, 2021 - link
Clock/power scales geometrically. The 5900HS retains ~85% of the 5800X's performance while using 35-40W stable power vs 110-120W for the 5800X. That's almost 3x more efficient. Intel is clocking desktop ADL to the moon, it doesn't mean ADL is going to scale down poorly, if anything I expect it to scale down very well since the E-cores are very performant while using a fraction of the power and according to Intel can operate at lower voltages than the P-cores can, so they can scale down even lower than big cores like ADL P-cores and zen 3. ADL mobile should be way more interesting than ADL desktop.
michael2k - Thursday, November 4, 2021 - link
Clock/power scales linearly.
It's only Voltage/power that scales geometrically
If you have to increase voltage to increase clock then you can say clock/power is geometric.
So if at a fixed voltage you can go from 2GHz -> 2.5GHz the power usage only goes up by 25%
If you also bump voltage up, however, from 1.2V -> 1.4V, power usage might go up 36%, and since the two factors multiply (1.25 x 1.36), you would see roughly a 70% increase in power usage to hit 2.5GHz.
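The numbers in this comment follow from the usual first-order model of dynamic CPU power, P proportional to f x V^2. A minimal sketch, assuming that model (it ignores leakage and is not something measured in the review); the function name is hypothetical. Note that the clock and voltage factors multiply rather than add, so the combined figure comes out nearer 70% than the sum of 25% and 36%.

```python
# First-order dynamic power model: P is proportional to f * V^2.
# This is an assumed textbook approximation (no leakage, no measured data).

def relative_power(f0_ghz, f1_ghz, v0, v1):
    """Power at (f1, v1) relative to power at (f0, v0)."""
    return (f1_ghz / f0_ghz) * (v1 / v0) ** 2

clock_only = relative_power(2.0, 2.5, 1.2, 1.2)    # clock bump, same voltage
voltage_only = relative_power(2.0, 2.0, 1.2, 1.4)  # voltage bump, same clock
combined = relative_power(2.0, 2.5, 1.2, 1.4)      # both together

print(f"clock only:   +{(clock_only - 1) * 100:.0f}%")
print(f"voltage only: +{(voltage_only - 1) * 100:.0f}%")
print(f"combined:     +{(combined - 1) * 100:.0f}%")
```

This is also why both sides of the argument can be right: clock alone is linear, but because higher clocks usually need higher voltage, the observed clock-to-power curve bends upward.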
Great_Scott - Thursday, November 4, 2021 - link
Race to Sleep - still a good idea for desktop usage patterns.factual - Thursday, November 4, 2021 - link
Alder is actually more efficient than 5950X in most real world scenarios. PC World did a proper real world power consumption comparison and Alder Lake was superior in most cases. Unless AMD cuts prices dramatically, in makes zero sense to buy Ryzen at this moment ... unless you are a mindless fanboi that is!!meacupla - Thursday, November 4, 2021 - link
I am waiting for AMD to cut prices on their 5600X or 5800X, so I can upgrade from my 2700X.eddman - Thursday, November 4, 2021 - link
Exactly. I'm looking for the same with my 2600.Spunjji - Friday, November 5, 2021 - link
"Unless AMD cuts prices dramatically, in makes zero sense to buy Ryzen at this moment ... unless you are a mindless fanboi that is!!"Or if, you know, you pay attention to how much a whole system costs and make a decision based on that instead of assuming cheap CPU = cheap system?
madseven7 - Saturday, November 6, 2021 - link
Mindless??!! Why?? Cause I can buy a Ryzen 5000 cpu to drop into my current motherboard to replace my Zen+ cpu (2700x). Mindless cause I refuse to pay $750 for 12900k, $450+ for Asus mb not even the best middle of the road, $330 for 32gb of ddr5 and this doesn't even include the aio360 cooler needed. You do the math.
Wrs - Saturday, November 6, 2021 - link
What puzzles me is why you haven't already dropped a Zen 3 in that. Zen 3 is basically end of the line for that board. I do not know if "Zen 3+" with vertical cache will even come out, much less be available and affordable for someone who shuns ddr5 costs.
AlyxSharkBite - Thursday, November 4, 2021 - link
My argument would be anyone looking at performance per watt on a CPU like this is a bit crazy. I've never considered that important for a CPU you're going to run well out of spec with big cooling anyway.
I'm far more interested in perf per watt on the mobile version. That's where it's going to matter as you can't just throw more cooling at a laptop. Especially compared to the Ryzen chips that have extremely good perf per watt.
EnglishMike - Thursday, November 4, 2021 - link
I don't think you have to be crazy, you just have to be one of those few for whom it matters -- i.e. those who execute long-running high-CPU load workloads on a regular basis.Otherwise, yeah, it's mostly irrelevant, given the performance per watt of more typical workloads -- even gaming -- are pretty much inline with the equivalent Ryzen CPUs.
AnonymousGuy - Thursday, November 4, 2021 - link
Fanbois clinging to any metric they can find where AMD is maybe better. Surprised they aren't raving about the color of the box or something.
"OMG the power consumption is too high what ever am I going to do with my 1000W power supply and 720mm rad?!" - GTFO!
Shorty_ - Thursday, November 4, 2021 - link
I assume you weren't one of the people going "sure Vega can match Pascal in some places but the power efficiency is horrific!"?
People care about it when it's not their product being inefficient. :)
Spunjji - Friday, November 5, 2021 - link
Odd reversal of the actual logic, whereby any situation where Intel is worse gets ignored...
eddman - Thursday, November 4, 2021 - link
I'm disappointed by power consumption figures from almost all outlets. Intel usually pushes the i9 parts to the voltage and frequency wall, meaning the power consumption would obviously be bad at max possible clocks.
https://twitter.com/TweakPC/status/145596164928275...
I'm not saying I necessarily trust this source, but I don't see a reason to think it's fake. You can have ~92% of the performance for ~68% of the power.
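One way to read that trade-off is to divide the retained performance by the retained power, which gives performance per watt relative to the stock setting. A minimal sketch, assuming the ~92% and ~68% figures from the linked tweet hold (the function name is only illustrative):

```python
def relative_efficiency(perf_fraction: float, power_fraction: float) -> float:
    """Performance per watt relative to the stock configuration (stock = 1.0)."""
    if power_fraction <= 0:
        raise ValueError("power_fraction must be positive")
    return perf_fraction / power_fraction


# Figures quoted above: ~92% of the performance for ~68% of the power.
gain = relative_efficiency(0.92, 0.68)
print(f"perf/W vs stock: {gain:.2f}x")  # prints "perf/W vs stock: 1.35x"
```

In other words, if those numbers are right, the power-limited configuration would be roughly a third more efficient than the stock one.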
eddman - Thursday, November 4, 2021 - link
Just to clarify, I'm disappointed by the depth of power consumption research in reviews, not what the max numbers are.
ricebunny - Thursday, November 4, 2021 - link
Me too. Check out igorslab review. It turns out that the i9 12900 is more power frugal than a 5950x when gaming.
jospoortvliet - Friday, November 5, 2021 - link
Right, the audience that cares least about efficiency gets it while business work is inefficient.
eddman - Friday, November 5, 2021 - link
That's not what I meant. I wanted to see tests at different power limits to see how it does in performance/watt metrics against zen 3 at the same limits.
Spunjji - Friday, November 5, 2021 - link
That's fair, although it is a big ask for reviewers who are already covering multiple OS and RAM configs
mode_13h - Friday, November 5, 2021 - link
Yeah, it's not as if there can be only one article written about this CPU!
More depth can be added in subsequent articles, as often happens.
web2dot0 - Thursday, November 4, 2021 - link
Who in their right mind thinks Intel is moving in the right direction?
250W TDP?!?!?
Apple came out with their 30W TDP that can rival desktop CPU parts.
Comically embarrassing.
geoxile - Thursday, November 4, 2021 - link
Apple rivals desktop CPUs in SPECINT, which clearly loves memory bandwidth and cache. DDR5 alone boosted ADL's score in SPECINT MT by 33% from a DDR4 configuration. In Cinebench and geekbench the m1 pro and max are closer to workstation laptop processors. We'll see what happens with ADL mobile.
Ppietra - Thursday, November 4, 2021 - link
The 12600K has basically the same Geekbench score as the M1 Max, and yet its 10 cores consume 3 times more than the M1.
On the 12900K just using the 8 E-cores consumes more than the M1 Max using the CPU at peak power. So we shouldn’t expect big miracles in mobile, unless Intel starts selling 90W chips.
As for Cinebench, it will be difficult for Apple Silicon to come out on top until Apple implements some sort of Hyperthreading, Cinebench takes good advantage from it.
geoxile - Thursday, November 4, 2021 - link
The H55 segment will offer 8+8 at 45W and H45 will offer 6+8 at 35W, no need to compare the 12600k. We have models for how mobile uses power compared to desktop. They retain 80-90% of the performance at 1/3 to 1/4 the sustained power. 5900HS @ 35W cTDP (35-40W actual power) has around 85% the performance of the 5800X @110-120W in cinebench. The 11980HK at 45W has almost 90% the performance of the 11700k at 130-150W (non-AVX) in geekbench 5.
Ppietra - Thursday, November 4, 2021 - link
Closer to 15% drop in Geekbench, and probably at much higher package peak power draw than 45W, considering what Anandtech has measured for the 11980HK in Multithreaded tasks (around 75W).
geoxile - Thursday, November 4, 2021 - link
The 11980HK respects the configured PL/cTDP for the most part. It only hits 75W during the initial cold start. It uses 65W sustained power when configured to PL 65 and 45W when configured to 45W
https://www.anandtech.com/show/16680/tiger-lake-h-...
I screwed up using tom's results for geekbench, apparently it is at PL 65 unlike Anand's for the TGL test system. But it also scores 9254 vs anandtech's 11700k scoring 9853, so within around 94% performance of its desktop counterpart. I've seen some higher scores on GB itself but using "official" sources that's pretty close to 2x more efficient. I can't seem to find any real PL 45 results for GB5. Point is, scaling down isn't a problem, and ADL will no doubt scale down better thanks to E-cores and just overall better efficiency based on what we've already seen, like gaming efficiency according to igorslab and PL 150 making barely any difference in performance compared to PL 220. I think Intel is in a unique position since AMD doesn't have small cores anymore.
Ppietra - Friday, November 5, 2021 - link
What you are failing to realize is that Geekbench, due to its short tests nature, ends up spending a lot of time at peak performance and not at sustained performance.
And no, the 11700k doesn’t score 9853 - you are looking at averages on the Geekbench site which are not reliable to make this sort of comparison. Notebookcheck geekbench score is close to 11300, while the 11980HK scores closer to 9700.
geoxile - Friday, November 5, 2021 - link
Geekbench runs for a few minutes afaik. The peak you're describing only lasts for a split second and quickly falls down to the sustained power over a few seconds to 30 seconds. And no, I'm not looking at averages from geekbench, I literally told you I'm using anand's score for the 11700k and tom's score for TGL mobile. https://www.anandtech.com/bench/CPU-2020/2972
Ppietra - Friday, November 5, 2021 - link
geoxile, Geekbench is a bunch of discrete tests with pauses in between.
The value that you used is almost exactly the average in the Geekbench database, and we know that the 11700 gets much higher than that. You can also check that Anandtech never showed Geekbench results with that CPU in any of its reviews of the 11700. Don’t know why that value is there.
geoxile - Friday, November 5, 2021 - link
Describing a context switch to load the next bench as "pauses" is borderline gaslighting. It's a memory workload, not idling. PL2 on the 11980HK lasts for seconds from cold start at PL1 45.
It's almost or it's exact. Anandtech lists those scores and I have no reason to doubt they copied them or made them up. Tom's has slightly higher scores at 10253 @ stock. That's a 4% variance, probably due to tom's using DDR4 3600 with tuned timings while anandtech used DDR4 3200. It's only with a 5Ghz OC toms can even break through 11000, let alone score 11300.
https://www.tomshardware.com/reviews/intel-core-i7...
Ppietra - Friday, November 5, 2021 - link
It’s not context switch, Geekbench deliberately pauses between tests to avoid throttling. Read about it.
Notebookcheck didn’t make them up either and you can see higher scores inside Geekbench database.
Spunjji - Friday, November 5, 2021 - link
11980HK to 11700K isn't a useful analogue - it's 10SF vs. 14++++ and the caches on TGL-H45 are larger than RKL.
I'd be comfortable predicting that ADL mobile will be ~15% faster than TGL all-round, with better gains in power-limited multithreading scenarios where the E cores are properly utilised.
geoxile - Friday, November 5, 2021 - link
That's a valid point. But we can still look to the zen 3 APUs vs the desktop 5800X and see similar or better perf/W scaling. Based on what we've seen so far ADL is very comparable to zen 3 in efficiency in heavy synthetic loads when set to optimal PL (e.g 150W) and far, far more efficient in mixed loads like gaming, where a 12900k with PL 241 uses 60-70% the power of a stock 5950X. These are good signs.
"Apples to Apples", like 8 TGL cores vs 8 mixed ADL cores I'd agree. But the leaked configurations are 8+8 for 45W (up to 55W cTDP), or 6+8 at 35-45W. I think e-cores will make a huge difference.
Wrs - Thursday, November 4, 2021 - link
*raises hand* You can restrict the TDP of any Intel/AMD consumer processor. Or you can raise it, subject to physical/overclocking limits. It's user choice. I never complain when they're giving us choice.
Process node advancement is in the right direction. In terms of efficiency, Intel is one full node behind the leading edge, which Apple basically has exclusively. No other high-volume chip is comparable to N5, even the Snapdragon 888 (though Samsung calls it 5nm).
Bobbyjones - Thursday, November 4, 2021 - link
Huge improvements for Intel, beats Zen 3 soundly in performance almost across the board. The 12900k is a beast. However, it's the 12600k that is the real champ, half the price of a 5800x and it still beats it.
No surprise that 12th gen is sold out everywhere online, it seems like the Zen 3/AMD era is dead and Intel is back.
Spunjji - Friday, November 5, 2021 - link
"it seems like the Zen 3/AMD era is dead"
No need to overstate the case
mode_13h - Friday, November 5, 2021 - link
> it seems like the Zen 3/AMD era is dead
Premature. We don't yet have real world performance data on their V-Cache version.
kobblestown - Thursday, November 4, 2021 - link
I find the argument for disabling AVX512 really not convincing. If a process is running on an E core and reaches a nonexistent instruction it traps into the OS. The OS can determine that it's an instruction that can be executed on a P core and reschedule it there, keeping note to not move that process/thread back to an E core. It shouldn't have been too hard.
SarahKerrigan - Thursday, November 4, 2021 - link
AVX512 implies a lot more than ops - it implies a whole set of state as well. Trap-and-move would be very non-trivial.
kobblestown - Thursday, November 4, 2021 - link
Citation please.
mode_13h - Friday, November 5, 2021 - link
It basically comes down to a context-switch. And those take a couple microseconds (i.e. many thousands of CPU cycles), last I checked. And that assumes there's a P-core available to run the thread. If not, you're potentially going to have to wait a few timeslices (often 1-10 ms).
Now, consider the case of some software that assumes all cores are AVX-512 capable. This would be basically all AVX-512 software written to date, because we've never had a hybrid one, or even the suggestion from Intel that we might need to worry about such a thing. So, the software spawns 1 thread per hyperthread (i.e. 24 threads on the i9-12900K) but can only run 16 of them at any time. That's going to result in a performance slowdown, especially when you account for all the fault-handling and context-switching that happens whenever any of these threads tries to run on an E-core. You'd basically end up thrashing the E-cores, burning a lot of power and getting no real work done on them.
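To put rough numbers on that argument, here is a back-of-the-envelope sketch. The 2 µs switch time and the 5 GHz clock are assumptions chosen to match "a couple microseconds"; the core counts are the i9-12900K's 8 P-cores (two hardware threads each) and 8 E-cores.

```python
def cycles(duration_s: float, clock_hz: float) -> int:
    """Number of clock cycles that elapse in duration_s at clock_hz."""
    return round(duration_s * clock_hz)


SWITCH_TIME_S = 2e-6  # assumption: "a couple microseconds" per context switch
CLOCK_HZ = 5e9        # assumption: roughly 5 GHz
switch_cost = cycles(SWITCH_TIME_S, CLOCK_HZ)

P_CORES, THREADS_PER_P_CORE, E_CORES = 8, 2, 8
spawned = P_CORES * THREADS_PER_P_CORE + E_CORES  # one worker per hardware thread
runnable = P_CORES * THREADS_PER_P_CORE           # only P-core threads can run AVX-512
stalled = spawned - runnable                      # workers left fighting for a P-core

print(f"context switch ~ {switch_cost} cycles")
print(f"{spawned} workers spawned, {runnable} runnable, {stalled} oversubscribed")
```

Under these assumptions each migration costs on the order of ten thousand cycles, and a third of the workers have nowhere useful to run.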
mode_13h - Friday, November 5, 2021 - link
Forgot to address the case where the OS blocks the thread from running on the E-core, again.
So, if we think about how worker threads are used to split up bigger tasks, you really want to have no more worker threads than actual CPU resources that can execute them. You don't want a bunch of worker threads all fighting to run on a smaller number of cores.
So, even the solution of having the OS block those threads from running on the E-cores would yield lower performance than if the app knew how many AVX-512 capable cores there were and spawned only that many worker threads. However, you have to keep in mind that whether some function uses AVX-512 is not apparent to a software developer. It might even do this dynamically, based on whether AVX-512 is detected, but this detection often happens at startup and then the hardware support is presumed to be invariant. So, it's problematic to dump the problem in the application developer's lap.
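The "spawn only as many workers as there are capable cores" idea can be sketched as below. This is a hypothetical illustration: `HW_THREADS` is a hand-written description of a hybrid part with AVX-512 on the P-cores only (not something shipping Alder Lake exposes), and `worker_count` is an invented helper, not an OS or library API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical topology: 16 P-core hardware threads with AVX-512, 8 E-core threads without.
HW_THREADS = [{"kind": "P", "avx512": True}] * 16 + [{"kind": "E", "avx512": False}] * 8


def worker_count(hw_threads, needs_avx512: bool) -> int:
    """Count the hardware threads able to run the workload (at least 1)."""
    usable = [t for t in hw_threads if t["avx512"] or not needs_avx512]
    return max(1, len(usable))


def square(x: int) -> int:
    # Stand-in for a real AVX-512 kernel.
    return x * x


n = worker_count(HW_THREADS, needs_avx512=True)
with ThreadPoolExecutor(max_workers=n) as pool:
    results = list(pool.map(square, range(4)))

print(n, results)
```

Sizing the pool this way does not by itself keep the workers on P-cores; that would still need thread affinity or scheduler support, and it only works if the application knows ahead of time that its code paths use AVX-512, which is the difficulty described above.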
eastcoast_pete - Thursday, November 4, 2021 - link
Plus, enabling AVX-512 on the big Cores would have meant having it on the E (Gracemont) cores also, or switching workloads from P to E cores on the fly won't "fly". And having AVX-512 in Gracemont would have interfered with the whole idea of Gracemonts being low-power and small footprint on the die. I actually find what Ian and Andrei did here quite interesting: if AVX-512 can really speed up whatever you want to do, disable the Gracemonts and run AL in Cove only. If that could be a supported option with a quick restart, it might be worthwhile under the right circumstances.
AntonErtl - Friday, November 5, 2021 - link
There is no relevant AVX-512 state before the first AVX-512 instruction is executed. So trapping and switching to a P-core is entirely doable. Switching back would probably be a bigger problem, but one probably does not want to do that anyway.
Spunjji - Friday, November 5, 2021 - link
Possible problem: how would you account for a scenario where the gain from AVX-512 is smaller than the gain from running additional threads on E cores? Especially when some processors have a greater proportion of E cores to P cores than others. That could get quite complicated.
TeXWiller - Friday, November 5, 2021 - link
If you look at Intel's prerelease presentation about Thread Director carefully, you see they are indeed talking about moving the integer (likely control) sections of AVX threads to E-cores and back as needed.
kobblestown - Friday, November 5, 2021 - link
I'll reply to my comment because it seems the original one was not understood.
When you have an AVX512-using thread on a P core, it might happen that it needs to be suspended, say, because the CPU is overloaded. Then the whole CPU state is saved to memory so the execution can later be resumed as if nothing has happened. In particular, it may be rescheduled on another core when it's time for it to run again. If that new core is a P core, then we're safe. But if it's an E core, it might happen that we hit an AVX512 instruction. Obviously, the core cannot execute it so it traps into the OS. The OS can check what was the offending instruction and determine that the problem is not the instruction, but the core. So it moves it back to a P core, stores a flag that this thread should not be rescheduled on an E-core and keeps chugging.
Now, someone suggested that there might be a problem with the CPU state. And, indeed, you can not restore the AVX512 part of the state on an E core. But it cannot get changed by an E core either, because at the first attempt to do it it will trap. So the AVX512 part of the state that was saved on a P core is still correct.
Since this isn't being done, there might be (but not "must be" - intel, like AMD, will only do what is good for them, not what is good for us) some problem. One being that an AVX512 thread will never be rescheduled on an E core even if it executes a single AVX512 instruction. But it's still better than the current situation which postpones the wider adoption of AVX512 yet again. I mean, the transistors are already there!
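The trap-and-migrate scheme described in this comment can be modelled with a toy simulation. This is only a sketch of the idea as stated here, not how any real OS scheduler works: the `Thread` class, the `run` function, and the one-instruction-per-timeslice model are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    ops: list             # instruction stream, e.g. ["int", "avx512"]
    pc: int = 0           # index of the next instruction to execute
    p_only: bool = False  # flag the OS sets after the first trap
    traps: int = 0        # how many times the thread faulted on an E-core


def run(thread: Thread, proposed_cores: list) -> list:
    """Execute one instruction per timeslice; return (op, core) pairs."""
    log = []
    for proposed in proposed_cores:
        if thread.pc >= len(thread.ops):
            break
        # A flagged thread is never placed on an E-core again.
        core = "P" if thread.p_only else proposed
        op = thread.ops[thread.pc]
        if op == "avx512" and core == "E":
            # The E-core cannot execute the instruction: trap into the OS,
            # flag the thread, and retry the same instruction on a P-core.
            thread.traps += 1
            thread.p_only = True
            core = "P"
        log.append((op, core))
        thread.pc += 1
    return log


t = Thread(ops=["int", "avx512", "int", "avx512"])
trace = run(t, ["E", "E", "E", "E"])
print(trace, t.traps)
```

In this model the thread traps exactly once, and afterwards even its plain integer work stays on a P-core, which is the drawback the comment acknowledges: a single AVX512 instruction permanently excludes the thread from the E-cores.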
factual - Thursday, November 4, 2021 - link
Great win for consumers! AMD will need to cut prices dramatically to be competitive otherwise Intel will dominate until Zen4 comes out!
kobblestown - Friday, November 5, 2021 - link
Let's first see Zen3D early next year. It will let me keep my investment into the AM4 platform yet offer top notch performance.
Spunjji - Friday, November 5, 2021 - link
"AMD will need to cut prices dramatically"
Not until Intel's platform costs drop. Nobody's buying an ADL CPU by itself.
Maxiking - Thursday, November 4, 2021 - link
Let's be real, no one expected anything else, the first time Intel can use a different node that still has its problems and AMD is embarrassed and slower again.
Lol, once Intel starts making their GPUs using TSMC, AMD back to being slowest there too.
LOL what a pathetic company
factual - Thursday, November 4, 2021 - link
Why disparage AMD !? the harder these companies compete, the better for consumers! Stop being a mindless fanboi of any company and start thinking rationally and become a fan of your own pocketbook !
Fulljack - Thursday, November 4, 2021 - link
so, get this, you're fine with Intel stuck giving you max 4c/8t for nearly 6 years? really?
Maxiking - Thursday, November 4, 2021 - link
You must be from the certain sub called AMD.
I bought a 6 six for 390 USD in 2014, a 5820k.
Cry more. LOL
Maxiking - Thursday, November 4, 2021 - link
*a 6 core
TheinsanegamerN - Thursday, November 4, 2021 - link
Don't forget their 6 core 980x in 2009.
"But but but it wasn't a CONSUMER (read: cheap enough) CPU!!!!!!"
- THE AMDrones who forget that AMD sold a 6 core phenom that lost game benchmarks to a dual core i3 and then spent the next 5 years selling fake octo core chips.
Spunjji - Friday, November 5, 2021 - link
"But but but it wasn't a CONSUMER (read: cheap enough) CPU!!!!!!"
Correct, there's a difference between a CPU that needs a consumer-level dual-channel memory platform and a workstation-grade triple-channel (or greater) platform. It doesn't sound so absurd if you don't strawman it with emotive nonsense and/or pretending that Used prices are the same as New.
I don't think anybody's forgotten how much better Intel's CPUs were from Core 2 all the way through to Skylake. The fact remains that when AMD returned to competition, they did so by opening up entirely new performance categories in desktop and notebook chips that Intel had previously restricted to HEDT hardware. I don't really understand why Intel fanbots like Maxiking feel so pressed about it, but they do, and it's kind of amusing to see them externalise it.
Qasar - Friday, November 5, 2021 - link
too bad that was hedt, not mainstream, thats where your beloved intel stagnated the cpu market, maxipadking, maybe you should cry more. intels non hedt lineup was so confusing i gave up trying to decide which to get, so i just grabbed a 5830k and an asus x99 deluxe, and was done with it.
lmcd - Friday, November 5, 2021 - link
The current street prices of $600+ for high-end desktop CPUs are comparable to HEDT prices. Let's face it, the HEDT market is underserved right now as a cost-saving measure (make them spend more for bandwidth) and not because the segmentation was bad.
In sum, my opinion is that ignoring the HEDT line doesn't make much sense because segmenting the market was a good thing for consumers. Most people didn't end up needing the advantages that the platform provided before that era got excised with Windows 11 (unrelated rant not included). That provided a cost savings.
Qasar - Friday, November 5, 2021 - link
" The current street prices of $600+ for high-end desktop CPUs are comparable to HEDT prices " i didnt say current, as you saw, i mentioned X99, which was what 2014, that came out ?
" Let's face it, the HEDT market is underserved right now as a cost-saving measure " more like intel cant compete in that space due to ( possibly ) not being able to make a chip that big with acceptable yields so they just havent bothered. maybe that is also why their desktop cpus maxed out at 8 cores before gen 12, as they couldn't make anything bigger.
" doesn't make much sense because segmenting the market was a good thing for consumers " oh but it is bad, intel has pretty much always had way too many cpus for sale, i spent i think 2 days trying to decide which cpu to get back then, and it gave me a headache. i would have been fine with z97 i think it was with the I/O and PCIe lanes, but trying to figure out which cpu, is where the headache came from, the prices were so close between each that i gave up, spent a little more, and picked up x99 and a 5930k( i said 5830k above, but meant 5930k) and that system lasted me until early last year.
Spunjji - Friday, November 5, 2021 - link
"Cry more. LOL"
Who put 50p in the dickhead?
Seriously though, the thread's packed full of fanbots determined to exaggerate and posture.
Bagheera - Wednesday, November 10, 2021 - link
you must be the loser from wccftech named "Clown Sh*tter" hahahaha
opinali - Thursday, November 4, 2021 - link
What a pathetic attempt at trolling. Not sure if you noticed but Ryzen CPUs actually win lots of the game benchmarks, tie lots more; and many of the ADL wins are only with the very top CPU with DDR5. In several games even the 5800X beats ADL (even against DDR5). Zen3 is now a full year old, no v-cache yet, the next refresh which is coming soon will probably beat ADL across the board (still without DDR5). Granted, Intel still dominates anything that makes heavy use of AVX-512, which is... almost nothing, you can count'em on one hand's fingers.
Considering the current price of DDR5, even for a brand-new system where you have to buy everything including the RAM, a top-end ADL system is a pretty bad value right now. But thanks to this release the price of Zen3 CPUs is going further down, I can now find a 4900X for $480 on stockx, that's a good discount below MSRP (thanks Intel! since I've been waiting for that to upgrade from my 5600X). That's also the same street price I find today for the 12700K; the 12900K is through the roof, it's all out of stock in places like newegg, or $1.5K where I found stock although the KF is much less bad.
Also thanks to all the Intel fans that will burn cash in the first generation of DDR5 (overpriced and also with poor timings) so when Zen4 ships, 1y from today, DDR5 should be affordable and more mature, idem for PCIE5, so we Ryzen users can upgrade smoothly.
opinali - Thursday, November 4, 2021 - link
(I meant 5900X above, damn typo.)
DannyH246 - Thursday, November 4, 2021 - link
Don't waste your time responding, you can't account for abject stupidity. This is the absolute best CPU Intel could possibly build. Ok, it beats AMD by a couple percent in single threaded, but loses by a higher margin in multithreaded while consuming twice the power. Shortly, AMD will easily regain the performance crown with v-cache, while we wait for Zen 4. Sadly another poor review by www.IntelTech.com. Nobody wants a room heater for a CPU.
EnglishMike - Thursday, November 4, 2021 - link
Last I looked, the vast majority of Anandtech readers don't run long-lasting 100% CPU multithreaded workloads, which is the only scenario where this one CPU falls a long way behind in power consumption.
Competition is good, and Intel has a competitive CPU on its hands, after a long time (for them) without one, and the reviews reflect that fact.
Spunjji - Friday, November 5, 2021 - link
^ This.
mode_13h - Friday, November 5, 2021 - link
> the vast majority of Anandtech readers don't run
> long-lasting 100% CPU multithreaded workloads
How many of us are software developers? My project currently takes about 90 minutes to build on a Skylake i7, and the build is fully multithreaded. I'm looking forward to an upgrade!
Wrs - Thursday, November 4, 2021 - link
I'll point out that the Anand review uses JEDEC standard RAM timings. For DDR5 that's not terrible today, but for DDR4 it is. I mean, DDR4-3200 CL20?? A sister site (Toms) used commonplace DDR4 latencies (3200 CL14) and found it superior to DDR5 (using JEDEC B latencies) for gaming and most tasks, as well as putting ADL comfortably ahead of Zen3 in games. A further BIOS setting they made sure of was to allow ADL to sustain turbo power. Not sure how much that affected results. To be fair I did not hear them enabling PBO on Zen 3, which would be the comparable feature.
But for now I wouldn't be assuming that Ryzen CPUs win even the majority of games, and I absolutely wouldn't assume ADL needs DDR5 to reach its game potential. Most of these reviews out are preliminary, given a short window of time between product sample and NDA lifting.
Oxford Guy - Friday, November 5, 2021 - link
CL22 I think I read, not 20.
Regardless, it’s ridiculously high.
Farfolomew - Thursday, November 4, 2021 - link
You mention that you were able to enable AVX-512. Did you have to disable the E-Cores for that option to appear, or was that option available regardless of enabling/disabling the E-Cores? If it is the latter, was the system stable?
maroon1 - Thursday, November 4, 2021 - link
Why use windows 10 for gaming performance ??
Maxiking - Thursday, November 4, 2021 - link
The very same reason they use a 2080Ti.
They are incompetent and poor.
mkaibear - Thursday, November 4, 2021 - link
>Why use windows 10 for gaming performance ??
>The very same reason they use a 2080Ti.
Because, you pair of dinguses, they want to be able to compare apples-with-apples.
So they benchmark with a defined OS and peripherals to minimise the number of things which change from run to run which means you can directly compare using their benchmarking results CPUs from previous generations.
If you update the GPU as well then all you are doing is benchmarking different combos of GPUs and CPUs and you never end up with stuff which is directly comparable.
If you want simple gaming benchmarks there are any number of websites which will give you that. But for those of us who care about proper benchmarking and directly comparable results, Anandtech always delivers.
And far from being incompetent, doing benchmarking properly requires a lot of work over many hours and requires maintaining the suite and making sure everything keeps ticking along properly.
In short; naff off back to your little troll holes and stop complaining about things that you know nothing of, k?
Bagheera - Wednesday, November 10, 2021 - link
yep, Clown Sh*tter from wccftech confirmed hahahahaha
Dahak - Thursday, November 4, 2021 - link
In regards to the AVX situation, I think it would be best to assume that they will be fused off, as possibly some made it out with the ability to enable it hence the obfuscation in the uefi firmware unless already known
As a couple of things that might be interesting to look at / verify
1 - is it only review samples vs retail?
2 - is it on i9s vs i7s, assuming the i5's would not have it anyway
LibraEmbers - Thursday, November 4, 2021 - link
No 1440p and 720p low benchmarks this time around?
maroon1 - Thursday, November 4, 2021 - link
No windows 11 gaming benchmark
Ian Cutress - Thursday, November 4, 2021 - link
They're in Bench: www.anandtech.com/bench
iSeptimus - Thursday, November 4, 2021 - link
Intel wins! /while pulling close to 300w... but sshhhh.
Maxiking - Thursday, November 4, 2021 - link
Dude, my gpu consumes 450W. Why do you think we care? A 2600 non x owner spotted!
FreckledTrout - Thursday, November 4, 2021 - link
Both of you are being extreme. Power draw does matter but it isn't the only metric.
Bagheera - Wednesday, November 10, 2021 - link
Clown Sh*tter from wccftech spotted!
factual - Thursday, November 4, 2021 - link
Not that power consumption matters for a high-end desktop CPUs but Alder Lake platform is more efficient than Zen4 platform in most real world scenarios (read PC World's analysis). So, if you are worried about power consumption that much you should steer away from Ryzen!! I guess your only choice is paying the premium for M1 Max's closed system !
Honestly, fanbois should stop making ignorant comments about power consumption and "TDP" (lol) just to find an excuse to attack a good product because it's not from the company they mindlessly worship ... SMH!
factual - Thursday, November 4, 2021 - link
Meant to say Zen3 not Zen4!
Oxford Guy - Friday, November 5, 2021 - link
‘Not that power consumption matters for a high-end desktop CPUs’
Yes, tinnitus is ‘real in’.
Spunjji - Friday, November 5, 2021 - link
"Honestly, fanbois should stop making ignorant comments about power consumption and "TDP" (lol) just to find an excuse to attack a good product because it's not from the company they mindlessly worship ... SMH!"
This applies just as equally to everybody saying TDP doesn't matter as it does to people pretending that the full-load power of a 12900K is representative of all ADL chips under common loads.
lemurbutton - Thursday, November 4, 2021 - link
Who still cares about Intel vs AMD?
It's Intel + AMD vs Apple.
FreckledTrout - Thursday, November 4, 2021 - link
Only in laptops.
photovirus - Thursday, November 4, 2021 - link
If nitpicking, yes.
Still, M1 Pro and M1 Max compare quite nicely to desktop systems with these CPUs, especially in memory-intensive tasks.
So I wouldn't dismiss Apple's chips that easily.
EnglishMike - Thursday, November 4, 2021 - link
Apple's CPU performance and performance/watt are impressive, but it's going to take a lot more than that to make Intel/AMD start quaking in their boots, and that's not going to happen as long as Apple remains solely a vertical integrator of premium priced computers.
If anything, Apple's recent advances will only galvanize AMD and Intel's CPU designers now they can see what can be achieved, and how.
michael2k - Thursday, November 4, 2021 - link
As long as Apple monopolizes TSMC’s leading edge nodes it really doesn’t matter how much Intel tries until they can get I4 online.
Right now Intel can’t beat TSMC’s N5 or N5P process and AMD can’t afford either. On the flip side that means AMD can’t afford to design a better CPU because they’re also stuck on N7 and N7P.
Ppietra - Friday, November 5, 2021 - link
people are focusing too much on nodes! Apple’s node advantage over AMD isn’t that big in terms of what efficiency you get out of it. AMD is already using the N7+ node in some of its processors, and that puts it just around 10% behind the N5 node used by the M1 Max in performance per watt.
michael2k - Thursday, November 4, 2021 - link
For now. Apple has more desktops using the M1P/M1M incoming.
It's astounding to consider that a 60W part is competitive at all with a 300W part:
https://www.anandtech.com/show/17047/the-intel-12t...
vs
https://www.anandtech.com/show/17024/apple-m1-max-...
Go from 8p2e to 16p4e and power only goes up to 120W and the M1 scores could double, 106 SPECint2017_r and 162 SPECfp2017_r barring complexity due to memory bus overhead, CPU bus/fabric communication overhead, etc, since it's clear that the rate-n test performs far better when paired with DDR5 vs DDR4
Ppietra - Thursday, November 4, 2021 - link
Actually the M1 Max is a 43W part at CPU peak power, not 60W (60 was for the whole machine).
So when Apple doubles the cores it would be closer to 85W. 170W when using 4 times the cores, which will almost certainly happen.
That would mean that Apple could easily have more than double the performance at almost half the power consumption.
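The scaling estimate here is plain multiplication. A small sketch, assuming CPU power grows linearly with core count and taking the 43W peak figure quoted in this comment (real parts would add fabric and memory overhead, as noted earlier in the thread):

```python
M1_MAX_CPU_PEAK_W = 43  # peak CPU power quoted in the comment above


def scaled_power(base_w: float, core_multiplier: float) -> float:
    """Naive estimate: power scales linearly with the number of cores."""
    return base_w * core_multiplier


for multiplier in (2, 4):
    watts = scaled_power(M1_MAX_CPU_PEAK_W, multiplier)
    print(f"{multiplier}x cores -> about {watts} W")
```

Strict linear scaling gives 86 W and 172 W, which the comment rounds to roughly 85W and 170W.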
roknonce - Thursday, November 4, 2021 - link
It's not true that 12900k must use 300w, in fact, they can get over 90% performance with 150w. If you set voltage manually, you can get a P-core @ 3.2Ghz + E-core @2.4Ghz within 35w (Source: GeekerWan). Its Cinebench R23 score is ST1350, MT14k. What about M1 Max? ST 1500, Mt 12k. In addition, TSMC N5p is 30% better than 10nm ESF. Consider again if a 60W part is competitive at all with a 300W part?
roknonce - Friday, November 5, 2021 - link
Edit: It's 6*P-core @ 3.2Ghz + 8*E-core @2.4Ghz within 35w to roughly simulate a H35/H45 mobile chip.
Ppietra - Friday, November 5, 2021 - link
The thing with Cinebench is that it takes a lot of advantage from hyperthreading, which is good of course when you have it, something that the M1 doesn’t have.
The problem is, because of this and many other differences between CPUs, Cinebench is only a good benchmark to compare to the M1 in a small set of tasks. Not exactly a general definition of competition.
As for power consumption, consider that the M1 Max CPU has a peak power of 43W, while other high end Laptop CPUs have a typical peak power at around 75-80W, even if they say 45W TDP.
roknonce - Sunday, November 7, 2021 - link
I'm literally saying peak power during the test. 6*P-core @0.75v, not the BS TDP, my friend. I totally agree that Cinebench cannot tell everything. But consider the enormous gap between N5P and 10nm ESF. The result is reasonable and for intel fans, is good enough to call it inspiring.
charlesg - Thursday, November 4, 2021 - link
I think there's an error in the AVX2 Peak Power graph on the last page? I think one of the two 5900s listed is supposed to be 5950?
ballsystemlord - Thursday, November 4, 2021 - link
@Ian , there's a broken image on the page: "(2-6) AI Benchmark 0.1.2 Total"
https://www.anandtech.com/print/17047/the-intel-12...
eddman - Thursday, November 4, 2021 - link
Everyone's going on about the performance and arguing about the power consumption; meanwhile almost all I'm thinking about is how good Gracemont Pentiums and Celerons are going to be for affordable systems.
I'm actually interested to see 6 and 8 core models, but that probably won't happen.
Zucker2k - Thursday, November 4, 2021 - link
For the spec test on page 7 you write:
"For Alder Lake, we start off with a comparison of the Golden Cove cores, both in DDR5 as well as DDR4 variants. We’re pitting them as direct comparison against Rocket Lake’s Willow Cove cores, as well as AMD’s Zen3."
Shouldn't Willow Cove read as Cypress Cove instead?
Andrei Frumusanu - Thursday, November 4, 2021 - link
Yes, my bad.
charlesg - Thursday, November 4, 2021 - link
As for "why buy AMD?", it remains to be seen how well Intel can keep this in stock. Doesn't matter how good it is if you can't buy it!
Furthermore, if Intel does have stock issues, odds are the price will climb above MSRP...
Igor_Kavinski - Thursday, November 4, 2021 - link
"There was a thought that if Intel were to release a version of Alder Lake with P-cores only, or if a system had all the P-cores disabled, there might be an option to have AVX-512. Intel shot down that concept almost immediately, saying very succinctly that no Alder Lake CPU would support AVX-512."
Typo. Should be E-cores disabled.
NikosD - Friday, November 5, 2021 - link
Right...I saw that too.
eloyard - Thursday, November 4, 2021 - link
Sooo what's the TOTAL PLATFORM POWER DRAW "per performance" during a somewhat realistic benchmark of typical use cases?
What's the TOTAL PLATFORM PRICE (i.e. price of DDR4/5, mobo, appropriate cooler and appropriate PSU) "per performance"?
Both look very bad, if I were to be the judge.
Carmen00 - Thursday, November 4, 2021 - link
P-core/E-core scheduling is not an easy problem, and it has no currently-known general-purpose satisfactory solutions: see https://dl.acm.org/doi/abs/10.1145/3387110 . P/E "works" on phones and tablets because the issues are largely masked by having a single app open at a time. You can't do that in a desktop environment. Hitching the performance of your CPU to unproven scheduler algorithms is not a smart move by Intel. I can see why they've done it, but that doesn't excuse it. You can throw a lot of money at some problems, but that doesn't mean that you're going to get to a good solution. Some of the nastier ones in Computer Science have had billions poured into them and we're no closer to a solution.
My prediction is that in the absence of a very significant breakthrough, hybrid CPUs will continue to be dogged, for the foreseeable future, by weird difficult-to-reproduce performance/power glitches that no amount of firmware/OS updates are going to fix.
michael2k - Thursday, November 4, 2021 - link
Well, if Apple continues to succeed with their hybrid CPUs, it stands to reason that others (Microsoft, Intel, and AMD specifically) will at least model their approach on Apple's:
https://www.extremetech.com/computing/322917-cleve...
All operations with a QoS of 9 (background) were run exclusively on the four Efficiency (Icestorm) cores, even when that resulted in their being fully loaded and the Performance cores remaining idle. Operations with any higher QoS, from 17 to 33, were run on all eight cores.
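To make the quoted behaviour concrete, here is a toy sketch of that placement rule. It only restates the observation in the quote (QoS 9 goes to the four Icestorm cores, anything higher may use all eight); it is not Apple's scheduler, and the core labels and the function name `eligible_cores` are invented for illustration.

```python
# Toy model of the observed M1 placement rule quoted above. Not Apple's
# actual scheduler; core names and the helper are hypothetical.
E_CORES = ["E0", "E1", "E2", "E3"]   # four Efficiency (Icestorm) cores
P_CORES = ["P0", "P1", "P2", "P3"]   # four Performance cores
QOS_BACKGROUND = 9                    # lowest QoS level mentioned in the quote


def eligible_cores(qos):
    """Return the cores a task with the given QoS may be scheduled on."""
    if qos <= QOS_BACKGROUND:
        # Background work stays on the E-cores even if the P-cores are idle.
        return list(E_CORES)
    # Any higher QoS (17 to 33 in the quote) may run on all eight cores.
    return E_CORES + P_CORES


for qos in (9, 17, 33):
    print(qos, eligible_cores(qos))
```

The point of the rule is that the decision comes from a hint the application supplies, so the scheduler does not have to guess which threads are latency-sensitive.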
Foeketijn - Thursday, November 4, 2021 - link
So much energy is put into designing an efficient package. And the result is a package that churns out even more energy than the designing took.
Even the biggest air coolers will struggle to keep this cool enough under load. Don't even start about power price and carbon footprint. The worst part for Intel is that they can't use this design to gain back some territory in the HEDT and server market. AMD can double the power output for an Epyc or TR. That is not an option for Intel. Let's wait for the tock.
Wrs - Thursday, November 4, 2021 - link
How does Sapphire Rapids exist then?
Spunjji - Friday, November 5, 2021 - link
Different design!
kgardas - Thursday, November 4, 2021 - link
Great article! I especially appreciate the AVX-512 related bits. Honestly it would be great if Intel recovered from this chaos and enabled AVX-512 in their ADL-derived Xeons (E-xxxx).
Slash3 - Thursday, November 4, 2021 - link
This is almost certainly part of their plan.
Drumsticks - Thursday, November 4, 2021 - link
Doesn't the efficiency of the product just go out the window because Intel is effectively setting the sustained performance power limit at 240W? Obviously this has an impact for consumers, as it's the power draw they will see, but for the purposes of analyzing the architecture or the process node, it doesn't seem like a great way to draw conclusions. There was a test floating around pre-NDA where the PL1 was fixed to 125/150/180/240W, and it looked like the last +60% power draw only gained +8% performance.
To be sure, I'm sure Intel did it because they need that last 8% to beat the 5950X semi-consistently, and most desktop users won't worry about it too much. But if you want to analyze how efficient the P-core architecture + Intel 7 is, wouldn't it make sense to lower the sustained power limit? It's just like clocking anything else at the top of its DVFS curve - I'm sure if it was possible to set Zen 3 to a 240W power limit, we would find it wasn't nearly as efficient as Zen 3 at 141W.
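The "+60% power for +8% performance" point can be put as a quick perf-per-watt calculation. This is a sketch using only the rough figures from the comment above (150W to 240W, about 8% more performance); the helper names are made up, and the numbers are not measured results.

```python
# Rough perf/watt arithmetic for the PL1 figures quoted above.
# Helper names are hypothetical; inputs are the comment's approximate numbers.

def relative_increase(old, new):
    """Fractional increase going from old to new (0.6 means +60%)."""
    return new / old - 1


def relative_efficiency(perf_gain, power_gain):
    """Perf/watt at the higher limit, relative to the lower limit (1.0 = equal)."""
    return (1 + perf_gain) / (1 + power_gain)


power_gain = relative_increase(150, 240)       # PL1 raised from 150W to 240W
efficiency = relative_efficiency(0.08, power_gain)
print(f"power: +{power_gain:.0%}, perf/watt: {efficiency:.3f}x")
```

With those inputs, perf/watt at 240W comes out at roughly two thirds of what it is at 150W, which is the sense in which the top of the curve says little about the architecture's efficiency.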
abufrejoval - Thursday, November 4, 2021 - link
I have to agree with Charlie Demerjian: the sane choices for desktop parts would have been 10 P-cores on one chip and 40 E-cores on another for an edge server part.
Then again for gaming clearly another set of P-cores could have been had from the iGPU, which even at only 32EU is just wasted silicon on a gaming desktop. Intel evidently doesn't mind nearly as much as AMD doing physically different dies so why not use that? (Distinct server SKUs come to mind)
Someone at Intel was clearly desperate enough to aim for some new USP, and they even sacrificed AVX-512 (or just plain sanity) for that.
Good to see that Intel was able to reap additional instructions from every clock in a P-core (that's the true miracle!).
But since I am still running quite a few Skylake and even Haswell CPUs (in fact typing on a Haswell Xeon server doing double duty as always-on workstation), I am quite convinced that 4GHz Skylake is "good enough" for a huge amount of compute time and would really rather use all that useless E-core silicon to replace my army of Pentium J5005 Atoms, which are quite far from delivering that type of computing power on perhaps more of an electricity budget.
name99 - Friday, November 5, 2021 - link
I think most analyses (though not Charlie's) are missing Intel's primary concern here.
For THESE parts, Intel is not especially interested in energy saving. That may come with the laptop parts, but for these parts the issue is that Intel wants OPTIONALITY: they want a single part that does well on single threaded (and the real-world equivalent now of 2-3 threaded once you have UI and OS stuff on separate threads) code, AND on high throughput (ie extremely threaded) code.
In the past SMT has provided this optionality; now, ADL supposedly extends it -- if you can get 4*.5x throughput for a cluster vs 2*.7x throughput for SMT on a large core, you're area-ahead with a cluster -- but you still need large cores for large core jobs. That's the balance INTC is trying to straddle.
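A quick way to read the "4*.5x vs 2*.7x" comparison is as throughput per unit of die area. The sketch below assumes, as the comment implies, that a 4-core E cluster takes about the same area as one SMT-enabled P-core; the per-thread factors are the comment's illustrative numbers, not measurements, and `aggregate_throughput` is a made-up name.

```python
# Illustrative only: per-thread factors come from the comment above, and the
# equal-area assumption (one E cluster ~ one P-core) is exactly that.

def aggregate_throughput(threads, per_thread_speed):
    """Total throughput relative to one P-core running a single thread."""
    return threads * per_thread_speed


e_cluster = aggregate_throughput(4, 0.5)   # four E-cores at 0.5x each
p_core_smt = aggregate_throughput(2, 0.7)  # two SMT threads at 0.7x each

print(f"E cluster: {e_cluster:.1f}x, P-core with SMT: {p_core_smt:.1f}x")
print(f"cluster advantage in the same area: {e_cluster / p_core_smt:.2f}x")
```

The cluster wins on throughput per area, but a single thread on it runs at half speed, which is why the large cores still have to be there for latency-bound work.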
Now compare with the other two strategies.
- For AMD, with chiplets and a high-yielding process, performance/unit area has not been a pressing concern. They're able to deliver a large number of P cores, giving you either latency (use a few P cores) or throughput (use a lot of P cores). This works fine for server or desktop, but
+ it's not a *great* server solution. It uses too much area for server problems that are serious throughput problems (ie the area that ARM even in its weak Graviton2 and Ampere versions covers so well)
+ it does nothing for laptops. Presumably Intel's solution for laptops is not to drive the E-cores nearly as hard, which is not a great solution (compare to Apple's energy or power numbers) but better than nothing -- and certainly better than AMD.
- For Apple there is the obvious point that their P cores are great as latency cores (and "low enough" area if they feel it appropriate to add more) and their E cores great as energy cores. More interesting is the throughput question. Here I think, in true Apple fashion, they have zigged where others have zagged. Apple's solution to throughput is to go all-in on GPGPU!
For their use cases this is not at all a crazy idea. There's a reason GPGPU was considered the hot new thing a while ago, and of course nV are trying equally hard to push this idea. GPGPU works very well for most image-type stuff, for most AI/ML stuff, even for a reasonable fraction of physics/engineering problems. If you have decent tools (and Apple's combination of XCode and Metal Compute seems to be decent enough -- if you don't go in determined to hate them because they're not what you're used to whether that's CUDA or OpenCL) then GPGPU works for a lot of code.
What GPGPU *doesn't work for* is server-type throughput; no-one's saying "I really need to port HHVM, or nginx, to a GPU and then, man, things will sing". But of course why should Apple care? They're not selling into that market (yet...)
So ultimately
- Intel are doing this because it gives them
+ better optionality on the desktop
+ lower (not great, but lower) energy usage in the laptop
+ MANUFACTURING optionality at the server. They can announce essentially a XEON-P line with only P cores, and a Xeon-E line with ?4? cores (to run the OS) and 4x as many E cores as the same-sized Xeon-P, and compete a little better with the ARM servers in the extreme throughput space.
- AMD are behind conceptually. They have their one hammer of a chiplet with 16 cores on it, and it was a good hammer for a few years. But either they missed the value of also owning a small-area core, or just didn't have the resources. Either way they're behind right now.
- Apple remain in the best position -- both for their needs, but also to grow in any direction. They can keep going downward and sideways (watch, phone, glasses, airpods...) with E cores. They can maintain their current strengths with P cores. They can make (either for specialized macs, or for their own internal use) the equivalent of an M1 Max only with a sea of E cores rather than a sea of GPU that would be a very nice target for a lot of server work. And they likely have in mind, long-term, essentially democratizing CUDA, moving massive GPGPU off crazy expensive nV rigs existing at the department level down to something usable by any individual -- basically a replay of the VAX to SparcStation sort of transition.
I think INTC are kinda, in their massive dinosaur way, aware of this hence all the talking up of Xe, PV and OneAPI; but they just will not be able to execute rapidly enough. They have the advantage that Apple started years behind, but Apple moves at 3x their speed.
nV are well aware of the issue, but without control of a CPU and a SoC, what can they do? They can keep trying to make CUDA and their GPU's better, but they're crippled by the incompetence of the CPU vendors.
AMD? Oh poor AMD. Always in this horrible place where they do what they can do pretty well -- but simply do not have the resources to grow along all the directions they need to grow simultaneously. Just like their E-core response will be later than ideal (and probably a sub-optimal scramble), so too their response to this brave new world of extreme-throughput via extreme GPGPU...
NikosD - Friday, November 5, 2021 - link
According to the latest rumors of MLID (that most of the times have been proved true) AMD's reply to E-cores is Zen 4D (D for Dense, not dimensions).
Zen 4D core is a stripped-down Zen 4 core, with less cache and lower IPC, but smaller die area and less power consumption, leading to a 16 core CCD.
Also, Zen 4 core is expected to have a lot higher IPC than Alder Lake and Zen 3/3D, so it seems more than capable to compete with Raptor Lake (next Intel's architecture) this time next year.
AMD is not going to lose performance crown in the next few years.
Spunjji - Friday, November 5, 2021 - link
AMD don't *need* E cores, though? You said "it does nothing for laptops", but they're doing fine there - Zen 3 with 8 cores at 15W gives a great balance of ST and MT performance that Intel can't really touch (yet), and the die area is pretty good too. Intel need E cores to compete, AMD don't (yet).
Zen 4D is reportedly going to be their answer to "E" cores, and it's probably going to cause fewer issues than having fully heterogeneous cores.
abufrejoval - Friday, November 5, 2021 - link
First of all, thanks for the exhaustive answer!
I get the E/P core story in general, especially where a battery or “the edge” constrain power envelopes. I don’t quite get the benefit of a top-of-the-line desktop SoC serving as a demonstration piece for the concept.
What Intel seems to be aiming for is a modular approach where you can have “lots” or “apartments” of die area dedicated to GPU, P-cores and E-cores. Judging from the ADL die shots very roughly you can get 4 E-cores or 32 iGPU EUs from the silicon real-estate of a P-core (or 16EU + media engine). And with Intel dies using square process yields to go rectangular, you just get a variable number of lots per die variant.
Now, unfortunately, these lots can’t be reconfigured (like an FPGA) between these usage types, so you have to order and pick the combination that suits your use case and for the favorite Anandtech gamer desktop that comes out as 12 P-cores, zilch E-core, iGPU or media engine. I could see myself buying that, if I didn’t already have a 5950x sitting in that slot. I can also very much see myself buying a 2 P-core + 40 E-core variant at top bin for the same €600, using the very same silicon real-estate as the ADL i9.
Intel could (and should) enable this type of modularity in manufacturing. With a bit of tuning to their fabs they should be able to mix lot allocations across the individual dies in a wafer. The wafer as such obviously needs to be set up for a specific lot size, but what you use to fill the lots is a choice of masks.
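The "lots" idea above can be checked with some quick area bookkeeping. The sketch below uses the eyeballed ratios from this comment (one lot = one P-core = 4 E-cores = 32 EUs, with "16EU + media engine" also filling one lot, so the media engine is taken as half a lot). These are rough die-shot estimates, not Intel figures, and all names in the code are made up.

```python
# Area bookkeeping for the "lots" idea, in units of one P-core's die area.
# Ratios are the rough die-shot estimates from the comment above.
LOT_COST = {
    "p_core": 1.0,
    "e_core": 1.0 / 4,      # 4 E-cores fit in one P-core lot
    "igpu_eu": 1.0 / 32,    # 32 EUs fit in one P-core lot
    "media_engine": 0.5,    # implied by "16EU + media engine" = one lot
}


def lots_used(config):
    """Total die area, in P-core lots, for a dict of unit counts."""
    return sum(LOT_COST[unit] * count for unit, count in config.items())


CONFIGS = {
    "ADL i9 as shipped": {"p_core": 8, "e_core": 8, "igpu_eu": 32, "media_engine": 1},
    "gamer: 12 P-cores only": {"p_core": 12},
    "throughput: 2 P + 40 E": {"p_core": 2, "e_core": 40},
}

for name, config in CONFIGS.items():
    print(f"{name}: {lots_used(config)} lots")
```

Under these assumptions both hypothetical variants come to 12 lots, about the same as the shipping i9 layout (11.5 by this estimate), which is the sense in which they would use "the very same silicon real-estate".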
You then go into Apple’s silicon and I see Apple trying their best to fill a specific niche with three major options, the M1, the M1Pro and the M1Max. None of these cater to gamers or cloud service providers. When it comes to HPC, max ML training performance or most efficient ML inference, their design neither targets nor serves those workloads. Apple targets media people working on the move. For that audience I can see them offer an optimal fit, that no x86-dGPU combo can match (on battery). I admire their smart choice of using 8 32-bit DRAM channels to obtain 400GB/s of bandwidth on DDR4 economy in the M1Max, but it’s not generalizable across their niche workloads. Their ARM core seems an amazing piece of kit, but when the very people who designed it thought to branch out into edge servers, they were nabbed to do better mobile cores instead… The RISC-V people will tell you about the usefulness of a general-purpose CPU focus.
I’ve been waiting for an Excel refactoring ever since Kaveri introduced heterogeneous computing, with function-level GPU/CPU paradigm changes and pointer compatibility. I stopped holding my breath for this end-user GPGPU processing.
On the machine learning side, neuromorphic hardware and wafer-scale whatnot will dominate training, for inference dedicated IP “lots” also seem the better solution. I simply don’t see Apple having any answer or relevance outside their niche, especially since they won’t sell their silicon to the open market.
I’m pretty sure you could build a nice gaming console from M1Max silicon, if they offered it at X-Box prices, but GPT-4++ won’t be trained on Apple silicon and inference needs to run on mobile, 5 Watts max and only for milliseconds.
lejeczek - Thursday, November 4, 2021 - link
Well..I mean... it's almost winter so... having problems with central heating? or thought: I'd warm up the garage? There is your answer (has been for a while) ... all you need to do - just give it a full load.
abufrejoval - Saturday, November 6, 2021 - link
Just decided to let Microsoft's latest Flight Simulator update itself overnight. I mean it's a Steam title, right? But it won't use Steam mechanisms for the updates, which would obviously make it much faster and more efficient, so instead you have to run the updates in-game and they just a) linearize everything on a single core, b) thus take forever even if you have a fast line, c) leave the GPU running full throttle on the main menu, eating 220 Watts on my 2080 Ti for a static display... (if there ever has been an overhyped and underperforming title in the last decade or two, that one gets my nomination every time I want to fly instead of updating).
Small wonder I woke up with a dry throat this morning and the machine felt like a comfy coal oven from way back then, when there was no such thing as central heating and that's how you got through winter (no spring chicken, me).
abufrejoval - Thursday, November 4, 2021 - link
Oh no! Set carry flag and clear carry flag are gone! How will we manage without them?
Having to deal with the loss of my favorite instruction (AAA, which always stood out first) was already hard to deal with, but there it was easier to accept that BCD (~CoBoL style large number arithmetic) wasn't going to remain popular on a 64-bit CPU with the ability to do some type of integer arithmetic with 512-bit vectors.
But this looks like the flag register, a dear foe of RISC types, who loved killing it, is under attack!
It's heartening to see that not all CISC is lost, since they are improving the x86 gene pool by improving wonderful things like REP cmpsb and REP scasb, which I remember hand-coding via ASM statements in Turbo-C for my 80286: Ah, the things microcode could do!
It even inspired this kid from Finland to think of writing a Unix like kernel, because a task switch could be coded by doing a jump on a task state segment... He soon learned that while the instruction was short on paper, the three page variant from BSD and L4 guys was much faster than Intel's microcode!
He learned to accept others can actually code better than him and that lesson may have been fundamental to the fate of everything Linux has effected.
cyrusfox - Thursday, November 4, 2021 - link
Great initial review! So many good nuggets in here. Love that AVX512 bit as well as breaking down the cores individually. Alder Lake mobile is going to be awesome! DDR5 upgrade sure is tempting, showing tangible benefit, most applications neutral, something to chew on while I figure out if I am going to get an i5 or i9 to replace my i9-10850K.
Request:
Will anyone please do a review of the Rocket Lake vs Alder Lake iGPU? I know it is an "identical design" (Intel Xe 32EU) but on a different node: 14nm vs 10nm/7 (still not a fan of the rename...)
UHD 770 vs UHD 750 showdown! Biggest benefit to Intel right now during these still inflated GPU times.
abufrejoval - Thursday, November 4, 2021 - link
Eventually someone might actually do that. But don't expect any real difference, nor attractive performance. There have been extensive tests on the 96EU variant in Tiger Lake and 32EU are going to be 1/3 of that rather modest gaming performance.
The power allocation to the iGPUs tends to be "prioritized", but very flat after that. So, if both iGPU and CPU performance are requested, the iGPU tends to win, but then very quickly stops benefiting after it has received its full share, while the CPU cores fight for everything left in the TDP pie.
That's why iGPU performance say on a Tiger Lake H 45Watt SoC won't be noticeably better than on a 15Watt Tiger Lake U.
The eeny-meeny 32EU Xe will double your UHD 650-750 performance and reach perhaps the equivalent of an Iris 550-650 with 48EU and eDRAM, but that's it.
I'd have put another two P-cores in that slice of silicon...
Roddybabes - Thursday, November 4, 2021 - link
Not too impressive for a next gen CPU on a supposedly better process node and a newly designed architecture. TechSpot's ten game average for 1080p high quality only put the i9-12900K ahead of the R9 5950X by just 2.6% (7.4% for 1% lows). If AMD's claim of V-cache adding 15% to average gaming results is true, that would give AMD an average lead of 12.4% (7.6% for 1% lows) for the same year-old design with V-cache and still using last generation DDR4 memory - now that is what I would call impressive.
If you have to have top-line FPS and 1% lows, it seems that it would be prudent to just wait a little while longer for what AMD is currently cooking in its ovens.
https://www.techspot.com/review/2351-intel-core-i9...
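The 12.4% and 7.6% figures above come from subtracting percentages. Treating the numbers as ratios gives a slightly smaller lead; a quick sketch follows, assuming (as the comment does) that the claimed 15% V-cache uplift applies equally to the averages and the 1% lows of the current 5950X. The function names are made up for illustration.

```python
# Compares the subtraction used in the comment above with a ratio-based
# estimate. Assumes the claimed 15% uplift applies to the current 5950X
# result for both averages and 1% lows. Function names are hypothetical.

def lead_by_subtraction(intel_lead, amd_uplift):
    """Simple difference of percentages, as done in the comment."""
    return amd_uplift - intel_lead


def lead_by_ratio(intel_lead, amd_uplift):
    """AMD's lead when both figures are treated as multipliers on the 5950X."""
    return (1 + amd_uplift) / (1 + intel_lead) - 1


for label, intel_lead in (("average FPS", 0.026), ("1% lows", 0.074)):
    print(
        f"{label}: subtraction {lead_by_subtraction(intel_lead, 0.15):.1%}, "
        f"ratio {lead_by_ratio(intel_lead, 0.15):.1%}"
    )
```

The ratio form gives roughly 12.1% and 7.1%, so the conclusion does not change, only the last decimal.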
kwohlt - Friday, November 5, 2021 - link
"supposedly better process node"
I mean, considering the 12900K is nearly doubling the 11900K's multicore performance while consuming 20% less power, I'd say Intel 7 is certainly a large perf/watt improvement over Intel 14nm.
But also, efficiency comes from more than just process node. Core architecture plays a role.
Roddybabes - Friday, November 5, 2021 - link
I agree that the performance increase over the 11900K is great but the real competition is not Intel's previous generation but AMD's current offerings. eddmann said in an earlier post about i9-12900K's PL1 performance: You can have ~92% of the performance for ~68% of the power - see link below.
https://twitter.com/TweakPC/status/145596164928275...
In essence, for the benchmark in question, if Intel had set the max PL1 limit to just 150W, that processor would have amazing power efficiency. But guess what, Intel needed that 8% of additional performance to give the i9 daylight in the benchmarks, so the remaining ~32% of the power budget was needed. It would be interesting to see if Intel decides to offer E variants of its top-line CPUs as they would prove to be super efficient.
From the Alder Lake CPUs launched, it would seem that the i5-12600K hits AMD the hardest; it will be interesting to see how AMD responds.
Mr Perfect - Thursday, November 4, 2021 - link
You know, I didn't even notice your testbeds all still had SATA drives in them. I just assumed they'd be using whatever was contemporary for the system. This does make me wonder how often a modern benchmark has results that aren't quite what they'll be for users who actually use them. Guess we'll find out in Q2!
ricebunny - Thursday, November 4, 2021 - link
Thank you for the in-depth review. If I may give a suggestion: investigate in more depth the average power consumption (or energy consumption) in some common tasks, like gaming, etc. There are reviews out there that found that the i9-12900 is significantly more frugal than a Ryzen 9 5950X, exactly the opposite of the impression one would get by reading this review, which focuses on the vastly less insightful peak power metric (you only need this figure to gauge which PSU to buy).
MikadoWu - Thursday, November 4, 2021 - link
My big question is this... Will waiting 1 more year for the 13th Gen bring anything else I need? Presently I am sitting on an i7-3770, and I can do everything on it I want to, other than Windows. Sadly, for my job I do need to upgrade to 11. I can run the beta for another year, or just dip into this unit. Thoughts?
lynguist - Thursday, November 4, 2021 - link
If all you need is Windows 11 I would just go and buy something like an Intel 11400 or whichever Zen 2 or Zen 3 CPU is available for a competitive price.
Intel 13th Gen will probably iron out all the irks: maybe change AVX512 to be supported by default; probably DDR5-4800 with 4 memory banks instead of 2.
And 14th Gen will probably change to a new manufacturing node.
Zizy - Thursday, November 4, 2021 - link
Great review, thank you very much! If you can do it, I would really like to see those P and E cores in a performance/joule graph together with data points from other relevant CPU cores (say also M1 Max).
Sam D - Thursday, November 4, 2021 - link
Quick correction on your second page:
"There was a thought that if Intel were to release a version of Alder Lake with P-cores only, or if a system had all the P-cores disabled, there might be an option to have AVX-512."
I guess the E-cores should be disabled for AVX-512 to function?
mode_13h - Friday, November 5, 2021 - link
Yes, I saw that too. Obviously, a typo. They meant the E-cores being disabled.
abufrejoval - Thursday, November 4, 2021 - link
P vs. E, iGPU vs. cores, there is only one conclusion: I want choice!
While the ability to press more instructions out of a clock and higher clocks out of a slab of silicon is impressive, nothing shows the cost of those gains as clearly as the performance of the E-cores relative to their silicon real estate.
A 25-50% performance disadvantage at 25% transistor budget just shows that it's time to refactor your code, turn loops into threads. That won't ever change again, except perhaps with quantum code, so better get started last millennium.
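Put as numbers, the trade-off above is about performance per transistor. The sketch below takes the comment's rough figures at face value (an E-core delivering 50-75% of a P-core's performance on 25% of the transistor budget); they are estimates, not measurements, and `perf_per_transistor` is an invented name.

```python
# Performance per transistor relative to a P-core, using the rough figures
# from the comment above. Hypothetical helper, illustrative inputs.

def perf_per_transistor(relative_perf, relative_transistors):
    """Throughput per transistor, normalised so that a P-core scores 1.0."""
    return relative_perf / relative_transistors


E_CORE_TRANSISTOR_SHARE = 0.25
for disadvantage in (0.25, 0.50):
    relative_perf = 1 - disadvantage
    density = perf_per_transistor(relative_perf, E_CORE_TRANSISTOR_SHARE)
    print(f"{disadvantage:.0%} slower per core -> {density:.1f}x perf per transistor")
```

So code that scales across threads gets two to three times the work out of the same silicon on E-cores, which is the argument for threading the loops.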
I'd just want to be able to order a bespoke chip, where I can allocate silicon slices and TDP to E-cores, P-cores and iGPU EUs as per use-case on a given die, paying for the number of transistors and a binning premium, not some Intel market segmentation tax.
Intel doesn't seem to have issues manufacturing the various chip sizes and in fact, switching tile allocations for a die even on a single wafer shouldn't be an issue, once you've upgraded the software: The die size wouldn't change, only the allocation to P/E/GPU which means different masks, but those I guess need to be managed separately to account for mask defects/cleaning anyway.
lmcd - Thursday, November 4, 2021 - link
Masks are expensive, what on earth are you talking about?
mode_13h - Friday, November 5, 2021 - link
> it's time to refactor your code, turn loops into threads.
We still need lower-overhead hardware support, for this to be viable. Also, the OS should be aware of thread groups that should be scheduled in tandem. Otherwise, your thread communication overhead is going to kill any benefit you gain by this, other than for the sorts of heavy computations that people are already multi-threading.
Silver5urfer - Thursday, November 4, 2021 - link
A very fine-tuned review and a good one, but I think you guys should have compared all the Intel CPUs from SKL to ADL in the SPEC scores and the ST performance, and added the E-cores to the mix. That would have been a solid opportunity to see how Intel really scaled over all these years.
Also, your gaming suite got a small facelift with the 2080 Ti, which shows how the CPUs scale. Looking at the performance and gaming, keeping aside the synthetics, the 12th gen is an improvement but it's not a revolutionary design that changes a lot. It does in terms of the Z690 chipset and the DDR5 era kickstart, but not in terms of relative performance vs Intel 10th gen and Ryzen 5000 series processors. IPC might be double digit but the overall picture is not double digit to me. It's a single digit in gaming; maybe in productivity it's a bit higher over 10th gen, but not a massive, substantial amount.
The worst is the nightmare of the Win10/11 scheduler BS, and AMD processors got massive L3 tanking problems. And this is a bad showcase for AMD to be honest. Also Win11 is anti PC, anti DIY, anti everything: it has VBS issues, L3 cache AMD problems, Secure Boot and TPM soft locks, and the whole UX is centered around mobile rather than power user desktop compute. Better stick with Windows 10, but unfortunately we do not know how ADL will keep up on that OS if Intel issues any microcode updates to the Intel Thread Director.
Then the HW adoption rate: people buying these are going to be guinea pigs. I would avoid them personally and go with 10th gen (Ryzen has WHEA, USB and other bugs, so I'm not going there; if B2 and 3D fix them, great, else nope), wait 2-3 years for DDR5 and PCIe 5.0 to become relevant, and then buy another rig in 2024-2025 that will net a solid boost in I/O, rather than this first-generation product whose future we do not know. Intel releases every CPU generation with two chipset options if we follow the Tick-Tock pattern, where the Tock also gets a new chipset; this has been the case for a long time, and we do not know what Raptor Lake will do on DDR4. Many say DDR5 is not worth it, but if you get DDR4 now, Z790, which will inevitably launch because Intel loves money, will make that a huge dead end.
Finally the temperatures and power: why do I not see the temperature graphs anywhere? Did you guys miss that or what? The temperature is very HOT for the 12900K; almost all reviewers recommend a 360mm AIO cooler minimum, and the OC ceiling is gone as these run pretty hot even with that MSI 360mm AIO (ofc gaming is low, it has been like that since 7th gen). Looking at your power consumption piece, the P cores and E cores share the same power plane, which is what I wanted to see, and also the cache due to the new ring bus. So that also confirms some OCN coverage: these E cores bog down the P cores' clock speeds. But running without them = lost performance.
Overall a very complex and messy design. I feel Intel did it because of the BGA trashware market, where a ton of money is. The ST performance, despite the huge IMC downgrade and latency, is fast for gaming and at parity with AMD's full-fat Zen 3, and AMD needs time for Zen 4 on TSMC N5 EUV, plus even more time for their BGA Zen 4. Intel also doesn't want to add more heat since 10nm ESF is already capped at this core design; 48W is the max for those small cores, so there's no way they can fit a 2C4T Golden Cove clocked at 5GHz on this node. And they are going to stick with this for at least 2-3 generations, I guess. Sad. HEDT will not have this kind of rubbish at least.
Silver5urfer - Friday, November 5, 2021 - link
I forgot the most important thing. After going through everything: Intel sabotaged LGA1200 on purpose. Looking at the Cinebench R20 score of the 12900K vs the 10900K, it's clear cut that 2.5K lead of the 10K marks is coming from the SKL cores you showed. And the P cores are 8K, up from 6K of 10th gen 10900K. They sandbagged the socket with 11th gen on 14nm+++ instead of 10nm and gave a hot, power hungry, poor IMC, poor SMT processor. Now they show this as a massive boost because it works wonders in charts looking at the big bars of the 12th gen over 10th gen.
Intel is real damn scum. Now they will milk the DDR5, OEM deals and all with those PCIe5.0 and etc BS BGA trash since ST performance is so high they will easily get all those laptops and use and throw machine sales. And nobody knows how long LGA1700 will even last. Maybe up to that successor of Raptor Lake. But going DDR4 is going to bite the people in nuts once Raptor Lake launches, I bet they will launch Z790 and only DDR5 and more DMI or something.
I hope AMD Zen 4 wages a storm on these pesky BS Small cores and give a full powerful big beast with longevity on AM5 LGA. And ofc they will fix the WHEA and USB things because they know now with experience.
GeoffreyA - Friday, November 5, 2021 - link
Certainly, judging from AM4, their next socket will have a long life.
idimitro - Thursday, November 4, 2021 - link
The best coverage of the launch. Really in depth and with details nobody else mentioned. Great work guys!!!
dotes12 - Thursday, November 4, 2021 - link
I'm not too optimistic that W11 was actually designed to handle optimized/efficiency cores, because W11 was designed before Intel released that beast. Unfortunately that probably means that Windows will continue to be an "every other release is good" OS. W11.1 or W12 (whatever they call it) will be the best continuance of XP -> 7 -> 10 -> 11.1/12.
mode_13h - Friday, November 5, 2021 - link
> W11 was designed before Intel released that beast.
The companies do collaborate, extensively. During their Architecture Day, there was discussion of this. And Windows has a much longer history of hybrid CPU scheduling, since they also support ARM CPUs like Qualcomm's 8cx and even a previous hybrid CPU from Intel (Lakefield).
Also, Windows is not a static target. MS is continually releasing updates, some of which are sure to fine-tune scheduling even more.
dwade123 - Thursday, November 4, 2021 - link
Intel inside. AMD outside.
m53 - Thursday, November 4, 2021 - link
[Intel 12th gen consumes less power in gaming across the board vs Ryzen 5000 series](https://www.reddit.com/r/intel/comments/qmw9fl/why... [Even the Multi threaded perf per watt is also better for 12900K compared to 5900X](https://twitter.com/capframex/status/1456244849477... It is only in specific cases, where the 12900K needs to beat the 5950X in multi-threaded loads, that it needs to crank up more power. But for typical users Intel is both the perf/watt and perf/dollar champion.
zodiacfml - Friday, November 5, 2021 - link
Though the review did not exceed my expectations, due to the lack of in-depth power consumption testing, the charts show me what is really going on. Intel has roughly matched AMD in CPU performance, then gained some more with DDR5. Some CPU and gaming benchmarks show that they are limited by memory performance. Now AMD's V-Cache makes even more sense, and a 15% average uplift in games looks more plausible.
AMD still has the edge (due to cost, cooling requirements, power consumption), while Intel gains more value if one is going to utilize the iGPU.
Alistair - Friday, November 5, 2021 - link
Personally I don't find all this impressive. Intel went from -5 percent in gaming to +5 percent in gaming. Then went from -10 percent in productivity to beating the 5900x but still losing to the 5950x. Power consumption is terrible. Honestly all AMD has to do is drop their 5600X from $400 to $300 CAD, their 5800X from $500 to $400, their 5900X from $650 CAD to $500, and the 5950x from $900 CAD to $700 and I wouldn't even consider Intel.
roknonce - Friday, November 5, 2021 - link
Power consumption is a curve, bro. Unlock PBO and the 5950X can also eat 300+W. OMG, "Power consumption is terrible." With the P-cores set to 4.4GHz, power consumption is less than 120W and the Cinebench score is 1730 single and 25000 multi, way better than the 5900X (1500/21000). I can give it 1000W; can the 5900X hit 2000/27000?
If you set the voltage offset manually, a 6 P-core @ 3.2GHz + E-core @ 2.4GHz config draws 35W. At that point, its Cinebench score is 1300 single and 14000 multi, which is comparable to a 30W M1 Max at 1550/12000. Not to mention that TSMC N5P is 30% better than Intel 10nm ESF.
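For anyone who wants to check the perf-per-watt arithmetic behind those figures, here is a minimal sketch. The scores and wattages are the ones quoted in the comment above (not independent measurements), "less than 120W" is treated as exactly 120W, and the `points_per_watt` helper is just an illustrative name.

```python
# Cinebench multi-thread score and power draw as quoted in the comment above.
# These are the commenter's numbers; 120 W is used as the upper bound he gives.
configs = {
    "12900K, P-cores @ 4.4 GHz": (25000, 120),
    "12900K, 6P @ 3.2 GHz + E @ 2.4 GHz": (14000, 35),
    "M1 Max": (12000, 30),
}


def points_per_watt(score, watts):
    """Benchmark points delivered per watt of power."""
    return score / watts


for name, (score, watts) in configs.items():
    print(f"{name}: {points_per_watt(score, watts):.0f} pts/W")
```

On these numbers the down-clocked 12900K and the M1 Max come out equal in points per watt, while the 4.4GHz setting gives up roughly half of that efficiency for its higher score, which is the "power consumption is a curve" point.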
Jasonovich - Friday, November 5, 2021 - link
Sometimes the science is not aligned with the practicality. The crux of it, really, is that you're unlikely to get by with a top-end air cooler on the 12900K and will need an expensive and elaborate AIO setup (maybe the best of Noctua / DeepCool will suffice?), whereas for the Ryzen 5000 series good-quality air cooling will do.
Alistair - Friday, November 5, 2021 - link
Did you actually read this review? PBO does not do what you said, and anyways it is already faster with PBO off. Then you just reduce the entire review to cinebench... maybe try reading more of it?
roknonce - Sunday, November 7, 2021 - link
Oh, man, honestly, already faster with PBO off? Did you actually read this review? Anyway, if you watch some more reviews/videos you will find the 12900K produces higher fps with way less power consumption (like 80W vs 120W or 100W vs 140W). Now the fact is the 5950X is $739 from Newegg while the 12900KF is only $589. Don't tell me Intel motherboards are expensive; do you really pair a 5950X with a cheap motherboard?
Bik - Friday, November 5, 2021 - link
Quick question: what if Intel were to replace the E-cores with all P-cores? The math:
P-cores would have around a 60% IPC increase (15% on 14++++, 19% Ice Lake, 19% Alder Lake), plus a 1GHz frequency increase, which would make 2 P-cores equal to 4.03 Skylake cores. And as we see in benchmarks, 8 E-cores have slightly more performance than 4 Skylake cores. That poses a big question: why not just do 10 P-CORES on DESKTOP and remove all the complications induced?
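The math above can be reproduced with a short sketch. The three IPC gains are the ones listed in the comment; the 4.2GHz and 5.2GHz clocks are an assumption (the comment only says "1GHz increase"), picked because they reproduce the 4.03 figure. The function name is hypothetical.

```python
def relative_core_throughput(ipc_gains, old_ghz, new_ghz):
    """Compound per-generation IPC gains, then scale by the clock ratio."""
    ipc = 1.0
    for gain in ipc_gains:
        ipc *= 1.0 + gain
    return ipc * (new_ghz / old_ghz)


# IPC gains quoted in the comment: 14++++, Ice Lake, Alder Lake.
ipc_gains = [0.15, 0.19, 0.19]

# ASSUMPTION: Skylake at 4.2 GHz vs. a P-core at 5.2 GHz (a 1 GHz increase).
per_core = relative_core_throughput(ipc_gains, old_ghz=4.2, new_ghz=5.2)

print(f"1 P-core  ~ {per_core:.2f} Skylake cores")
print(f"2 P-cores ~ {2 * per_core:.2f} Skylake cores")
```

Compounded, the three gains come to about 63% rather than 60%, and with the assumed clocks two P-cores land at roughly 4.03 Skylake cores, matching the comment.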
mode_13h - Friday, November 5, 2021 - link
The reason to keep P-cores is simple. Single-thread performance still matters.
In the server market is where things could get interesting. Intel already has Atom CPUs with >= 16 cores, aimed at mostly embedded server applications that prioritize power efficiency (e.g. 5G basestations). I wonder if we'll see them expand their E-core server CPUs to encompass more traditional server markets. Future E-cores could offer their best chance at countering the perf/W advantage of ARM.
Gothmoth - Friday, November 5, 2021 - link
the power draw is a joke. yeah more performance but at what cost.
it seems reviewers tend to ignore that
EnglishMike - Friday, November 5, 2021 - link
The reviewers didn't ignore anything, including the fact that when you're not pushing the CPU to 100% on all cores, the power draw is much more reasonable. For gaming, it's very similar to the equivalent Ryzen CPUs.
All the reviews I've seen have mentioned the insane power draw for the rendering benchmarks, but they would have trashed the CPUs if it had been the same for all workloads.
eva02langley - Friday, November 5, 2021 - link
Defending 400W of power consumption is a joke.
eva02langley - Friday, November 5, 2021 - link
Also, the Total Cost of Ownership is about $1800 without the GPU, which is NOT cheaper than going AM4 with a 5950X.
Don't kid yourself, you need a great AIO liquid cooler, a great power supply, DDR5 and really good VRMs on your MB to achieve this... otherwise, the 12900K will throttle.
eva02langley - Friday, November 5, 2021 - link
https://youtu.be/SQAfcJ_BHyE
Wrs - Friday, November 5, 2021 - link
Not sure if you know what TCO is. It includes electricity and some $ allotment for software support of the hardware. Most end users don't directly deal with the latter (how many $ is a compatibility headache?) and don't own a watt-meter to calculate the former.
That said, what's wrong with reusing DDR4, adapting an AM4 cooler with an LGA 1700 bracket, and reusing the typically oversized PSU? AT shows DDR4 lagging, but most DDR4 out there is way faster than 3200 CL20. That's why other review sites say DDR4 wins most benches. No reasonable user buys a 12900k or 12600k to pair with JEDEC RAM timings!
Really the only cost differentiator is the CPU + mobo. ADL and Zen3 are on very similar process nodes. One is not markedly more efficient than the other unless the throttle is pushed very differently, or in edge cases where their architectural differences matter.
isthisavailable - Friday, November 5, 2021 - link
Wake me up when Intel stops the K-series nonsense and their motherboards stop costing twice as much as AMD's, only to be useless when next-gen chips arrive.
mode_13h - Friday, November 5, 2021 - link
Intel always gives you 2 CPU generations on any new socket. The only exception to that we've seen was Haswell, due to their cancellation of desktop Broadwell.
mode_13h - Friday, November 5, 2021 - link
And besides, they had to change the socket for PCIe 5.0 and DDR5, if not also other reasons.
This wasn't like the "fake" change between Kaby Lake and Coffee Lake (IIRC, some industrial mobo maker actually produced a board that could support CPUs all the way from Skylake to Coffee Lake R).
Oxford Guy - Friday, November 5, 2021 - link
So...Are we no longer going to see all the benchmark hype over AVX-512 in consumer CPU reviews?
mode_13h - Friday, November 5, 2021 - link
I think we will. They devoted a whole page to it, and benchmarked it anyway.
And Ian's 3DPM benchmark is still a black box. No one knows precisely what it measures, or that it casts AVX2 performance in a fair light. I will call for him to opensource it, for as long as he continues using it.
alpha754293 - Friday, November 5, 2021 - link
Note for the editorial team:
On the page titled: "Power: P-Core vs E-Core, Win10 vs Win11" (4th page), the last graph on the page has AMD Ryzen 9 5900X twice.
The second 5900X I think is supposed to be the 5950X.
Just letting you know.
Thanks.
GeoffreyA - Friday, November 5, 2021 - link
For those proclaiming the funeral of AMD, remember, this is like a board game. Intel is now ahead. When AMD moves, they'll be ahead. Ad infinitum.
As for Intel, well done. Golden Cove is solid, and I'm glad to see a return to form. I expected Alder Lake to be a disaster, but this was well executed. Glad, too, it's beating AMD. For the sake of pricing and IPC, we need the two to give each other a good hiding, alternately. Next, may Zen 3+ and 4 send Alder and Raptor back to the dinosaur age! And may Intel then thrash Zen 4 with whatever they're baking in their labs! Perhaps this is the sort of tick-tock we need.
mode_13h - Friday, November 5, 2021 - link
> Intel is now ahead.
Let's see how it performs within the same power envelope as AMD. That'll tell us if they're truly ahead, or if they're still dependent on burning more Watts for their performance lead.
GeoffreyA - Saturday, November 6, 2021 - link
Oh yes. Under the true metric of performance per watt, AMD is well ahead, Alder Lake taking something to the effect of 60-95% more power to achieve 10-20% more performance than Ryzen. And under that light, one would argue it's not a success. Still, seeing all those blue bars, I give it the credit and feel it won the day.
Unfortunately for Intel and fans, this is not the end of AMD. Indeed, Lisa Su and her team should celebrate: Golden Cove shows just how efficient Zen is. A CPU with 4-wide decode and 256-entry ROB, among other things, is on the heels of one with 6-wide decode and a 512-entry ROB. That ought to be troubling to Pat and the team. Really, the only way I think they can remedy this is by designing a new core from scratch or scaling Gracemont to target Zen 4 or 5.
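To put a number on the "60-95% more power for 10-20% more performance" estimate, here is a rough sketch of the implied perf-per-watt ratio. The ranges are the commenter's ballpark, not measured data, and the helper name is illustrative.

```python
def relative_perf_per_watt(perf_gain, power_gain):
    """Perf/W of the faster chip relative to the baseline (baseline = 1.0)."""
    return (1.0 + perf_gain) / (1.0 + power_gain)


# Ballpark ranges from the comment: +10-20% performance for +60-95% power.
best_case = relative_perf_per_watt(perf_gain=0.20, power_gain=0.60)
worst_case = relative_perf_per_watt(perf_gain=0.10, power_gain=0.95)

print(f"best case:  {best_case:.2f}x the baseline perf/W")
print(f"worst case: {worst_case:.2f}x the baseline perf/W")
```

Taken at face value, those ranges put Alder Lake at somewhere between roughly 0.56x and 0.75x of Ryzen's performance per watt when pushed to its limit.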
GeoffreyA - Saturday, November 6, 2021 - link
Looking at PC World's review just now, I saw that the 12900K uses less power than the 5950X when the load is lighter but more as it scales to maximum.
Zoolook - Saturday, November 6, 2021 - link
To be expected, the fabric on the 5950X is a big consumer and it's lit up all the time, even when only a few cores have work, which makes its performance/power ratio worse under low load.
Oxford Guy - Saturday, November 6, 2021 - link
How exciting that one can pay a lot for a powerful CPU in order to celebrate how it performs doing the tasks a far cheaper CPU would be more suitable for.
This is really droll, this new marketing angle for Alder Lake.
GeoffreyA - Sunday, November 7, 2021 - link
Conspicuous consumption?
Wrs - Saturday, November 6, 2021 - link
They don't have to redesign Golden Cove. On lightly threaded stuff the 6-wide core is clearly ahead. That's a big plus for many consumers over Zen 3. The smaller competing core is expectedly more efficient and easier to pack for multicore but doesn't have the oomph. That Intel can pack both bigger snappy cores and smaller efficient cores is what should keep Su wide awake.
Notice the ease in manufacturing, too. ADL is a simple monolithic slab. Ryzen is using two CCDs and one IOD on interposer. That's one reason Zen3 was in short supply a good 6-8 months after release. It wasn't because TSMC had limited capacity for 88mm2 chips on N7. Intel can spam the market with ADL, the main limit being factory yields of the 208 mm2 chip on Intel 7.
mode_13h - Saturday, November 6, 2021 - link
> On lightly threaded stuff the 6-wide core is clearly ahead.
Why do people keep calling it 6-wide? It's not. The decoder is 3 + 3. It can't decode 6 instructions per cycle from the same branch target.
From the article covering the Architecture Day presentation:
"the allocation stage feeding into the reservation stations can only process five instructions per cycle. On the return path, each core can retire eight instructions per cycle."
> That's one reason Zen3 was in short supply a good 6-8 months after release.
> It wasn't because TSMC had limited capacity for 88mm2 chips on N7.
Source?
Shortage of Ryzens was due *in part* to the fact that Epyc and Threadrippers draw from the same chiplet supply as the non-APU desktop chips. And if you tried to buy a Milan Epyc, you'd know those were even harder to find than desktop Ryzen 5000's.
AMD seems to be moving towards a monolithic approach, in Zen 4. Reportedly, all of their desktop CPUs will then be APUs.
mode_13h - Saturday, November 6, 2021 - link
BTW, the arch day quote was meant to show that it's not 6-wide anywhere else, either.
GeoffreyA - Sunday, November 7, 2021 - link
6-wide might not be the idiomatic term, but Golden Cove supposedly has 6 decoders, up from 5 on Sunny Cove.
Wrs - Sunday, November 7, 2021 - link
Oh we can definitely agree to disagree on how wide to call Golden Cove, but it's objectively bigger than Zen 3 and performs like a bigger core on just about every lightly threaded benchmark.
One of many sources suggesting the cause of Ryzen shortage: https://www.tomshardware.com/news/amd-chip-shortag...
The theory that TSMC was simply running that short on Zen3 CCDs never made much sense to me. Covid didn't stop any of TSMC's fabs, almost all of which run fully automated. For over a year they'd been churning out Zen2's on N7 for desktop/laptop and then server, so yields on 85 mm2 of the newer Zen3 on the same N7 should have been fantastic, and they weren't going to server, not till much more recently.
But Covid impacts on the other fabs that make IOD/interposer, and the technical packaging steps, and transporting the various parts in time? Far, far more likely.
mode_13h - Monday, November 8, 2021 - link
> Oh we can definitely agree to disagree on how wide to call Golden Cove
Sorry, I thought you were talking about Gracemont. The arch day article indeed says it's 6-wide and not much else about the decode stage.
> Covid didn't stop any of TSMC's fabs, almost all of which run fully automated.
It triggered a demand spike, as all the kids and many office workers needed computers for school/work from home. Plus, people needing to do more recreation at home seems to have triggered an increased demand for gaming PCs.
It's well known that TSMC is way backlogged. So, it's not as if AMD could simply order up more wafers to address the demand spikes.
> they weren't going to server, not till much more recently.
Not true. We know Intel and AMD ship CPUs to special customers, even before their public release. By the time Ice Lake SP launched, Intel reported having already shipped a couple hundred thousand of them. Also, AMD needs to build up inventory before they can do a public release. So, the chiplet supply will be getting tapped for server CPUs long before the public launch date.
mode_13h - Saturday, November 6, 2021 - link
> the only way I think they can remedy this is by designing a new core from scratch
I'm not sure I buy this narrative. In the interview with AMD's Mike Clark, he said AMD takes a fresh view of each new generation of Zen and then only reuses what old parts still fit. As Intel is much bigger and better-resourced, I don't see why their approach would fundamentally differ.
> or scaling Gracemont to target Zen 4 or 5.
I don't understand this. The E-cores are efficiency-oriented (and also minimize area, I'd expect). If you tried to optimize them for performance, they'd just end up looking & behaving like the P-cores.
GeoffreyA - Sunday, November 7, 2021 - link
I stand by my view that designing a CPU from scratch will bring benefit, while setting them back temporarily. Of course, I'm no expert, but it's reasonable to guess that, no matter how much they change things, they're still being restricted by choices made in the Pentium Pro era. In the large, sweeping points of the design, it's similar, and that is exerting an effect. Start from scratch, and when you reach Golden Cove IPC, it'll be at lower power I think. Had AMD gone on with K10, I do not doubt it would never have achieved Zen's perf/watt. Sometimes it's best to demolish the edifice and raise it again, not going to the opposite extreme of a radical departure.
As for the E-cores, if I'm not mistaken, they're at greater perf/watt than Skylake, reaching the same IPC more frugally. If that's the case, why not scale it up a bit more, and by the time it reaches GC/Zen 3 IPC, it may well end up doing so with less power. Remember the Pentium M.
What I'm trying to say is, you've got a destination: IPC. These three architectures are taking different routes of power and area to get there. GC has taken a road with heavy toll fees. Zen 3, much cheaper. Gracemont appears to be on an even more economical road. The toll, even on this path, will go up but it'll still be lower than GC's. Zen, in general, is proof of that, surpassing Intel's IPC at a lower point of power.
GeoffreyA - Sunday, November 7, 2021 - link
Anyhow, this is just a generic comment by a layman who's got a passion for these things, and doesn't mean to talk as if he knows better than the engineers who built it.
Wrs - Sunday, November 7, 2021 - link
It's not trivial to design a core from scratch without defining an instruction set from scratch, i.e., breaking all backward compatibility. x86 has a tremendous amount of legacy. ARM has quite a bit as well, and growing each year.
Can they redo Golden Cove or Gracemont for more efficiency at same perf/more perf at same efficiency? Absolutely, nothing is perfect and there's no defined tradeoff between performance and efficiency that constitutes perfect. But simply enlarging Gracemont to near Golden Cove IPC (a la Pentium M to Conroe) is not it. By doing so you gradually sacrifice the efficiency advantage in Gracemont, and might get something worse than Golden Cove if not optimized well.
The big.LITTLE concept has proven advantages in mobile and definitely has merit with tweaks/support on desktop/server. What you may be missing is that Golden Cove isn't an inherently inefficient core like Prescott (P4) or Bulldozer. It's just sometimes driven at high turbo/high power, making it look inefficient when that's really more a process capability than a liability.
GeoffreyA - Monday, November 8, 2021 - link
Putting together a new core doesn't necessarily mean a new ISA. It could still be x86.
Certainly, Golden Cove isn't of Prescott's or Bulldozer's nature and the deplorable efficiency that results from that; but I think it's pretty clear that it's below Zen 3's perf/watt. Now, Gracemont is seemingly of Zen's calibre but at an earlier point of its history. So, if they were to scale this up slowly, while scrupulously maintaining its Atom philosophy, it would reach Zen 3 at similar or less power. (If that statement seems laughable, remember that Skylake > Zen 1, and Gracemont is roughly equal to Skylake.) Zen 3 is right on Golden Cove's tail. So why couldn't Gracemont's descendant reach this class using less power? Its design is sufficiently different from Core to suggest this isn't entirely fantasy.
And the fashionable big/little does have advantages; but question is, do those outweigh the added complexity? I would venture to say, no.
mode_13h - Monday, November 8, 2021 - link
> they're still being restricted by choices made in the Pentium Pro era.
No way. There's no possible way they're still beholden to any decisions made that far back. For one thing, their toolchain has probably changed at least a couple times, since then. But there's also no way they're going to carry baggage that's either not pulling its weight or is otherwise a bottleneck for *that* long. Anything that's an impediment is going to get dropped, sooner or later.
> As for the E-cores, if I'm not mistaken, they're at greater perf/watt than Skylake
Gracemont is made on a different node than Skylake. If you backported it to the original 14 nm node that was Skylake's design target, they wouldn't be as fast or efficient.
> why not scale it up a bit more, and by the time it reaches GC/Zen 3 IPC,
> it may well end up doing so with less power.
Okay, so even if you make everything bigger and it can even reach Golden Cove's IPC without requiring major parts being redesigned, it's not going to clock as high. Plus, you're going to lose some efficiency, because things like OoO structures scale nonlinearly in perf/W. And once you pipeline it and do the other things needed for it to reach Golden Cove's clock speeds, it's going to lose yet more efficiency, probably converging on Golden Cove's perf/W.
There are ways you design for power-efficiency that are fundamentally different from designing for outright performance. You don't get a high-performance core by just scaling up an efficiency-optimized core.
GeoffreyA - Monday, November 8, 2021 - link
Well, you've stumped me on most points. Nonetheless, old choices can survive pretty long. I've got two examples. Can't find any more at present. The instruction fetch bandwidth of 16 bytes, finally doubled in Golden Cove, goes all the way back to Pentium Pro. That could've been more related to the limitations of x86 decoding, though. Then, register reads were limited to two or three per clock cycle, going back to Pentium Pro, and only fixed in Sandy Bridge. Those are small ones but it goes to show.
I would say, Gracemont is different enough for it to diverge from Golden Cove in terms of perf/watt. One basic difference is that it's using a distributed scheduler design (following in the footsteps of the Athlon, Zen, and I believe the Pentium 4), compared to Pentium Pro-Golden Cove's unified scheduler. Then, it's got 17 execution ports, more than Zen 3's 14 and GC's 12. It's ROB is 256 entries, equal to Zen 3. Instruction boundaries are being marked, etc., etc. It's clock speed is lower? Well, that's all right if its IPC is higher than frequency-obsessed peers. I think descendants of this core could baffle both their elder brothers and the AMD competition.
GeoffreyA - Monday, November 8, 2021 - link
Sorry for all the it's! Curse that SwiftKey!
mode_13h - Tuesday, November 9, 2021 - link
> it's got 17 execution ports
That's for simplicity, not by necessity. Most CPUs map multiple different sorts of operations per port, but Gracemont is probably designed in some way that made it cheaper for them just to have dedicated ports for each. I believe its issue bandwidth is 5 ops/cycle.
> It's clock speed is lower? Well, that's all right if its IPC is higher than frequency-obsessed peers.
It would have to be waaay higher, in order to compensate. It's not clear if that's feasible or the most efficient route to deliver that level of performance.
> I think descendants of this core could baffle both their elder brothers and the AMD competition.
In server CPUs? Quite possibly. Performance per Watt and per mm^2 (which directly correlates with perf/$) could be extremely competitive. Just don't expect it to outperform anyone's P-cores.
GeoffreyA - Wednesday, November 10, 2021 - link
I'm out of answers. I suppose we'll have to wait and see how the battle goes. In any case, what is needed is some new paradigm that changes how CPUs operate. Clearly, they're reaching the end of the road. Perhaps the answer will come from new physics. But I wouldn't be surprised if there's some fundamental limit to computation. That's a thought.
kwohlt - Sunday, November 7, 2021 - link
"Really, the only way I think they can remedy this is by designing a new core from scratch"
Intel is designing an entirely new architecture from scratch, according to Moore's Law is Dead leaks. This new architecture design, which started under Jim Keller, is called the "Royal Core Project" and is aimed at a 2025 release. This year also aligns with Gelsinger's recent claims of "We expect Intel to definitively retake the performance crown across the board in 2025"
Whether that actually happens is to be seen, but Pat's specific year target combined with the Moore's Law is Dead leak seem to suggest a whole new architecture is very likely.
GeoffreyA - Monday, November 8, 2021 - link
Looking forward to seeing this. Could it be ARM? I wonder.
mode_13h - Monday, November 8, 2021 - link
I wondered that, too. It could be Jim's revenge for the K12 getting canceled!
cchi - Friday, November 5, 2021 - link
Jim Keller strikes again?
mode_13h - Friday, November 5, 2021 - link
If he had any hand in Alder Lake, it was probably at the margins. He arrived too late to have much involvement in the CPU and his role at Intel seems to have been more in the area of new technology development.
cchi - Friday, November 5, 2021 - link
https://hardwaresfera.com/en/noticias/hardware/int...
Not sure how true this is...
mode_13h - Friday, November 5, 2021 - link
Looks like a 100% clickbait article that's all weakly-informed speculation and zero substance. Go read the two recent interviews with him on this site. He talks about his position within Intel, how far removed he was from any actual work, and how his focus was more on evangelizing within the company.
1. https://www.anandtech.com/show/16709/an-interview-...
2. https://www.anandtech.com/show/16762/an-anandtech-...
And, ignoring all of that, it takes 4-5 years to design a CPU and get it shipped. The timelines simply don't match up. Lastly, he was there for under 2 years, which isn't enough time to really learn how a company works and build a strong team.
Even at AMD, where he was for 3+ years, he even refused credit as the father of Zen. And it's a smaller company where he had prior history.
dwade123 - Friday, November 5, 2021 - link
The fact that Alder Lake consumes way less power than Zen 3 during gaming is amazing. This is where it matters the most because nobody will be playing Cinebench 12 hours a day. The 5950X wastes a lot of power for no good reason during gaming. I'm surprised Intel didn't advertise this heavily.
mode_13h - Friday, November 5, 2021 - link
> nobody will be playing Cinebench 12 hours a day.
Did you ever ask an animator how long their renders take?
adamxpeter - Friday, November 5, 2021 - link
Scroll Lock and Excel mixes well ....
bananaforscale - Friday, November 5, 2021 - link
I do wonder about the scheduler interactions if we add Process Lasso into the mix.
mode_13h - Friday, November 5, 2021 - link
Ian, please publish the source to your 3D Particle Movement benchmark. Let us see what the benchmark is doing. Also, it's not only AMD that can optimize the AVX2 path. Please let the community have a go at it.
mode_13h - Friday, November 5, 2021 - link
> The core also supports dual AVX-512 ports, as we’re detecting
> a throughput of 2 per cycle on 512-bit add/subtracts.
I thought that was true of all Intel's AVX-512 capable CPUs? What Intel has traditionally restricted is the number of FMAs. And if you look at the AVX-512 performance of 3DPM on Rocket Lake and Alder Lake, the relative improvement is only 6%. That doesn't support the idea that Golden Cove's AVX-512 is any wider than that of Cypress Cove, which I thought was established to be single-FMA.
SystemsBuilder - Saturday, November 6, 2021 - link
Cascade Lake-X and Skylake-X/XE Core i9s and Xeons with more than 12 cores (I think) have two AVX-512 capable FMA ports (port 0 and port 5), while all other AVX-512 capable CPUs have 1 (port 0 fused).
The performance gap could be down to coding. You need to vectorize your code in such a way that you feed both ports at maximum bandwidth.
However, in practice it turns out that the bottleneck is seldom the AVX-512 FMA ports but the memory bandwidth, i.e. it is very hard to keep up with the FMAs, each capable of retiring many of the high-end vector operations in 4 clock cycles, e.g. multiply two vectors of 16 32-bit floats and add to a 3rd vector in 4 clock cycles. Engaging both FMAs => you retire one FMA vector op every 2 cycles. Trying to avoid getting too technical here, but with a bit of math you see that the total bandwidth capability of the FMAs easily outstrips the cache, even if most vectors are kept in the Z registers; the registers can only absorb so much and, at steady state, the cache/memory hierarchy becomes the bottleneck, depending on the problem size.
Some clever coding can work around that and hide some of the memory reads (using prefetching etc) but again there is only so much you can do. In other words two AVX-512 FMAs are beasts!
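The "bit of math" mentioned above can be sketched roughly as follows. It takes the comment's rate of one vector FMA every 2 cycles at face value and asks how many operand bytes per cycle the cache would have to supply. The `cache_bytes_per_cycle` figure is a placeholder assumption for illustration, not a specification of any real CPU, and the function names are hypothetical.

```python
VECTOR_BYTES = 64  # one 512-bit vector = 16 x 32-bit floats


def operand_bytes_per_cycle(fma_ops_per_cycle, vectors_loaded_per_op):
    """Bytes per cycle that must come from cache/memory to feed the FMA ports.

    vectors_loaded_per_op counts how many of the three FMA inputs are NOT
    already sitting in a register (0 to 3).
    """
    return fma_ops_per_cycle * vectors_loaded_per_op * VECTOR_BYTES


def is_bandwidth_bound(demand_bytes_per_cycle, cache_bytes_per_cycle):
    """True when the FMA ports ask for more than the cache can deliver."""
    return demand_bytes_per_cycle > cache_bytes_per_cycle


# Rate from the comment: both ports engaged, one vector FMA every 2 cycles.
fma_rate = 0.5

# ASSUMPTION: placeholder sustained cache bandwidth, for illustration only.
cache_bytes_per_cycle = 64

for loads in (1, 2, 3):
    demand = operand_bytes_per_cycle(fma_rate, loads)
    bound = is_bandwidth_bound(demand, cache_bytes_per_cycle)
    print(f"{loads} operand(s) from cache: {demand:.0f} B/cycle, bound={bound}")
```

The more operands that stay in registers, the lower the demand, which is why register blocking and prefetching help; once most inputs have to stream in from cache or memory, the demand passes whatever the hierarchy can sustain.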
coburn_c - Friday, November 5, 2021 - link
This hybrid design smacks of 5+3 year ago thinking when they wanted to dominate mobile. Maybe that's why it needs 200+ watts to be performant.
mode_13h - Friday, November 5, 2021 - link
This doesn't make sense. Their P-cores were never suitable for phones or tablets. Still aren't.
I think the one thing we can say is *not* behind Alder Lake is the desire to make a phone/tablet chip. It would be way too expensive and the P-core would burn too much power at even the lowest clockspeeds.
tygrus - Saturday, November 6, 2021 - link
It appears the mixing is more trouble than it is worth for pure mid to high range desktop use. Intel should have split the desktop CPUs from the mobile CPUs. Put P-cores in the new mid to high range desktops. Put the E-cores in mobiles or cheap desktops/NUCs.
Wrs - Saturday, November 6, 2021 - link
The mixing helps with a very sought-after trait of high-end desktops. Fast single/lightly threaded performance AND high multithreaded capacity. Meaning very snappy and can handle a lot of multitasking. It is true they can pump out more P cores and get rid of E cores, but that would balloon the die size and cut yields, spiking the cost.
mode_13h - Saturday, November 6, 2021 - link
> AND high multithreaded capacity.
Yes. This is supported with a very simple experiment. Look at the performance delta between 8 P-Cores and the full 8 + 8 configuration, on highly-threaded benchmarks. All the 8 + 8 configuration has to do is beat the P-core -only config by 25%, in order to prove it's a win.
The reason is simple. Area-wise, the 8 E-cores are equivalent to just 2 more P-cores. The way I see it is as an easy/cheap way for Intel to boost their CPU on highly-threaded workloads. That's what sold me on it. Before I saw that, I only thought Big.Little was good for power-savings in mobile.
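The 25% bar follows directly from the area argument, and a small sketch makes the reasoning explicit. It assumes, as the comment does, that 4 E-cores take about the area of 1 P-core, and it generously assumes an all-P design would scale perfectly linearly with core count. The function names are illustrative.

```python
def break_even_gain(p_cores, e_cores, e_cores_per_p_area):
    """Gain over the P-only config that a same-area all-P design could reach.

    ASSUMPTION: perfect linear scaling with core count, which is the best
    case for the all-P alternative.
    """
    extra_p_cores = e_cores / e_cores_per_p_area
    return extra_p_cores / p_cores


def hybrid_wins(measured_gain, p_cores, e_cores, e_cores_per_p_area):
    """True if the measured hybrid gain beats the all-P upper bound."""
    return measured_gain > break_even_gain(p_cores, e_cores, e_cores_per_p_area)


threshold = break_even_gain(p_cores=8, e_cores=8, e_cores_per_p_area=4)
print(f"break-even gain: {threshold:.0%}")

# 25.9% is the gain mode_13h reports from page 9 further down the thread.
print(hybrid_wins(0.259, p_cores=8, e_cores=8, e_cores_per_p_area=4))
```

Because real multi-core scaling is sub-linear, a 10 P-core part would fall short of the full 25%, so the threshold here is an upper bound on what the all-P alternative could deliver.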
mode_13h - Saturday, November 6, 2021 - link
Forgot to add that page 9 shows it meets this bar (I get 25.9%), but the reason it doesn't scale even better is due to the usual reasons for sub-linear scaling. Suffice it to say that a 10 P-core wouldn't scale linearly either, meaning the net effect is almost certainly better performance in the 8+8 config (for integer, at least).
JayNor - Saturday, November 6, 2021 - link
"In the aggregate scores, an E-core is roughly 54-64% of a P-core, however this percentage can go as high as 65-73%."
It isn't clear what you mean here. A P-core second thread on the same core would be expected to add around 30%.
A more understandable test would be something like what Intel presented for Gracemont 4C4T vs Skylake 2C4T, although it would also be interesting to see performance and power of 8C8T Gracemont vs 2C4T of Golden Cove, since they reportedly occupy a similar layout space.
SystemsBuilder - Saturday, November 6, 2021 - link
Really happy to see AVX-512 is available with a simple BIOS switch!
This looks to me like how AVX-512 should have been implemented in Skylake, Cascade Lake and Rocket Lake, and now they are finally getting it right:
Alder Lake seems to have:
- both AVX-512 ports enabled (port 0 and 5) !
- able to run at negative offset = 0 for both AVX2 and AVX-512!
- AVX-512 power consumption seems to be in line with AVX2!
Excellent in other words! Since the silicon is there, if they can get the scheduler to manage heterogeneous (P/E) cores there is now no down side with enabling AVX-512.
-
Oxford Guy - Saturday, November 6, 2021 - link
I guess you missed the sentence about how the MSI boards don’t have the switch, the sentence about how it’s actually not supposed to be there, and the sentence about how it could be eliminated in the future.
Additionally, what high-end motherboards offer in BIOS may be more than what is offered in more affordable models. Vendors might restrict this unofficial ‘support’ to top models.
The entire situation is completely incompetent. It’s patently absurd.
Oxford Guy - Saturday, November 6, 2021 - link
It raises a very serious question about Gelsinger’s leadership.
All the hype about putting an engineer in charge and we have this utter inanity as the result.
mode_13h - Saturday, November 6, 2021 - link
> It raises a very serious question about Gelsinger’s leadership.
I'm sure this decision never crossed his desk. It would be made probably 2+ levels below him, in the management hierarchy.
Moreover, he's been in charge for only about 8 months or so. Do you have any idea how long it takes to steer a big ship like Intel? This decision wasn't made yesterday. It would require OS support, which means they'd have had to get buy-in from Microsoft for it, many months ago.
And that's just if you're talking about the decision not to allow partial AVX-512 enabling. The decision to exclude it from Gracemont was made many years ago. Its exclusion was possibly considered a necessity for Gracemont's success, due to the perf/area -> perf/$ impact.
Oxford Guy - Saturday, November 6, 2021 - link
If Gelsinger wasn’t aware of the lie about fusing off and all of the other critically-important aspects involved he’s either a charlatan or Intel is structurally incompetent.
Wrs - Saturday, November 6, 2021 - link
Why all the fuss about a technically unsupported feature? The only consumer chips officially to have AVX-512 contain Rocket Lake cores. Not Zen 3, or 2, or Comet Lake, or Alder Lake. If you find your Alder Lake has hidden AVX-512 abilities, how's that any different from finding out you can enable 6 cores on your 4-core Celeron?
mode_13h - Saturday, November 6, 2021 - link
> The only consumer chips officially to have AVX-512 contain Rocket Lake cores.
Ice Lake and Tiger Lake do, but they're only in laptops, NUCs, and SFF PCs.
zodiacfml - Sunday, November 7, 2021 - link
that guy is hating on Gelsinger.
Qasar - Sunday, November 7, 2021 - link
that guy hates on everything
mode_13h - Saturday, November 6, 2021 - link
On what basis do you reach that verdict? Based on your posts, I wouldn't trust you to run a corner shop, much less one of the biggest and most advanced tech & manufacturing companies on the planet.
And where did they say it's "fused off"? AFAIK, all they said is that it's not available and this will not change. And we've seen no evidence of that being untrue.
Also, I think you're getting a bit too worked up over the messaging. In the grand scheme, that's not the decision that really matters.
SystemsBuilder - Saturday, November 6, 2021 - link
No, I did not miss that. I'm just happy that ASUS found a way to enable it.
Intel screwed up, of course - a battle between different departments and managers, marketing etc., I'm sure - that's a given, and I did not think it was necessary to repeat that. And yes, it is absurd - even incompetent.
Still, I'm happy ASUS found it and exposed it, because, as I said, they actually seem to have gotten AVX-512 right in Golden Cove.
Intel should of course work with Microsoft to get the scheduler to work with any E/P mix, make the support official, enable it in the base BIOS, have base BIOS sent over to all OEMs and lastly fire/reassign the idiot that took AVX-512 off the POR for Alder lake.
In any case, it gives me something to look forward to with Sapphire Rapids, which should come with more Golden Cove P-cores.
I only by ASUS boards so
SystemsBuilder - Saturday, November 6, 2021 - link
can't edit post so continuing:
I only buy and use ASUS boards, so for me it's fine, but it sucks for others.
Also doubt that Pat was involved. Decisions were likely made before his arrival. I'm thinking about the Microsoft dependency. They would have needed to lock the POR towards Microsoft a while back to give MS enough time to get the scheduler and other stuff right...
Oxford Guy - Saturday, November 6, 2021 - link
The product was released on his watch, on these incompetent terms. Gelsinger is absolutely responsible. He now has a serious black mark on his leadership card. A competent CEO wouldn’t have allowed this situation to occur.
This is an outstanding example of how the claim that only engineers make good CEOs for tech companies is suspect.
‘I only by ASUS boards so’
Lies of omission are lies.
SystemsBuilder - Saturday, November 6, 2021 - link
I totally agree that it goes against putting engineers in charge.
For me, the whole AVX-512 POR decision and the "AVX-512 is fused off" message came out of an incompetent marketing department when they were still in charge.
Oxford Guy - Saturday, November 6, 2021 - link
‘when they were still in charge.’
Gelsinger isn’t the CEO? Gelsinger wasn’t the CEO when Alder Lake was released? Marketeers outrank the CEO?
The buck stops with him.
The implication that having an engineer run a business makes said engineer a skilled businessperson is safely dead.
The success of Steve Jobs also problematized that claim well before this episode, as he was not an engineer.
SystemsBuilder - Saturday, November 6, 2021 - link
I think under the old "regime" marketing did outrank engineering, so that the old CEO listened more to marketing than engineering (of course the CEO makes the decision, but he takes input from various camps, and that is what I mean by marketing "were still in charge"). A non-engineering-educated CEO is particularly influenceable by marketing (especially if he/she has an MBA with a marketing specialization, like the old CEO's). Hence the messaging decision to "fuse it off" was likely heavily influenced by marketing, who I think finally won over Engineering. Pat had to inherit this decision but could not change it for the Windows 11 launch - it was too late.
Oxford Guy - Saturday, November 6, 2021 - link
Of course he could change it. He’s the CEO.
mode_13h - Saturday, November 6, 2021 - link
> so that the old CEO listened more to marketing than engineering
In this case, the issue wouldn't be who the CEO listens to, but who gets to define the products. Again, the issue of AVX-512 in Alder Lake is something that would probably never rise to the attention of the CEO, in a company with $75B annual revenue, tens of thousands of employees at hundreds of sites, and many thousands of products in hundreds of different markets. OG apparently has no concept of what these CEOs deal with, on a day to day basis.
mode_13h - Saturday, November 6, 2021 - link
> Gelsinger wasn’t the CEO when Alder Lake was released?
So what was he supposed to do? Do you think they run all PR material by the CEO? How would he have any time to make the important decisions, like about running the company and stuff?
It seems to me like you're just playing agent provocateur. I haven't even seen you make a good case for why this matters so much.
mode_13h - Saturday, November 6, 2021 - link
> A competent CEO wouldn’t have allowed this situation to occur.
And you'd know this because... ?
> This is an outstanding example of how the claim that only engineers
> make good CEOs for tech companies is suspect.
You're making way too much of this.
I don't know who says "only engineers make good CEOs for tech companies". That's an absolutist statement I doubt anyone reasonable and with suitable expertise to make such proclamations ever would. There are plenty of examples where engineers in the CEO's chair have functioned poorly, but also many where they've done well. The balance of that particular assessment doesn't hang on Gelsinger, and especially not on this one issue.
Also, your liberal arts degree is showing. I'm not casting any aspersions on liberal arts, but when you jump to attack engineers for stepping outside their box, it does look like you've got a big chip on your shoulder.
Oxford Guy - Sunday, November 7, 2021 - link
‘Also, your liberal arts degree is showing. I'm not casting any aspersions on liberal arts’
Your assumption about my degrees is due to the fact that I understand leadership and integrity?
mode_13h - Sunday, November 7, 2021 - link
> Your assumption about my degrees is due to the fact that I understand leadership and integrity?
Yes, exactly. It's exactly your grasp of leadership and integrity that I credit for your attack on engineers stepping outside their box. Such a keen observation. /s
(and that's how you "sarcasm", on the Internet.)
kwohlt - Sunday, November 7, 2021 - link
I'm not sure how familiar you are with CPU design, but Alder Lake was taped in before Gelsinger took over. The design was finalized, and there was no changing it without massive delays. For the minuscule part of the market that insists on AVX-512 for the consumer line, it can be implemented after disabling E-cores. AVX-512 just doesn't work on Gracemont, so you can't have both Gracemont and AVX-512 simultaneously. CPU designs take 4 years. You'll see the true impact of Gelsinger's leadership in a few years.
SystemsBuilder - Saturday, November 6, 2021 - link
MS and Intel tried to sync their plans to launch Windows 11 and Alder Lake at (roughly) the same time. Intel might have been rushed to lock their POR to hit the Windows 11 launch. There may even be a contractual relationship between Intel and Microsoft to make sure Windows 11 runs best on Intel's Alder Lake - Intel pays MS to optimize the scheduler for Alder Lake, and in return Intel has to lock the Alder Lake POR maybe even up to a year ago... because MS was not going to move the Windows 11 launch date.
Speculation from my side of course, but I don't think I am too far off...
Oxford Guy - Saturday, November 6, 2021 - link
Such excuses don’t work.
The current situation is inexcusable.
SystemsBuilder - Saturday, November 6, 2021 - link
Yes, it is inexcusable, BUT Pat might not have had a choice because he does not control Microsoft.
Satya N. would just tell Pat: we have a contract - fulfill it!
We are not going to delay Windows 11. It's shipping October 2021, so we will stick with the POR you gave us in 2020!
Satya is running a $2.52 trillion market cap company, currently #1 in the world
Pat is running a $206.58 billion market cap company
so guess who's calling the shots.
Pat says "ok... but maybe we can enable it for the 22H1 version of win 11, please Satya help me out here..."
in the end I think MS will do the right thing and get it to work but it might get delayed a bit.
Again, my speculation. And again, I don't think I am far off...
Oxford Guy - Saturday, November 6, 2021 - link
The solution was not to create this incompetent partial ‘have faith’ AVX-512 situation. Faith is for religion, not tech products.
The solution was to be clear and consistent. For instance, if Windows is the problem then that should have been made clear. Gelsinger should have said MS doesn’t yet have its software at full compatibility with Alder Lake. He should have said it will be officially supported when Windows is ready for it.
He should have had a software utility for power users to disable the small cores in order to have AVX-512 support, or at least a BIOS option — mandated for all Alder Lake boards — that disables them as an option for those who need AVX-512.
The current situation cannot be blamed on Microsoft. Intel has the ability to be clear, consistent, and competent about its products.
Claiming that Intel isn’t a large enough entity to tell the truth also doesn’t pass muster. Even if it’s inconvenient for Microsoft to be exposed for releasing Windows 11 prematurely and even if it’s inconvenient for Intel to be exposed for releasing Alder Lake prematurely — saving face isn’t an adequate excuse for creating a situation this untenable.
Consumers deserve non-broken products that aren’t sold via smoke and mirrors tactics.
SystemsBuilder - Saturday, November 6, 2021 - link
a couple of points:
- Yes, it would have been better to tell the market that AVX-512 will be enabled with 22H1 (or whatever - I'm speculating) of Windows 11, but what about making it work with Windows 10, and when... I mean, the whole situation is a cluster. I do agree that the marketing decisions under Pat about what and how to communicate to the market regarding Alder Lake, AVX-512 and Windows 10/11 could have been handled much, much better. The way they have done it is a disaster. It's like, is it in or out, I mean wtf. Is it strategic or not? This market communication, the related decisions, and whatever new agreements they need to strike with Microsoft to make the whole thing make sense are on Pat - firmly!
- i am not blaming Microsoft at all. I am mostly blaming the old marketing and the old CEO - pure incompetence for getting Intel into this situation in the first place. I don't have all the insights into Intel's internals but from an outside perspective it looks like that to me.
Oxford Guy - Saturday, November 6, 2021 - link
Gelsinger’s responsibility is to lead, not blame previous leadership.
Alder Lake came out on his watch. The AVX-512 debacle, communications and lack of mandated minimum specs (official partial support for the lifetime of AL in 100% of AL boards via BIOS switch to disable small cores) happened while he was CEO.
The lie about fusing off happened under his leadership.
We have been lied to and spacetime can’t be warped to erase the stain on his tenure.
SystemsBuilder - Saturday, November 6, 2021 - link
I don't think Pat's blaming previous leadership. I am. I also blame Pat too, per above.
He can still fix it by being super clear about what, when, why in his communication. He needs to bring marketing messaging under control.
I can tell you one thing though. I'm not buying Alder Lake CPUs. I'm probably going for Sapphire Rapids next year, when the whole thing has hopefully settled a bit more.
mode_13h - Saturday, November 6, 2021 - link
> We have been lied to and spacetime can’t be warped to erase the stain on his tenure.
I'd have to say that, based on your tenor, I'd be more concerned about the stain in your pants.
Now *that's* an "ad hom". You're welcome.
:D
Oxford Guy - Sunday, November 7, 2021 - link
‘I'd have to say that, based on your tenor, I'd be more concerned about the stain in your pants.
Now *that's* an "ad hom". You're welcome.’
Posting drivel like that exposes your character for all to see.
mode_13h - Sunday, November 7, 2021 - link
> Posting drivel like that exposes your character for all to see.
What I hope they see is that I can keep a measure of proportion about these things and not get overwrought. Not all "righteous indignation" is so righteous.
Qasar - Sunday, November 7, 2021 - link
"Posting drivel like that exposes your character for all to see."
hello pot, meet kettle.
Oxford Guy - Monday, November 8, 2021 - link
Thanks for yet another fallacy, Qasar. mode is doing well enough polluting the forum with them. Help really isn’t required.
Qasar - Monday, November 8, 2021 - link
you do a good job with the pollution as well oxford guy :-)
mode_13h - Saturday, November 6, 2021 - link
> The solution was to be clear and consistent.
So far, they have been. No AVX-512. That ASUS figured out it was still present and could be enabled isn't Intel's fault. It's like back in the days when some motherboards would let you enable dark cores in your CPU.
> Gelsinger should have said
It's not his job. Your issue is with someone at Intel several levels below him.
Oxford Guy - Sunday, November 7, 2021 - link
‘It's not his job. Your issue is with someone at Intel several levels below him.’
Lying to the public (‘fused off’) is a decision that rests on his shoulders.
That isn’t the only one.
mode_13h - Sunday, November 7, 2021 - link
> Lying to the public (‘fused off’) is a decision that rests on his shoulders.
Exactly where did they say it's "fused off", and when has it ever been inexcusable that hardware shipped to customers actually contains features that can secretly be enabled? This sort of thing happens all the time.
mode_13h - Saturday, November 6, 2021 - link
> Consumers deserve non-broken products that aren’t sold via smoke and mirrors tactics.
What's broken, exactly? They said you wouldn't have AVX-512. That someone figured out how to enable it is just a bonus.
mode_13h - Saturday, November 6, 2021 - link
Why are you convinced it's so consequential?
mode_13h - Saturday, November 6, 2021 - link
Oops, that was a response to:
OG> The current situation is inexcusable.
Oxford Guy - Sunday, November 7, 2021 - link
That question is meritless.
mode_13h - Sunday, November 7, 2021 - link
If the issue isn't terribly consequential, then why is it inexcusable? The gravity of alleged misconduct usually derives from its impacts.
Oxford Guy - Monday, November 8, 2021 - link
I have been suspicious that you’re some sort of IBM AI. Posts like that go a long way toward supporting that suspicion.
You were the poster who claimed it’s of little consequence. I was the poster who said it’s inexcusable. Either you’re AI that needs work or your mind is rife with confusion in your quest to impress the community via attempts at domination.
Not a good look, again. Posting your own claims as if they’re mine and using my claims to create a false incompetence situation is a bit better than your pathetic schoolyard taunts. So, perhaps I should praise you for improving the quality of your posts via being merely incompetent — like Intel’s handling of this situation you’re trying to downplay. I shouldn’t make that equivalence, though, as lying to the community in terms of a retail product is worse than any of your parlor tricks.
mode_13h - Tuesday, November 9, 2021 - link
> I have been suspicious that you’re some sort of IBM AI.
No way. Their artificial intelligence is no match for my natural stupidity.
:D
> You were the poster who claimed it’s of little consequence.
No, I asked *you* why it's so consequential.
> I was the poster who said it’s inexcusable.
Which sort of implies that it's very consequential. If it's not, then why would it be inexcusable?
> Either you’re AI that needs work or your mind is rife with confusion in your quest to
> impress the community via attempts at domination.
If you wouldn't waste so much energy posturing and just answer the question, maybe we could actually get somewhere.
I don't honestly care what the community thinks of me. That's the beauty of pseudonymity! I don't even need people to believe I'm somehow affiliated with a prestigious university. Either my points make sense and are well-founded or they aren't. Similarly, I don't care if you're "just" the Oxford garbage collector. If you contribute useful information, then we all win. If you're just trolling, flaming, or pulling the thread into irrelevant tangents, then we all lose.
The main reason I post on here is to share information and to learn. I asked what should be a simple question which you dismissed as meritless, and without explaining why. As usual, only drama ensues, when I try to press the issue. I always want to give people the opportunity to justify their stance, but so often you just look for some way to throw it back in my face.
This kind of crap is extremely low value. I hope you agree.
mode_13h - Saturday, November 6, 2021 - link
> and the sentence about how it could be eliminated in the future.
It's true. Intel can disable instructions in microcode updates and in future steppings of the CPU. So, even having the BIOS option is no guarantee.
mode_13h - Saturday, November 6, 2021 - link
> Since the silicon is there, if they can get the scheduler to manage
> heterogeneous (P/E) cores there is now no down side with enabling AVX-512.
This will not happen. The OS scheduler cannot compensate for lack of app awareness of the heterogeneous support for AVX-512. I'm sure that was fiercely debated, at Intel, but the performance downsides for naive code (i.e. 99%+ of the AVX-512 code in the wild) would generate too many complaints and negative publicity from the apps where enabling it results in performance & power regressions.
Oxford Guy - Saturday, November 6, 2021 - link
So, Alder Lake is a turkey as a high-end CPU, one that should have never been released? This is because each program has to include Alder Lake AVX-512 support and those that don’t will cause performance regressions?
So, Intel designed and released a CPU that it knew wouldn’t be properly supported by Windows 11 — yet the public was sold Windows 11 primarily on the basis of how its nifty new scheduler will support this CPU?
‘The OS scheduler cannot compensate for lack of app awareness of the heterogeneous support for AVX-512’
Is Windows 11 able to support a software utility to disable the low-power cores once booted into Windows or are we restricted to disabling them via BIOS? If the latter is the case then Intel had the responsibility for mandating such a switch for all Alder Lake boards, as part of the basic specification.
mode_13h - Saturday, November 6, 2021 - link
> So, Alder Lake is a turkey as a high-end CPU, one that should have never been released?
How do you reach that conclusion, after it blew away its predecessor and (arguably) its main competitor, even without AVX-512?
> This is because each program has to include Alder Lake AVX-512 support and
> those that don’t will cause performance regressions?
No, my point was that relying on the OS to trap AVX-512 instructions executed on E-cores and then context-switch the thread to a P-core is likely to be problematic, from a power & performance perspective. Another issue is code which autodetects AVX-512 won't see it, while running on an E-core. This can result in more than performance issues - it could result in software malfunctions if some threads are using AVX-512 datastructures while other threads in the same process aren't. Those are only a couple of the issues with enabling heterogeneous support of AVX-512, like what some people seem to be advocating for.
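The autodetection hazard described above can be illustrated with a toy simulation. This is a sketch only: it does not execute CPUID or any vector instructions, and `Core`, `detect_path`, `run_kernel` and `IllegalInstruction` are hypothetical names standing in for the real mechanisms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    name: str
    has_avx512: bool


class IllegalInstruction(Exception):
    """Stands in for the fault raised by an unsupported instruction."""


def detect_path(core):
    """What a CPUID-style check would report on the core it runs on."""
    return "avx512" if core.has_avx512 else "avx2"


def run_kernel(path, core):
    """Execute the chosen code path on whichever core the thread is on now."""
    if path == "avx512" and not core.has_avx512:
        raise IllegalInstruction(f"AVX-512 path executed on {core.name}")
    return f"{path} on {core.name}"


def run_with_cached_detection(startup_core, later_core):
    """Detect once at startup, then keep using that answer after migrating."""
    path = detect_path(startup_core)
    return run_kernel(path, later_core)


p_core = Core("P0", has_avx512=True)
e_core = Core("E0", has_avx512=False)

# Detected on an E-core, later moved to a P-core: works, but AVX-512 goes unused.
print(run_with_cached_detection(e_core, p_core))

# Detected on a P-core, later moved to an E-core: the cached answer is wrong.
try:
    run_with_cached_detection(p_core, e_core)
except IllegalInstruction as exc:
    print("fault:", exc)
```

The second case is the one an OS would have to trap and fix up by moving the thread back to a P-core, which is where the power and performance concerns come in.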
> Is Windows 11 able to support a software utility to disable the low-power cores
> once booted into Windows or are we restricted to disabling them via BIOS?
That's not the proposal to which I was responding, which you can see by the quote at the top of my post.
Oxford Guy - Sunday, November 7, 2021 - link
So, you’ve stated the same thing again — that Intel knew Alder Lake couldn’t be fully supported by Windows 11 even before it (AL) was designed?
The question about the software utility is one you’re unable to answer, it seems.
mode_13h - Sunday, November 7, 2021 - link
> The question about the software utility is one you’re unable to answer, it seems.
That's not something I was trying to address. I was only responding to @SystemsBuilder's idea that Windows should be able to manage having some cores with AVX-512 and some cores without.
If you'd like to know what I think about "the software utility", that's a fair thing to ask, but it's outside the scope of what I was discussing and therefore not a relevant counterpoint.
Oxford Guy - Monday, November 8, 2021 - link
More hilarious evasion.
mode_13h - Tuesday, November 9, 2021 - link
> More hilarious evasion.
Yes, evasion of your whataboutism. Glad you enjoyed it.
GeoffreyA - Sunday, November 7, 2021 - link
"So, Intel designed and released a CPU that it knew wouldn’t be properly supported by Windows 11"
Oxford Guy, there's a difference between the concerns of the scheduler and that of AVX512. Alder Lake runs even on Windows 10. Only, there's a bit of suboptimal scheduling there, where the P and E cores are concerned.
If AVX512 weren't disabled, it would've been something of a nightmare keeping track of which cores support it and which don't. Usually, code checks at runtime whether a certain set of instructions---SSE3, AVX, etc---are available, using the CPUID instruction or intrinsic. Stir this complex yeast into the soup of performance and efficiency cores, and there will be trouble in the kitchen.
Under this new, messy state of affairs, the only feasible option mum had, or should I say Intel, was bringing the cores onto an equal footing by locking AVX512 in the attic, and saying, no, that fellow doesn't live here.
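The runtime check mentioned above usually boils down to asking once what the CPU supports and then picking a code path. A minimal sketch, assuming Linux-style feature flag names (`avx2`, `avx512f`) as they appear in /proc/cpuinfo; the sample text and both function names are hypothetical, and real code would query CPUID directly or use a compiler intrinsic.

```python
def parse_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line, as a set."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def select_path(flags):
    """Pick the widest vector code path the reported flags allow."""
    if "avx512f" in flags:
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    return "scalar"


# Hypothetical, heavily shortened sample in the style of Linux /proc/cpuinfo.
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse2 avx avx2\n"

print(select_path(parse_flags(SAMPLE)))
```

Note that the answer is treated as a property of the whole machine. That assumption is exactly what stops holding once some cores report a flag and others don't.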
GeoffreyA - Sunday, November 7, 2021 - link
Also, Intel seems pretty clear that it's disabled and so forth. Doesn't seem shady or controversial to me:
https://www.intel.com/content/www/us/en/developer/...
SystemsBuilder - Saturday, November 6, 2021 - link
Thinking a bit about what you wrote: "This will not happen". It is not easy, but it is possible… it’s a bit technical but here we go… sorry for the wall of text.
When you optimize code today (for pre Alder lake CPUs) to take advantage of AVX-512 you need to write two paths (at least). The application program (custom code) would first check if the CPU is capable of AVX-512 and at what level. There are many levels of AVX-512 support and effectively you need to write customized code for each specific CPUID (class of CPUs, e.g. Ice Lake, Skylake-X etc.), since for whatever CPU you end up running this particular program on, you would want to utilize the most favorable/relevant AVX-512 instructions. So with the custom code today (pre Alder Lake), the scheduler would just assign a thread to an underutilized core (loosely speaking), and the custom code would check what the core is capable of and then choose the best path in real time (AVX2 and various levels of AVX-512).
The problem is that with Alder Lake not all cores are equal! BUT the custom code should have various paths already, so it is capable!… The issue that I see is that the custom code's CPU check needs to be adjusted to check core-specific capability, not CPUID-specific (one more level of granularity), AND the scheduler should schedule code with AVX-512 paths on AVX-512-capable cores by preference... What’s needed is a code change in the AVX-512 path selection logic (on the application developer - not a big deal) and compiler support that embeds scheduler-specific information about whether the specific piece of code prefers AVX-512 or not. The scheduler would then use this information to schedule in real time, and the custom code would be able to choose the right path at execution time.
It is absolutely possible and it will come with time.
I think this is not just applicable to AVX-512. I think in the future P and E cores might have more than just AVX-512 that is different (they might diverge much more than that), so the scheduler needs to be made aware of what a thread prefers and what each core is capable of before it schedules each thread. It is the responsibility of the custom code to have multiple paths (whether or not they want to utilize AVX-512).
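The proposal above, a per-thread preference hint plus a per-core capability check, could look roughly like this. It is a sketch of the idea only, not how any real OS scheduler works; `pick_core`, `choose_path` and the tuple layout are hypothetical.

```python
def pick_core(idle_cores, prefers_avx512):
    """Choose an idle core for a thread.

    idle_cores: list of (name, has_avx512) tuples.
    prefers_avx512: hint that the thread's code contains an AVX-512 path.
    """
    if not idle_cores:
        return None
    if prefers_avx512:
        for core in idle_cores:
            if core[1]:
                return core
    else:
        # Keep AVX-512 cores free for threads that asked for them.
        for core in idle_cores:
            if not core[1]:
                return core
    # No core matches the preference: take whatever is idle.
    return idle_cores[0]


def choose_path(core, prefers_avx512):
    """The thread still checks the core it landed on before picking a path."""
    return "avx512" if prefers_avx512 and core[1] else "avx2"


cores = [("E0", False), ("P0", True)]
for hint in (True, False):
    core = pick_core(cores, hint)
    print(hint, core[0], choose_path(core, hint))
```

This only covers the initial placement. It says nothing about what happens if the thread is migrated to a different core after it has already chosen its path.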
SystemsBuilder - Saturday, November 6, 2021 - link
old .exe which are not adjusted and are not recompiled for Alder Lake (code does not recognize Alder Lake) would simply automatically regress to AVX2 and the scheduler would not care which CPU to schedule it on. Basically that is what's happening today if you do not enable AVX-512 in the ASUS BIOS.
Net net: you could make it work.
mode_13h - Saturday, November 6, 2021 - link
> old .exe which are not adjusted and are not recompiled for Alder Lake (code does
> not recognize Alder Lake) would simply automatically regress to AVX2
So, like 98% of shipping AVX-512 code, by the time Raptor Lake is introduced?
What you're proposing is a lot of work for Microsoft, only to benefit a very small number of applications. I think Intel would rather that people who need those apps simply buy a CPU which officially supports AVX-512 (or maybe switch off their E-cores and enable AVX-512 in BIOS).
Oxford Guy - Sunday, November 7, 2021 - link
‘or maybe switch off their E-cores and enable AVX-512 in BIOS’
This from exactly the same person who posted, just a few hours ago, that it’s correct to note that that option can disappear and/or be rendered non-functional.
I am reminded of your contradictory posts about ECC where you mocked advocacy for it (‘advocacy’ being merely its mention) and proceeded to claim you ‘wish’ for more ECC support.
Once again, it’s helpful to have a grasp of what one actually believes prior to posting. Allocating less effort to posting puerile insults and more toward substance is advised.
mode_13h - Sunday, November 7, 2021 - link
> This from exactly the same person who posted, just a few hours ago, that it’s
> correct to note that that option can disappear and/or be rendered non-functional.
You need to learn to distinguish between what Intel has actually stated vs. the facts as we wish them to be. In the previous post you reference, I affirmed your acknowledgement that the capability disappearing would be consistent with what Intel has actually said, to date.
In the post above, I was leaving open the possibility that *maybe* Intel is actually "cool" with there being a BIOS option to trade AVX-512 for E-cores. We simply don't know how Intel feels about that, because (to my knowledge) they haven't said.
When I clarify the facts as they stand, don't confuse that with my position on the facts as I wish them to be. I can simultaneously acknowledge one reality, while maintaining my own personal preference for a different reality.
This is exactly what happened with the ECC situation: I was clarifying Intel's practice, because your post indicated uncertainty about that fact. It was not meant to convey my personal preference, which I later added with a follow-on post.
Having to clarify this to an "Oxford Guy" seems a bit surprising, unless you meant like Oxford Mississippi.
> you mocked advocacy
It wasn't mocking. It was clarification. And your post seemed more to express befuddlement than expressive of advocacy. It's now clear that your post was a poorly-executed attempt at sarcasm.
Once again, it's helpful not to have your ego so wrapped up in your posts that you overreact when someone tries to offer a factual clarification.
Oxford Guy - Monday, November 8, 2021 - link
I now skip to the bottom of your posts. If I see more of the same preening and posing, I spare myself the rest of the nonsense.
mode_13h - Tuesday, November 9, 2021 - link
> If I see more of the same preening and posing, I spare myself the rest of the nonsense.
Then I suggest you don't read your own posts.
I can see that you're highly resistant to reason and logic. Whenever I make a reasoned reply, you always hit back with some kind of vague meta-critique. If that's all you've got, it can be seen as nothing less than a concession.
O-o-o-O - Saturday, November 6, 2021 - link
Anyone talking about dumping x64 ISA?
I don't see AVX-512 as a good solution. Current x64 chips put so much complexity into the CPU, at irrational clock speeds, that migrating the process node further to Intel 4 and beyond would be a nightmare once again.
I believe most companies with in-house developers expect the end of the Xeon era to be quite near, as most heavy computational tasks are fully optimized for GPUs and you don't want coal-burning CPUs.
Even if it doesn't come within a 5-year time frame, it's a real threat and you have to be ahead of time. After all, x86 has already extended its life 10+ years past when it could have been discontinued. Now it's really a dinosaur. If so, non-server applications would follow that route as well.
We want a simpler / solid / robust base with scalability. Not an unreliable boost button that sometimes does the trick.
SystemsBuilder - Saturday, November 6, 2021 - link
I don't see AVX-512 that negatively. It is just the same as AVX2 but with double the vector size and a richer instruction set. I find it pretty cool to work with, especially when you've written some libraries that can take advantage of it. As I wrote before, it looks like Golden Cove got AVX-512 right, based on what Ian and Andrei uncovered. 0 negative offset (i.e. running at full speed), power consumption not much more than AVX2, and it supports both FP16 and BF16 vectors! I think that's pretty darn good! I can work with that! Now I want my Sapphire Rapids with 32 or 48 Golden Cove P-cores! No, not fall 2022, I want it now! lol
mode_13h - Saturday, November 6, 2021 - link
> When you optimize code today (for pre Alder lake CPUs) to take advantage
> of AVX-512 you need to write two paths (at least).
Ah, so your solution depends on application software changes, specifically requiring them to do more work. That's not viable for the timeframe of concern. And especially not if its successor is just going to add AVX-512 to the E-cores, within a year or so.
> There are many levels of AVX-512 support and effectively you need write customized
> code for each specific CPUID
But you don't expect the capabilities to change as a function of which thread is running, or within a program's lifetime! What you're proposing is very different. You're proposing to change the ABI. That's a big deal!
> It is absolutely possible and it will come with time.
Or not. ARM's SVE is a much better solution.
> I think in the future P and E cores might have more than just AVX-512 that is different
On Linux, using AMX will require a thread to "enable" it. This is a little like what you're talking about. AMX is a big feature, though, and unlike anything else. I don't expect to start having to enable every new ISA extension I want to use, or query how many hyperthreads actually support it - this becomes a mess when you start dealing with different libraries that have these requirements and limitations.
Intel's solution isn't great, but it's understandable and it works. And, in spite of it, they still delivered a really nice-performing CPU. I think it's great if technically astute users have/retain the option to trade E-cores for AVX-512 (via BIOS), but I think it's kicking a hornet's nest to go down the path of having a CPU with asymmetrical capabilities among its cores.
Hopefully, Raptor Lake just adds AVX-512 to the E-cores and we can just let this issue fade into the mists of time, like other missteps Intel & others have made.
SystemsBuilder - Saturday, November 6, 2021 - link
I too believe the AVX-512 exclusion in the E cores is transitory. Next-gen E cores may include it and the issue goes away, for AVX-512 at least (Raptor Lake?). Still, there will be other features that P cores have but E cores won't, so the scheduler needs to be adjusted for that. This will continue to evolve with every generation of E and P cores - because they are here to stay.
I read somewhere a few months ago, but right now I do not remember where (maybe on Anandtech, not sure), that the AVX-512 transistor budget is quite small (someone measured it on the die), so it's not really a big issue in terms of area.
AMX is interesting because where AVX-512 has 512-bit vectors, AMX is making that 512x512 bit matrices, or tiles as Intel calls them. Reading the spec on AMX, you have BF16 tiles, which is awesome if you're into neural nets. Of course GPUs will still perform better with matrix calculations (multiplications), but the benefit with AMX is that you can keep both the general CPU code and the matrix-specific code inside the CPU and can mix the code seamlessly, and that's gonna be very cool - you cut out the latency between GPU and CPU (and no special GPU APIs are needed). But of course you can still use the GPU when needed (sometimes it may be faster to just do a matrix-matrix add, for instance, inside the CPU with the AMX tiles) - more flexibility.
Anyway, I do think we will run into a similar issue with AMX as we have with AVX-512 on Alder Lake, and therefore again the scheduler needs to become aware of each core's capabilities, and each piece of code needs to state what type of core it prefers to run on: AVX2, AVX-512, AMX capable core, etc. (the compiler's job). This way the scheduler can do the best job possible with every thread.
There will be some teething for a while, but I think this is the direction it is going.
mode_13h - Sunday, November 7, 2021 - link
The difference is that AMX is new. It's also much more specialized, as you point out. But that means that they can place new hoops for code to jump through, in order to use it.
It's very hard to put a cat like AVX-512 back in the bag.
SystemsBuilder - Saturday, November 6, 2021 - link
To be clear, I also want to add how code is written today (in my organization), in our pre Alder Lake code base. Every time we write a code path for AVX-512 we need to write a fallback code path in case the CPU is not AVX-512 capable. This is standard (unless you can control the execution H/W 100% - i.e. the servers).
That does not mean all code has to be duplicated, but the inner loops, where the 80%/20% rule comes into play (i.e. 20% of the code consumes 80% of the time, which in my experience often becomes more like a 99%/1% rule), are where you write two code paths:
1 for AVX-512 in case the CPU is capable and
2 with just AVX2 in case the CPU is not capable
Mostly this ends up being, just as I said, the innermost loops, and there are excellent, broadly available templates to use for this.
Just from a pure comp sci perspective it is quite interesting to vectorize code and see the benefits - pretty cool actually.
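As a rough illustration of the two-code-path pattern described above, here is a minimal Python sketch. It is only a stand-in: real code would do this in C/C++ with CPUID checks or intrinsics, and every function name below (`read_cpu_flags`, `pick_kernel`, the `sum_*` kernels) is hypothetical. It assumes a Linux-style `/proc/cpuinfo` and falls back to a scalar path elsewhere. Note that the choice is made once per process, which is exactly the assumption that cores with differing capabilities would break.

```python
# Sketch only: pick an "AVX-512" or "AVX2" inner loop once at startup.
# The kernels are placeholders; in real code they would be hand-vectorized.

def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the CPU feature flags Linux reports, or an empty set."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass  # not Linux, or file unreadable: report no known features
    return set()

def sum_avx512(values):
    return sum(values)  # placeholder for the AVX-512 inner loop

def sum_avx2(values):
    return sum(values)  # placeholder for the AVX2 fallback inner loop

def sum_scalar(values):
    return sum(values)  # placeholder for a plain scalar loop

def pick_kernel(flags):
    """Choose the widest code path that the reported flags allow."""
    if "avx512f" in flags:
        return sum_avx512
    if "avx2" in flags:
        return sum_avx2
    return sum_scalar

# Decided once; the rest of the program just calls `kernel`.
kernel = pick_kernel(read_cpu_flags())
print(kernel.__name__, kernel([1.0, 2.0, 3.0]))
```

Only the inner loop is duplicated; everything around it stays shared.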
mode_13h - Sunday, November 7, 2021 - link
I'm not even going to say this is a bad idea. The problem is that it's a big change and Intel normally prepares the software developer community for big new ISA extensions a year+ in advance!
Again, what you're talking about is an ABI change, which is a big deal. Not only that, but to require code to handle dynamically switching between AVX2 and AVX-512 paths means that it can't use different datastructures for each codepath. It even breaks the task pre-emption model, since there need to be some limitations on where the code needs to have all its 512-bit registers flushed so it can handle switching to the AVX2 codepath (or vice versa).
This adds a lot of complexity to the software, and places a greater testing burden on software developers. All for (so far) one CPU. It just seems a bit much, and I'm sure a lot of software companies would just decide not to touch AVX-512 until things settle down.
GeoffreyA - Sunday, November 7, 2021 - link
My view on this topic is that Intel made a sound decision disabling AVX512. Some of the comments are framing it as if they made a mistake, because the tech community discovered it was still there, but I don't see any problem. Only, the wording was at fault, this controversial "fused off" statement. And actually, the board makers are at fault, too, enabling a hidden feature and causing more confusion.
On the question of whether it's desirable, allowing one core with the instructions and another without, would've been a recipe for disaster---and that, too, for heaven knows what gain. The simplest approach was bringing both cores onto the same footing. Indeed, I think this whole P/E paradigm is worthless, adding complexity for minimal gain.
Oxford Guy - Monday, November 8, 2021 - link
‘Intel made a sound decision disabling AVX512’
That’s not what happened.
O-o-o-O - Sunday, November 7, 2021 - link
Really? Our tech guys tried out Xeon Phi but couldn't make use of it. Years later, Xeon Phi was abruptly discontinued due to lack of demand. GPGPUs are much easier to handle.
Yeah, coding cost and risks aside, it's interesting to see the complex work of art in a modern CPU. But I'd rather wish for expansion of GPU support (like shared memory and higher bandwidth).
kwohlt - Sunday, November 7, 2021 - link
My understanding is that Raptor Lake's change is replacing Golden Cove P cores with Raptor Cove P cores, doubling Gracemont E-Cores per SKU, and using the same Intel 7 process. Granted, it's all leaks at this point, but with Gracemont being reused for Raptor Lake, I don't expect AVX-512 next year either.
mode_13h - Monday, November 8, 2021 - link
> Raptor Lake's change is ... doubling Gracemont E-Cores ... using the same Intel 7 process.
I was merely speculating that this *might* just be a transient problem. If they're using the same process node for Raptor Lake, which seems very plausible, then it's understandable if they don't want to increase the size or complexity of their E-cores.
However, there's some precedent, in the form of Knights Landing, where Intel bolted on dual AVX-512 pipelines + SMT4 to a Silvermont Atom core. And with a more mature Intel 7 node, perhaps the yield will support the additional area needed for just a single pipe + 512-bit registers. And let's not forget how Intel increased the width of Goldmont, yet simply referred to it as Goldmont+.
So, maybe Raptor Lake will use Gracemont+ cores that are augmented with AVX-512. We can hope.
GURU7OF9 - Saturday, November 6, 2021 - link
This is by far the best review I have read so far.
A great comparison I would love to see, just out of curiosity, would be P core only benchmarks and then E core only benchmarks! We could gain a much better understanding of the capabilities and performance of both.
This would bring a little bit of familiarity back to benchmarking.
nunya112 - Saturday, November 6, 2021 - link
The only info provided was that it's on Intel's new process 7 node. What does that mean? Are they using TSMC at 7nm? Or did they finally crack 7nm at Intel?
mode_13h - Sunday, November 7, 2021 - link
"Intel 7" is the process node formerly known as "10 nm ESF" (Enhanced SuperFin), which is the 4th generation 10 nm process, counting by the revisions they've introduced between the different products based on it. They like to pretend that Cannon Lake didn't happen, but that's why Ice Lake was actually 10 nm+ (2nd gen).
They rebranded 10 nm ESF as "Intel 7" for marketing reasons, as explained here:
https://www.anandtech.com/show/16823/intel-acceler...
Hossein - Sunday, November 7, 2021 - link
It's funny that most reviewers are conveniently silent about the fact that there are quite a 'few' games which are incompatible with AL.
Kvaern1 - Sunday, November 7, 2021 - link
Because there are no games which are 'incompatible' with ADL.
eastcoast_pete - Sunday, November 7, 2021 - link
While AL is an interesting CPU (regardless of what one's preference is), I still think the star of AL is the Gracemont core (E cores), and I did some very simple-minded, back-of-a-napkin calculations. The top AL has 8 P cores with multithreading (= 16 threads) + 8 E-core threads (no multithreading here), for a total of 24 threads. According to first die shots, one P core requires the same die area as 4 E cores. That leaves me wanting an all-E core CPU with the same die size as the i9 AL, because that could fit 8x4 = 32 plus the existing 8 Gracemonts, for a total of 40. And the old problem of "Atoms can't do AVX and AVX2" is solved - because now they can! Yes, single thread performance would be significantly lower, but any workload that can take advantage of many threads should be at least as fast as on the i9. Does anyone here know if Intel is considering that? It wouldn't be the choice for gaming, but for productivity, it might give both the i9 and, possibly, the 5950x a run for the money.
mode_13h - Monday, November 8, 2021 - link
They currently make Atom-branded embedded server CPUs with up to 24 cores. This one launched last year, using Tremont cores:
https://ark.intel.com/content/www/us/en/ark/produc...
I think you can expect to see a Gracemont-based refresh, possibly with some new product lines expanding into non-embedded markets.
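For reference, the napkin math from earlier in this sub-thread (8 P cores with 2 threads each plus 8 E cores, and one P core taking roughly the die area of 4 E cores) can be written out as a few lines of Python. The area ratio is the die-shot estimate quoted above, not a measured figure, and the variable names are only illustrative.

```python
# Figures as quoted in the thread for the top Alder Lake part.
P_CORES = 8
E_CORES = 8
THREADS_PER_P_CORE = 2       # P cores have multithreading, E cores do not
E_CORES_PER_P_CORE_AREA = 4  # die-shot estimate quoted above (an assumption)

# Threads on the shipping hybrid layout.
hybrid_threads = P_CORES * THREADS_PER_P_CORE + E_CORES

# Hypothetical all-E-core part on the same die area:
# every P core is swapped for the E cores that fit in its footprint.
all_e_cores = P_CORES * E_CORES_PER_P_CORE_AREA + E_CORES

print("hybrid threads:", hybrid_threads)
print("all-E-core count:", all_e_cores)
```

This counts cores and threads only; it says nothing about per-thread speed, uncore area, or power.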
eastcoast_pete - Monday, November 8, 2021 - link
Yes, those Tremont-based CPUs are intended/sold for 5G cell stations; I hope that Intel doesn't just refresh those with Gracemont, but makes a 32-40 Gracemont core CPU available for workstations and servers. The one thing that might prevent that is fear (Intel's) of cannibalizing their Sapphire Rapids sales. However, if I were in their shoes, I'd worry more about upcoming AMD and multi-core ARM server chips, and sell all the CPUs they can.
mode_13h - Tuesday, November 9, 2021 - link
Well, it's a start that Intel is already using these cores in *some* kind of server CPU, no? That suggests they already should have some server-grade RAS features built-in. So, it should be a fairly small step to use them in a high core count CPU to counter the Gravitons and Altras. I think they will, since it should be more competitive in terms of perf/W.
As for workstations, I think you'll need to find a workstation board with a server CPU socket. I doubt they'll be pushing massive E-core -only CPUs specifically for workstations, since workstation users also tend to care about single-thread performance.
anemusek - Sunday, November 7, 2021 - link
Sorry, but performance isn't everything; +/- a few percent in the real world will not restore confidence. Critical flaws, disabled functionality (DX12 on Haswell, for example), unstable instruction features, etc.
I cannot afford to trust such a company.
Dolda2000 - Sunday, November 7, 2021 - link
I just wanted to add a big Kudos for this article. AnandTech's coverage of the 12900K was by a wide margin the best of any I read or watched, with regards to coverage of the various variables involved, and with the breadth and depth of testing. Thanks for keeping it up!
chantzeleong - Monday, November 8, 2021 - link
I run Power BI and TensorFlow with large datasets. Which Intel CPU do you recommend and why?
mode_13h - Tuesday, November 9, 2021 - link
I don't know about "Power bi", but Tensorflow should run best on GPUs. Which CPU to get then depends on how many GPUs you're going to use. If >= 3, then Threadripper. Otherwise, go for Alder Lake or Ryzen 5000 series.
You'll probably find the best advice among user communities for those specific apps.
velanapontinha - Monday, November 8, 2021 - link
We've seen this before. It is time to short AMD, unfortunately.
mode_13h - Tuesday, November 9, 2021 - link
Well, AMD does have V-Cache and Zen 3+ in the queue. But if you want to short them, be my guest!
Sivar - Monday, November 8, 2021 - link
This is an amazingly deep, properly Anandtech review, even ignoring time constraints and the unusual difficulty of this particular launch.
I bet Ian and Andrei will be catching up on sleep for weeks.
xhris4747 - Tuesday, November 9, 2021 - link
Hi
ricebunny - Tuesday, November 9, 2021 - link
It’s disappointing that Anandtech continues to use suboptimal compilers for their platforms. Intel’s Compiler classic demonstrated 41% better performance than Clang 12.0.0 in the SPECrate 2017 Floating Point suite.
mode_13h - Wednesday, November 10, 2021 - link
I think it's fair, though. Most workloads people run aren't built with vendor-supplied compilers, they use industry standards of gcc, clang, or msvc. And the point of benchmarks is to give you an idea of what the typical user experience would be.
ricebunny - Wednesday, November 10, 2021 - link
But are they not compiling the code for the M1 series chips with a vendor supplied compiler?
Second, almost all benchmarks in SPECrate 2017 Floating Point are scientific codes, half of which are in Fortran. That’s exactly the target domain of the Intel compiler. I admit, I am out of date with the HPC developments, but back when I was still in the game icc was the most commonly used compiler.
mode_13h - Thursday, November 11, 2021 - link
> are they not compiling the code for the M1 series chips with a vendor supplied compiler?
It's just a slightly newer version of LLVM than what you'd get on Linux.
> almost all benchmarks in SPECrate 2017 Floating Point are scientific codes,
3 are rendering, animation, and image processing. Some of the others could fall more in the category of engineering than scientific, but whatever.
> half of which are in Fortran.
Only 3 are pure fortran. Another 4 are some mixture, but we don't know the relative amounts. They could literally link in BLAS or some FFT code for some trivial setup computation, and that would count as including fortran.
https://www.spec.org/cpu2017/Docs/index.html#intra...
BTW, you conveniently ignored how only one of the SPECrate 2017 int tests is fortran.
mode_13h - Thursday, November 11, 2021 - link
Oops, I accidentally counted one test that's only SPECspeed.
So, in SPECrate 2017 fp:
3 are fortran
3 are fortran & C/C++
7 are only C/C++
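To put the "half of which are in Fortran" claim next to those counts, here is the arithmetic as a small Python sketch. The three counts come straight from the tally above; the dictionary keys are just labels chosen for this example.

```python
# Language breakdown of the SPECrate 2017 fp tests, as tallied above.
counts = {
    "fortran_only": 3,
    "fortran_and_c_cpp": 3,
    "c_cpp_only": 7,
}

total = sum(counts.values())
any_fortran = counts["fortran_only"] + counts["fortran_and_c_cpp"]

print(f"total tests: {total}")
print(f"pure Fortran: {counts['fortran_only'] / total:.0%}")
print(f"any Fortran:  {any_fortran / total:.0%}")
```

So "half" only holds if every mixed-language test is counted as Fortran, and even then it comes out slightly under half.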
ricebunny - Thursday, November 11, 2021 - link
Yes, I made the same mistake when counting.
Without knowing what the Fortran code in the mixed code represents I would not discard it as irrelevant: those tests could very well spend a majority of their time executing Fortran.
As for the int tests, the advantage of the Intel compiler was even more pronounced: almost 50% over Clang. IMO this is too significant to ignore.
If I ran these tests, I would provide results from multiple compilers. I would also consult with the CPU vendors regarding the recommended compiler settings. Anandtech refuses to compile code with AVX512 support for non Alder Lake Intel chips, whereas Intel’s runs of SPECrate2017 enable that switch?
xray9 - Sunday, November 14, 2021 - link
> At Intel’s Innovation event last week, we learned that the operating system
> will de-emphasise any workload that is not in user focus.
I see this as critical for audio applications, which need near-real-time performance.
It's already a pain to find well-behaved drivers that do not hold a CPU core for too long and so do not block processes with near-real-time demands.
And for performance tuning we already use the Windows option to prioritize background processes, which gives the process scheduler a longer, fixed time quantum, so it can work more efficiently on processes and reduce the number of context switches.
And now we get this hybrid design where everything gets out of control and you can only hope and pray that the process scheduling will not be too bad. I am not amused by that, and very skeptical that this will work out well.
mode_13h - Monday, November 15, 2021 - link
Do you know, for a fact, that the new scheduling policies override the priority-boost you mentioned? I wouldn't assume so, but I'm not saying they don't.
Maybe I'm optimistic, but I think MS is smart enough to know there are realtime services that don't necessarily have focus and wouldn't break that usage model.
ZioTom - Monday, November 29, 2021 - link
Windows 11 scheduler fails to allocate workloads...
I noticed that the scheduler parks the cores if the application isn't full screen.
I did a test on a 12700k with Handbrake: as long as the program window remains in the foreground, all the Pcore and Ecore are allocated at 100%. If I open a browser and use it while the movie is being compressed, the kernel takes the load off the Pcore and runs the video compression only on the Ecores. Absurd behavior, absolutely useless!
alpha754293 - Wednesday, January 12, 2022 - link
I've had my 12900K for a little less than a month now and here's what I've found from the testing that I've done with the CPU:
(Hardware notes/specs: Asus Z690 Prime-P D4 motherboard, 4x Crucial 32 GB DDR4-3200 unbuffered, non-ECC RAM (128 GB total), running CentOS 7.7.1908 with the 5.14.15 kernel)
IF your workload CAN be multithreaded and it can run on BOTH the P cores AND the E cores simultaneously, then there is a potential that you can have better performance than the 5950X. BUT if you CAN'T run your application on both the P cores and the E cores at the same time (which is the case for a number of distributed parallel applications that rely on MPI), then you WON'T be able to realise the performance advantages that having both said P cores and E cores would give you (based on what the benchmark results show).
And if your program, further, cannot use HyperThreading (some HPC/CAE programs will actually lock you out of doing so), then you can be upwards of anywhere between 63-81% SLOWER than the 5950X (because on the 5950X, even with SMT disabled, you can still run the programme on all 16 physical cores, vs. the 8 P cores on the 12900K).
Please take note.
alceryes - Wednesday, August 24, 2022 - link
Question.
Did you use 'affinities' for all the different core tests (P-core only, P+E-core tests)?
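For anyone unsure what setting affinities means here, a minimal Python sketch for Linux follows. It rests on an assumption that should be checked with `lscpu` on the actual machine: that on a 12900K the P-core threads are logical CPUs 0-15 and the E cores are 16-23. The helper names are hypothetical; `os.sched_getaffinity` and `os.sched_setaffinity` are standard library calls that exist only on some platforms, hence the guard.

```python
import os

def cpu_range(first, count):
    """Build a set of consecutive logical CPU numbers."""
    return set(range(first, first + count))

# ASSUMPTION: typical Linux enumeration on a 12900K; verify before relying on it.
P_CPUS = cpu_range(0, 16)   # 8 P cores x 2 hardware threads
E_CPUS = cpu_range(16, 8)   # 8 E cores, one thread each

def pin_current_process(cpus):
    """Restrict this process to `cpus`; return the set applied, or None."""
    if not hasattr(os, "sched_setaffinity"):
        return None                       # platform has no affinity API
    target = cpus & os.sched_getaffinity(0)
    if not target:
        return None                       # none of the requested CPUs exist here
    os.sched_setaffinity(0, target)
    return target
```

Calling `pin_current_process(P_CPUS)` before a benchmark would give a P-core-only run, `pin_current_process(E_CPUS)` an E-core-only run, and leaving affinity untouched the P+E case. From a shell, `taskset -c 0-15 <command>` has the same effect under the same numbering assumption.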