59 Comments
Alistair - Thursday, November 21, 2019 - link
Damn, this is why I come to Anandtech. Great post.
close - Monday, November 25, 2019 - link
If anything AT could now stand for AndreiTech. A few steps beyond just running the same benchmarks you can find on any other site or YouTube video.
satai - Thursday, November 21, 2019 - link
BTW, weren't Threadripper 3xxx reviews expected to appear on Nov 19?
Andrei Frumusanu - Thursday, November 21, 2019 - link
Nope, you'll have to wait a bit longer.
satai - Thursday, November 21, 2019 - link
Thanks. Is this date known or secret?
Ian Cutress - Thursday, November 21, 2019 - link
3000G was 19th.
satai - Thursday, November 21, 2019 - link
Doesn't sound threadrippery to me ;-) Does the embargo for TR reviews end at their release date?
nevcairiel - Thursday, November 21, 2019 - link
AMD generally doesn't do reviews before release at all; it's always on the same day.
satai - Friday, November 22, 2019 - link
Thanks.
ZoZo - Thursday, November 21, 2019 - link
I'm expecting no later than November 25, because that's the official date at which they go on sale. It could be sooner, but at this point waiting 1 day or 4 days won't make much of a difference.
royalartistmg - Friday, November 29, 2019 - link
nice
check out this https://www.royalartistmg.com/
Korguz - Friday, November 29, 2019 - link
why ??????
eek2121 - Thursday, November 21, 2019 - link
November 25th.
Jorgp2 - Thursday, November 21, 2019 - link
Wait, Windows still does core rotation? Wasn't that the main reason turbo didn't work with Vista?
Andrei Frumusanu - Thursday, November 21, 2019 - link
Gotta reset your prefetchers and retrain your branch predictors every now and then for fun.
eek2121 - Thursday, November 21, 2019 - link
It makes sense, especially as chips get smaller and move to a chiplet design. If you run only one core constantly, you create a hotspot on your CPU that even the heatspreader can't pull heat away from quickly enough. This actually leads to overall lower performance, because the entire die can heat up to the point where ALL cores slow down.
Smell This - Thursday, November 21, 2019 - link
In an odd way, **Ryzen Master** looks a bit like AMD OverDrive. That's a good thing.
I do likes me some fancy utilities and tools for optimizing performance. I used to luv the old 'PhenomMsrTweaker' for under-volting and over-clocking ...
edzieba - Thursday, November 21, 2019 - link
"Bodge rather than fix" seems like a running theme, e.g. the 20°C temperature reporting offset in order to boost fan speeds for certain models (rather than addressing that in the firmware) that even AMD forgot to account for in the early Ryzen Master releases!Dragonstongue - Thursday, November 21, 2019 - link
Thank you so much for the post and the very detailed information for those looking for clarification.
Cheers, all at AnandTech, much appreciated.
Granted, this is very much on AMD "overall"; hopefully in time they are that much better prepared and "orchestrated" to help themselves, and thereby all their customers as well as "us" the end consumers.
Kudos to the team at AMD as well; a rollercoaster ride, I am sure. Here is also hoping China and the USA can put aside whatever squabbles they currently have and break their very real impasse, so that we ALL can enjoy the best of EVERYTHING the world has to offer us "pesky human beings" before 2020 starts its engine.
Stay safe and warm, all (or safe and cool, depending on where you hail from).
(^.^)
FreckledTrout - Thursday, November 21, 2019 - link
I hope all of this just disappears with Zen 3 on 7nm+, hopefully because TSMC's 7nm+ process is mature enough to keep these huge variances in core capability to a minimum.
extide - Thursday, November 21, 2019 - link
Well, the fact that each CCD will be a single 8-core CCX will also help a lot with this.
guycoder - Thursday, November 21, 2019 - link
Unless you are unlucky enough to be running more than 8 cores :)
benedict - Thursday, November 21, 2019 - link
I would abandon all hope that Microsoft will fix their scheduler. The people who wrote it are no longer working there and no one knows exactly how it works now. Any tampering will most probably lead to breaking things that shouldn't be broken. The only good solution would be a complete rewrite, but I don't see Microsoft committing the resources and time for this, since it won't give them any benefit.
PeachNCream - Thursday, November 21, 2019 - link
The scheduler is stupid because it has not, until recently, been in a position to benefit from improvements that introduce awareness of AMD's CCX arrangement. As for Microsoft's ability to modify it, and the status of a certain set of employees who are capable of making those changes, I do not feel you are in a position to speak with authority on either.
PeachNCream - Thursday, November 21, 2019 - link
Typo in the table: the second column reads "Rryzen" and therefore has an extra letter "r" that is not necessary.
mikato - Thursday, November 21, 2019 - link
Wow, well finally this is better understood now. Thank you for actually asking the questions and getting some real info.
So for the Windows scheduler - I feel like there should be a tool in Windows, either manually run at the user's option or completely automatic and unseen, that allows you to run a small test workload with 1, 2, 3, X... thread workloads and determines the optimal core usage in each scenario. It then stores that config info however needed, to be used at a very basic level by the (improved) Windows scheduler. If that info exists, the scheduler loads it when starting and uses it.
Now it wouldn't be perfect, and maybe you'd get different results with shorter/longer workload runtimes, but it would be better.
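(For illustration only: a minimal sketch of what such a pre-test could look like, assuming the third-party psutil package for affinity control. The busy_work loop and iteration counts are made-up placeholders, not anything Microsoft or AMD actually ship; a real version would replay the application's own hot path and average several runs.)

```python
# Hypothetical "pre-test" sketch: time a small CPU-bound workload pinned to
# each logical core, then rank the cores. Assumes psutil is installed; on
# Linux, os.sched_setaffinity would work without the extra dependency.
import time
import psutil

def busy_work(iterations=2_000_000):
    # Toy integer workload; a real pre-test would mirror the app's own hot loop.
    acc = 0
    for i in range(iterations):
        acc += i * i
    return acc

def rank_cores():
    proc = psutil.Process()
    original = proc.cpu_affinity()          # remember the default affinity
    results = {}
    try:
        for core in original:
            proc.cpu_affinity([core])       # pin the process to one core
            busy_work(200_000)              # short warm-up so the core can boost
            start = time.perf_counter()
            busy_work()
            results[core] = time.perf_counter() - start
    finally:
        proc.cpu_affinity(original)         # always restore the original affinity
    return sorted(results.items(), key=lambda kv: kv[1])

if __name__ == "__main__":
    for core, seconds in rank_cores():
        print(f"core {core}: {seconds:.3f}s")
```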
jospoortvliet - Friday, November 22, 2019 - link
Problem is also that the Windows scheduler is pluggable, and games, for example, often have their own. And all of those also don't handle anything other than a normal Intel CPU very well... it is a mess.
gamerk2 - Friday, November 22, 2019 - link
The problem is different thread workloads have different performance profiles. There's really no clean way to handle this sort of situation.
You also make the classic mistake of assuming no other program is running in the background, which would immediately break your assumptions about thread management. That's why stuff like this really has to be handled automatically by the OS.
mikato - Tuesday, November 26, 2019 - link
mikato - Tuesday, November 26, 2019 - link
Well hey, so yes, I provided a simple view there, and I can imagine things are quite complicated. But I didn't say it shouldn't be handled automatically by the OS.
I still think it could be useful to have some sort of pre-testing. Imagine 90% of a particular CPU's time is spent in one app's workload and the other 10% is spent in everything else. Maybe it's some distributed computing thing. Run some tests trying different configurations for placing threads on the cores/CCXs/etc. It will perform better in some situations than others, and that could really save computing time. If both the app and the OS are witness to this result, then I don't see why the OS shouldn't use that info.
Sure that is a specific example, and I definitely don't know how the details and responsibilities would be worked out for the pre-testing and what could be done to generalize it. But it shows it can be useful, right? All it takes is one example to show that.
The above commenter jospoortvliet even said that the Windows scheduler is pluggable and games can have their own setup. If an app can suggest its preferred scheduling to Windows, then that tells me it's already a short way down this path. Perhaps the app could do its own pre-testing before suggesting its preferred scheduling, and/or perhaps it could evaluate on the fly and update its suggestion periodically throughout a workload.
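(Again purely illustrative: today an app "suggesting" its own placement basically means setting its own affinity. A rough sketch with psutil, where the logical-CPU list for the first CCX is an assumption that varies by SKU and SMT layout:)

```python
# Hypothetical: an application keeping its worker threads on one CCX to avoid
# cross-CCX cache traffic. The CPU list below is an assumption for
# illustration; real code would discover the topology first.
import psutil

FIRST_CCX = [0, 1, 2, 3, 4, 5]   # assumed logical CPUs of the first CCX on a 3600/3900X-style part

def prefer_first_ccx():
    proc = psutil.Process()
    available = set(proc.cpu_affinity())
    wanted = [cpu for cpu in FIRST_CCX if cpu in available]
    if wanted:                    # only narrow the affinity if those CPUs exist
        proc.cpu_affinity(wanted)
    return proc.cpu_affinity()

if __name__ == "__main__":
    print("now running on logical CPUs:", prefer_first_ccx())
```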
bcronce - Thursday, November 21, 2019 - link
Getting a "perfect is the enemy of good" vibe from this whole turbo-maximizing issue. Core X is 0.5% faster than Core Y; let's complicate the scheduling to fix this injustice. Not that any of this helps when the system is at 100% load, since all cores are in use anyway.
LarsBars - Thursday, November 21, 2019 - link
The CCX behavior you see in Windows actually changes based on the Windows Power Plan. There were some threads a while ago about how, under the high-performance type power plans, all eight cores seem pretty equally utilized. If you switch to a Power Saver plan, the workloads really get pushed onto one CCX, presumably to shut the other CCX down to save CPU power.
Assimilator87 - Thursday, November 21, 2019 - link
So AMD is obfuscating the "best electrical cores" info because people kept whining about misinterpreted data?
extide - Friday, November 22, 2019 - link
No, because you get better real-world performance this way.
Elfear - Thursday, November 21, 2019 - link
Excellent article, Andrei. Very informative and in depth.
shabby - Thursday, November 21, 2019 - link
I wonder if the Intel article below will be put up top...
CyrIng - Thursday, November 21, 2019 - link
Can we get the CPPC2 specs? Would like to add it into CoreFreq.
Supercell99 - Thursday, November 21, 2019 - link
Someone post the TL;DR version. I ain't reading all that; is AMD lying or not?
jospoortvliet - Friday, November 22, 2019 - link
Well, yes and no. They have to give wrong data to the Windows scheduler because it is so bad. They will now sadly adjust their software so it also gives you wrong info... sigh.
peevee - Thursday, November 21, 2019 - link
I think you, tech journalists/testers, are responsible for a large part of this mess. Let me explain.
In a modern system, there are thousands of threads at any given time. They come in 4 flavors:
1) The overwhelming majority are performance-irrelevant threads of various purposes, either waiting on some mutex/event/semaphore (for example, a hardware event) or at worst on a spin lock, and executing for just a few ms between locks. They almost don't matter for performance, and the best OS strategy is to allocate them to some slow, energy-efficient cores when they wake up.
2) Quick, bursty single threads executing for more than, say, 100ms (so their execution would be noticeable to a user) but not more than a couple of seconds; their performance is important for the responsiveness of applications. They should be allocated to the best core and run at full speed, period. Long-term thermals do not matter. And don't you touch it; leave it be, on the fastest available core.
3) Performance-critical applications running at full thread load, where the # of such threads is a multiple of the # of logical CPUs in the system (usually 1 thread per application per logical CPU). ALL modern applications which care about performance at all operate in this mode. Which thread is allocated where does not matter AS LONG AS THE THREADS are kept in the same place and not thrown around, because of cache and non-uniform memory considerations.
4) Long-running single threads coming from old non-optimized applications or, more often, from FREAKING TESTS. And unfortunately this is what passes for "single thread performance" with you guys. And vendors are forced to optimize for this BS.
Stop it. Just stop. Test single-thread performance where it matters - burst speed, lasting for ~1 second. Like loading a page with a large JavaScript framework attached (as a single file after concatenation/minifying of course).
All tools able to run in multi-threaded mode should ONLY be run in multi-threaded mode. Everything else is misinforming customers by presenting as important something which is never or almost never going to be encountered in real life.
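(For what it's worth, a minimal sketch of the kind of ~1-second burst test being described; the toy workload and the one-second window are placeholders, not a proposed standard:)

```python
# Hypothetical ~1 second "burst" single-thread test, per the suggestion above:
# measure how much work one thread gets done in a short, responsiveness-sized
# window instead of running it flat out for minutes.
import time

BURST_SECONDS = 1.0   # assumed burst length; roughly "page load with heavy JS" territory

def burst_score(seconds=BURST_SECONDS):
    deadline = time.perf_counter() + seconds
    iterations = 0
    acc = 0
    while time.perf_counter() < deadline:
        # Toy integer work standing in for parsing/JIT-style activity.
        for i in range(10_000):
            acc = (acc + i * i) % 1_000_003
        iterations += 10_000
    return iterations      # higher = more work completed inside the burst window

if __name__ == "__main__":
    print("burst score:", burst_score())
```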
asmian - Thursday, November 21, 2019 - link
Thanks for this moment of sanity.
Gigaplex - Friday, November 22, 2019 - link
Sigh. Yet another case of hardware/firmware hard-coding behaviour and abusing specifications to cater for the current version of Windows. This totally won't cause any problems on non-Windows operating systems, or future/past versions of Windows...
mat9v - Friday, November 22, 2019 - link
Problems? No. Minimally lower performance on systems other than Windows? Probably, but not necessarily - while the default info supplied by UEFI will change, there is no reason for the CPU driver not to use additional information (which will probably still be available). AMD is planning to change the default source of the info, not remove the core quality info completely. I suppose Linux will still be able to use and account for core quality (because it does not use core rotation).
What Windows is lacking is the ability to disable core rotation on desktop and server systems - the user should have the ability to choose. My cooling solution can keep a 3900X in ST applications below 60°C even overclocked to 4.6GHz; only when 6 or more threads are running at full speed does heat become a problem and cross the 60°C threshold. Of course that does not count AVX2 loads - they are a different kind of beast ;)
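(Side note, as a rough sketch: on a Linux box whose kernel and BIOS expose ACPI CPPC, you can already peek at the per-core quality ranking yourself. The sysfs path below is the standard acpi_cppc location, but whether it is populated depends on the platform and firmware:)

```python
# Rough sketch: rank logical CPUs by their CPPC "highest_perf" value on Linux.
# Assumes the kernel exposes /sys/devices/system/cpu/cpu*/acpi_cppc/; if the
# directory is missing, the platform/BIOS isn't reporting CPPC to the OS.
import glob
import re

def cppc_ranking():
    scores = {}
    for path in glob.glob("/sys/devices/system/cpu/cpu[0-9]*/acpi_cppc/highest_perf"):
        cpu = int(re.search(r"cpu(\d+)", path).group(1))
        with open(path) as f:
            scores[cpu] = int(f.read().strip())
    # Higher highest_perf = the firmware considers that core "better".
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    for cpu, perf in cppc_ranking():
        print(f"cpu{cpu}: highest_perf={perf}")
```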
jospoortvliet - Friday, November 22, 2019 - link
Windows should just move to the linux kernel... better performance, less mess.
gamerk2 - Friday, November 22, 2019 - link
"Windows should just move to the linux kernel... better performance, less mess."The linux kernel is a mess of a design; from everything I've seen, Windows is architected better.
The problem is the scheduler on Windows still treats all cores as equal; it's blind to things like CCX hopping and its L2/L3 cache effects. What Windows needs is some way to get core loading characteristics back to the scheduler, so it can better handle these types of cases.
gamerk2 - Friday, November 22, 2019 - link
You *can* override core affinity, and turn off cores for an application if you so desire.
Of course, nothing is stopping some other application from doing the same, leading to more performance oddities...
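(A quick sketch of doing that override programmatically rather than through Task Manager, assuming psutil; the PID and CPU list are placeholders:)

```python
# Hypothetical: narrowing another process's core affinity from the outside
# (what Task Manager's "Set affinity" does). PID and CPU list are placeholders.
import sys
import psutil

def restrict_process(pid, cpus):
    proc = psutil.Process(pid)
    before = proc.cpu_affinity()
    proc.cpu_affinity(cpus)          # may require admin rights for other users' processes
    return before, proc.cpu_affinity()

if __name__ == "__main__":
    target_pid = int(sys.argv[1])            # e.g. a game's PID
    before, after = restrict_process(target_pid, [0, 1, 2, 3])
    print(f"affinity changed from {before} to {after}")
```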
eva02langley - Friday, November 22, 2019 - link
It's great to see a real article and not a plain attempt at clickbait or propaganda.
techglitch - Friday, November 22, 2019 - link
So I understand that for the generic API/protocol they're surfacing information about the cores in a way that is ideal for MS Windows, to compensate for the way its CPU thread scheduler works; that makes sense by market share. My understanding is that Linux should actually use the information from the AMD proprietary API instead, because Linux's CPU scheduler is different. If Linux's kernel trusted the generic information, it would actually hurt performance a little.
Maybe for AMD's EPYC line of CPUs, though, they should use the generic protocol to surface the RIGHT information, because EPYC CPUs are mostly for servers, and the majority market share for servers is Linux.
So if they're catering to market share, then their EPYC CPUs shouldn't use the same logic designed to "optimize" for Windows' CPU scheduler, but should instead optimize for Linux's CPU scheduler.
vamsi.vadrevu2000 - Sunday, November 24, 2019 - link
Now that is an excellent article. I couldn't find this info anywhere else, and I always wondered why Ryzen Master was showing me the best-performing core.
umano - Monday, November 25, 2019 - link
In my opinion, there is a lot of fuss about Ryzen 3000's peak frequency. I am not saying that it's not an important topic, and I loved the article, but there are more relevant areas of improvement for Ryzen 3000. We are too accustomed to Intel architectures; it will take time for the market to "culturally" adjust. One thing is for sure: the Windows scheduler needs to be better.
urbanman2004 - Monday, December 16, 2019 - link
3700X FTW 😅