AMD's 3rd generation Opteron versus Intel's 45nm Xeon: a closer look
by Johan De Gelas on November 27, 2007 6:00 AM EST - Posted in IT Computing
3ds Max 9 (32-bit Windows)
We tested with the 32-bit version of 3ds Max version 9, which has improvements that help multi-core systems but which is not as aggressively tuned for SSE as LINPACK and zVisuel. We used the "architecture" scene, which has been a favorite benchmarking scene for years. We performed all tests with 3ds Max's default scanline renderer, we enabled SSE support, and we rendered at HD 720p (1280x720) resolution. We measured the time it takes to render ten frames (frames 20 to 29).
As promised, we profiled our different benchmarks to understand them better. We performed profiling with AMD's CodeAnalyst; VTune profiling will follow later. 3ds Max runs four modules when you render:
- Render (46% of the time)
- Ray-FX (28%)
- Geometry (15%)
- Core (11%)
To keep things simple, we summarized our findings with a weighted average over all modules.
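The weighting step can be sketched in a few lines of Python. This is only a minimal illustration, assuming the weights are the per-module time shares listed above; the function name and the per-module values in the example are hypothetical placeholders, not measured data.

```python
# Share of render time spent in each 3ds Max module (from the list above).
MODULE_WEIGHTS = {"Render": 0.46, "Ray-FX": 0.28, "Geometry": 0.15, "Core": 0.11}


def weighted_average(per_module, weights=MODULE_WEIGHTS):
    """Average a per-module metric, weighting each module by its time share."""
    total_weight = sum(weights[name] for name in per_module)
    weighted_sum = sum(per_module[name] * weights[name] for name in per_module)
    return weighted_sum / total_weight


# Placeholder values for illustration only (NOT profiling results).
example_metric = {"Render": 10.0, "Ray-FX": 20.0, "Geometry": 30.0, "Core": 40.0}
print(weighted_average(example_metric))
```

Dividing by the sum of the weights keeps the result correct even if only a subset of the modules is passed in.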
3dsmax Profiling | |
Profile | Total |
Average IPC (on AMD 2350) | 1 |
Instruction mix | |
Floating Point | 39% |
SSE | 12% |
Branches | 13% |
L1 datacache ratio | 0.56 |
L1 Instruction ratio | 0.27 |
Performance indicators on Opteron 2350 | |
Branch misprediction | 6% |
L1 datacache miss | 1% |
L1 Instruction cache miss | 5% |
L2 cache miss | 0% |
As you can see, 3ds Max is mostly about floating-point performance with a bit of SSE instructions. It runs perfectly in the L1 and L2 cache of our CPUs. To make the graph easier to read we did not report our results in the classic way (rendering time) but expressed them in images rendered per hour (10 images * 3600 seconds divided by render time). Higher is therefore better.
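The conversion from render time to images per hour is simple enough to write down. The sketch below assumes the ten-frame run described earlier; the function name and the sample render time are hypothetical, chosen only to show the arithmetic.

```python
def images_per_hour(render_time_seconds, frames=10):
    """Convert the time needed to render `frames` images into images per hour."""
    if render_time_seconds <= 0:
        raise ValueError("render time must be positive")
    return frames * 3600 / render_time_seconds


# Hypothetical render time for illustration: ten frames in 600 seconds.
print(images_per_hour(600))  # 60.0
```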
The Xeon 5472 is about 8% faster than its older brother and widens its lead over the AMD armada. We included quite a few results from older tests. This benchmark focuses on the CPU; chipset and RAM choices don't impact performance much. Interestingly, the Opteron 2350 is about as fast as four 2.4GHz single-core Opterons. Thus, in software with a "small dash" of SSE, the new architecture is about 20% faster. If we extrapolate our AMD quad-core results to 3GHz, the result would be about 59 images per hour, which indicates that AMD's newest is about 10% slower than Intel clock for clock. That is no real surprise anymore: FLOPS showed us that the raw x87 FP and SSE power of AMD's latest architecture is slightly lower than that of the newest Xeon. It can also only overpower the Xeon 53xx if enough divisions are involved. AMD's Barcelona architecture will only show a real advantage in bandwidth-limited FP situations such as SPECfp2006 and many HPC applications.
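The clock-for-clock comparison rests on a linear extrapolation, which can be sketched as follows. This assumes performance scales in direct proportion to clock speed, which is plausible here only because the benchmark runs out of cache; the function names and all input numbers are hypothetical and are not the measured results.

```python
def scale_to_clock(score, measured_ghz, target_ghz):
    """Linearly scale a throughput score from one clock speed to another."""
    if measured_ghz <= 0 or target_ghz <= 0:
        raise ValueError("clock speeds must be positive")
    return score * target_ghz / measured_ghz


def percent_slower(score, reference):
    """Return how much slower `score` is than `reference`, in percent."""
    return (1 - score / reference) * 100


# Placeholder numbers for illustration only (NOT benchmark data):
# a score of 40 at 2.0GHz scaled to 3.0GHz, compared with a reference of 66.
scaled = scale_to_clock(40.0, measured_ghz=2.0, target_ghz=3.0)
print(scaled, round(percent_slower(scaled, 66.0), 1))
```

Linear scaling is an optimistic upper bound in general: workloads that miss the caches rarely gain the full clock speed increase.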
43 Comments
aeternitas - Thursday, December 13, 2007
Then why are you here? Details is what technology is about! I for one have a pet peeve with tech sites that use the wrong formats in their stories. Slightly damages credibility. Not to say this is a big deal in this case, though .gif is pretty much dead, unless you use an old browser on old tech, but then why would you be reading this story?
Look on the bright side, at least this isn't a Codec vs. Codec story, where the author uses jpgs for such color-limited screenshots.
SonicIce - Tuesday, November 27, 2007
I think the color depth was decreased a lot more than 8 bit. That image only has 33 unique colors in it. Something went wrong with the dithering maybe? 256 is usually more than enough.
Justin Case - Friday, November 30, 2007
Who cares? The only part that suffers is the gradient at the top, all the relevant information is there, and this file is about half the size of what a PNG would be.