The Secret Boost of the Opteron 2224
Socket F Opterons have a small secret weapon: the latest speed bump offers more than just a faster clock. To understand why, take a look at the table below, where we measured L2 cache bandwidth with Lavalys Everest 3.51.
Lavalys Everest 3.51 L2 Bandwidth
CPU | Read (MB/s) | Write (MB/s) | Copy (MB/s)
Dual Xeon 5160 3.0 GHz | 22019 | 17751 | 23628
Xeon E5345 2.33 GHz | 17610 | 14878 | 18291
Opteron 2224 SE 3.2 GHz | 14636 | 12636 | 14630
Opteron 8218 HE 2.6 GHz | 11891 | 10266 | 11891
The L2 cache of the Opteron 8218 at 2.6GHz is slower than the Core 2's L2 cache at 2.33GHz. At about 10-11 GB/s it barely matches the theoretical peak bandwidth that dual-channel DDR2-667 can deliver (10.6 GB/s), while its exclusive nature also forces it to exchange quite a bit of data with the L1 cache. Now combine this table with the memory bandwidth measurements below.
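For reference, the 10.6 GB/s figure is simply the rated peak of two DDR2-667 channels; here is a quick back-of-the-envelope check in C. The transfer rate and bus width below are the standard DDR2 specifications, not anything measured on our systems.

```c
#include <stdio.h>

/* Rough check of the theoretical DDR2-667 peak quoted above:
 * 64-bit channel x ~667 million transfers/s x 2 channels. */
int main(void) {
    double transfers_per_sec  = 666.67e6;  /* DDR2-667: ~666.67 million transfers/s */
    double bytes_per_transfer = 8.0;       /* 64-bit wide channel                   */
    int    channels           = 2;         /* dual-channel configuration            */

    double peak_gbs = transfers_per_sec * bytes_per_transfer * channels / 1e9;
    printf("Per channel : %.1f GB/s\n", peak_gbs / channels);  /* ~5.3 GB/s (PC2-5300) */
    printf("Dual channel: %.1f GB/s\n", peak_gbs);             /* ~10.6-10.7 GB/s      */
    return 0;
}
```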
Lavalys Everest 3.51 Memory Bandwidth
CPU | Read (MB/s) | Write (MB/s) | Copy (MB/s) | Latency (ns)
Dual Xeon 5160 3.0 GHz | 3656 | 2771 | 3800 | 112.2
Xeon E5345 2.33 GHz | 3578 | 2793 | 3665 | 114.9
Opteron 2224 SE 3.2 GHz | 7466 | 6980 | 6863 | 58.9
Opteron 8218 HE 2.6 GHz | 6944 | 6186 | 5895 | 64.0
It is no secret that a higher clocked integrated memory controller can increase the actual delivered bandwidth of the same DDR2 modules. But it also helps that the L2 cache is able to swallow the bandwidth that the memory is capable of delivering. Also notice that without the use of SSE2 instructions, the memory subsystem of the 5000P chipset delivers relatively disappointing amounts of bandwidth. As most applications do not use carefully tuned SSE2 code to get data from memory, this should reflect the real-world situation most of the time. And of course, until Intel introduces the Nehalem family, memory latency will remain one of AMD's strong points.
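To give an idea of what "carefully tuned SSE2 code" means here, below is a minimal sketch contrasting a plain copy loop with one that uses SSE2 non-temporal stores. The buffer size and function names are purely illustrative, and this is not the routine Everest itself runs.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Plain copy loop: what most compiled code does; at best 8 bytes per iteration,
 * with every store going through the cache hierarchy. */
static void copy_scalar(uint64_t *dst, const uint64_t *src, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* SSE2 copy with non-temporal (streaming) stores: 16 bytes per instruction,
 * and the stores bypass the cache on their way to memory. */
static void copy_sse2(void *dst, const void *src, size_t bytes) {
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    for (size_t i = 0; i < bytes / 16; i++) {
        __m128i v = _mm_load_si128(&s[i]);  /* aligned 128-bit load       */
        _mm_stream_si128(&d[i], v);         /* non-temporal 128-bit store */
    }
    _mm_sfence();                           /* flush the write-combining buffers */
}

int main(void) {
    size_t bytes = 64 * 1024 * 1024;        /* 64 MB: far bigger than any L2 here */
    void *src, *dst;
    if (posix_memalign(&src, 16, bytes) || posix_memalign(&dst, 16, bytes))
        return 1;
    memset(src, 1, bytes);

    copy_scalar(dst, src, bytes / 8);       /* time this...    */
    copy_sse2(dst, src, bytes);             /* ...against this */

    free(src);
    free(dst);
    return 0;
}
```

The streaming stores skip the cache entirely, which is typically how hand-tuned copy routines squeeze more bandwidth out of the FSB; ordinary compiled code rarely does this, which is why the plain numbers above are the more representative ones.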
Processor Latency Comparison
CPU | L1 (cycles) | L2 (cycles) | L3 (cycles) | Min mem (cycles) | Max mem (cycles) | Absolute latency (ns)
Xeon 5160 3.0 - DDR2 533 | 3 | 14 | - | 69 | 380 | 127
Xeon 5160 3.0 - DDR2 667 | 3 | 14 | - | 67 | 338 | 113
Core 2 Duo 2.933 - DDR2 533 | 3 | 14 | - | 67 | 180 | 61
Quad Xeon E5345 2.33 - DDR2 533 | 3 | 14 | - | 80 | 280 | 120
Quad Xeon E5345 2.33 - DDR2 667 | 3 | 14 | - | 80 | 271 | 116
Xeon 7130M 3.2 - DDR2 400 | 4 | 29 | 109 | 245 | 624 | 195
Opteron 880 2.4 - DDR333 | 3 | 12 | - | 84 | 228 | 95
Opteron 2224 SE - DDR2 667 | 3 | 12 | - | 72 | 189 | 59
Opteron 2218 HE - DDR2 667 | 3 | 12 | - | 62 | 157 | 60
The latency penalty that FB-DIMMs introduce is huge. To get an idea, we added the latency measured with a Core 2 Duo 2.933 using 2x 2GB of DDR2-533. The staggering conclusion is that FB-DIMMs add - in the worst case - about 200 cycles or 66ns of latency. Sure, some of that latency can be attributed to the buffering which is necessary for server memory: buffered memory contains registers which hold the data for a full clock cycle before it is passed on. But that means ordinary registered memory should only add about 8ns (two clock cycles at the 266MHz base clock of DDR2-533); the rest of the penalty is the price of the FB-DIMM approach itself.
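As a sanity check on that arithmetic, the short program below simply reproduces the relevant numbers from the latency table; the cycle counts and clock speeds are the ones listed above, nothing new is measured.

```c
#include <stdio.h>

int main(void) {
    /* Worst-case memory latency taken from the table above (cycles, GHz). */
    double xeon_cycles  = 380.0, xeon_ghz  = 3.0;    /* Xeon 5160 with FB-DIMMs        */
    double core2_cycles = 180.0, core2_ghz = 2.933;  /* Core 2 Duo with plain DDR2-533 */

    double xeon_ns  = xeon_cycles  / xeon_ghz;       /* ~127 ns */
    double core2_ns = core2_cycles / core2_ghz;      /* ~61 ns  */
    printf("Xeon 5160 3.0 GHz with FB-DIMMs   : %.0f cycles = %.0f ns\n", xeon_cycles, xeon_ns);
    printf("Core 2 Duo 2.933 GHz with DDR2-533: %.0f cycles = %.0f ns\n", core2_cycles, core2_ns);
    printf("FB-DIMM penalty: ~%.0f cycles, ~%.0f ns\n",
           xeon_cycles - core2_cycles, xeon_ns - core2_ns);  /* ~200 cycles, ~65-66 ns */

    /* The register on an ordinary RDIMM only holds commands for about two cycles
     * of the 266 MHz DDR2-533 base clock, i.e. roughly the 8 ns mentioned above. */
    double register_ns = 2.0 / 0.266;                /* ~7.5 ns */
    printf("Register hold time: ~%.1f ns\n", register_ns);
    return 0;
}
```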
The secondary benefit of FB-DIMMs is that motherboards can use more DIMMs per channel, potentially increasing total memory capacity. However, AMD already gets around this quite easily with up to eight DIMM sockets per CPU socket, so this benefit rarely materializes in any meaningful form. The bottom line is that while FB-DIMMs may have been a good idea from a purely theoretical point of view, it is rather obvious that in practice they have some pretty unpleasant consequences.
30 Comments
2ManyOptions - Monday, August 6, 2007 - link
... for most of the benchmarks Intel chips performed better than the Opterons, don't know why Intel should get scared by these, they can safely wait for Barcelona. Didn't really understand why you have put it as AMD is still in the game with these in the 4S space.
baby5121926 - Monday, August 6, 2007 - link
Intel got scared because they don't want to see the real result from AMD + ATI. The longer Intel lets AMD live, the more danger Intel will be in.
That's why you guys can see Intel attacking AMD really, really hard right now... just to kick AMD out of the game.
Justin Case - Monday, August 6, 2007 - link
What are the units in the WinRAR results table?
coldpower27 - Monday, August 6, 2007 - link
Check Intel's own pricing lists, and you will see that Intel has already pre-empted some of these cuts with their Xeon X5355 at $744 or Xeon E5345 at $455, and the "official" Xeon X5365 should be out soon if not already... http://www.intel.com/intel/finance/pricelist/proce...
TheOtherRizzo - Monday, August 6, 2007 - link
I know nothing about 4S servers. But what's the essence of this article? Surely not that NetBurst is crap? We've known that for years. Is the real story here that Intel doesn't really give a s*** about 4S, otherwise they would have moved on to the Core 2 architecture long ago? Just guessing.
coldpower27 - Monday, August 6, 2007 - link
The Xeon 7300 series, based on the Tigerton core which is a 4-socket capable Kentsfield/Clovertown derivative, is arriving in September this year, so Intel does care about becoming more competitive in the 4S space, it is just taking some time. They decided to concentrate on the high-volume 2S sector first is all; since Intel has massive capacity, going for the high-volume sector first makes sense.
mino - Monday, August 13, 2007 - link
Yes and no; actually, having two Intel quads running on a single FSB was a serious technical problem. Therefore they had to wait for a 4-FSB chipset to be able to get them out the door. Not to mention the qualification times, which are a bit longer for 4S platforms than 2S.
AMD does not have these obstacles, as the 8xxx series are essentially the 2xxx series from a stability/reliability POV.
Calin - Monday, August 6, 2007 - link
The 5160 processor is a Core 2 unit, not a NetBurst one. Also, the 5345 is a quad core based on Core 2.
People built 3.0GHz - 3.33GHz E4300 & E4400 systems six months ago that cost roughly $135 for the CPU. Others went for an E6300 or more recently an E6320, both again under $200. They were all relatively easy overclocks.
Why does anyone with any skill in building their own computer care about an $800+ CPU again?
Calin - Monday, August 6, 2007 - link
Why don't Ford Mustangs use a small engine, overclocked to hell? Like an inline-4 2.0l with a turbo and high rpm, instead of their huge 4+ liter engines? Why do trucks use those big engines, when they could get the same power from a smaller, gasoline, turbocharged engine?
People pay $800+ for processors that work in multiprocessor systems (your run-of-the-mill Athlon64 or E4300 won't). Also, they use error checking (and usually error correcting) memory in their systems - again, the Athlon64 doesn't do this. They also use registered DDR in order to access more memory banks - your Athlon64 again falls short. On the E4300 side, the chipset is responsible for those things, so you could use such a processor in a server chassis - if the socket fits.