L2 Cache

  #1  
Old 08-19-2008, 09:18 PM
Jerome H. Fine
Guest
 
Default L2 Cache

Since the use of L1 and L2 cache seems to be much more on topic, I
am hopeful that the following are valid questions

I am about to attempt to figure out how the L1 and L2 cache are used
on the Intel Q9450 core 2 quad package.

The specifications state that there are 12 MB of L2 cache. In just
a few places, it is acknowledged that 6 MB are allocated to each
CPU pair of the 4 CPUs.

What is missing is how the conflict is resolved if two CPUs from
different pairs read and then write (differently of course) to the same
memory location followed by a read of that same memory location.

The other question concerns L1 cache. In a few places, the
technical specifications seem to state that there is separate
L1 instruction and data cache for each of the 4 CPUs. Again,
nothing seems to be available as to how conflicts are resolved
when the same memory location is read and then written (again
differently) by different CPUs at the same time and again
followed by a read.

While I am confident that Intel must have solved the problem,
what is missing is the time penalty imposed by the solution,
if any.

Can anyone help with a few suggestions or else specify a link
or manual that provides some answers?

Sincerely yours,

Jerome Fine

  #2  
Old 08-20-2008, 02:21 AM
Hendrik van der Heijden
Guest
 
Default Re: L2 Cache

Jerome H. Fine wrote:
> Since the use of L1 and L2 cache seems to be much more on topic, I
> am hopeful that the following are valid questions
>
> I am about to attempt to figure out how the L1 and L2 cache are used
> on the Intel Q9450 core 2 quad package.
>
> The specifications state that there are 12 MB of L2 cache. In just
> a few places, it is acknowledged that 6 MB are allocated to each
> CPU pair of the 4 CPUs.
>
> What is missing is how the conflict is resolved if two CPUs from
> different pairs read and then write (differently of course) to the same
> memory location followed by a read of that same memory location.


Google "MESI/MOESI protocol". Basically, a CPU can only write
to memory locations ("cache lines") that it holds in its own cache,
and before writing, it asks all other CPUs' caches to invalidate
their copies (if any) of that cache line.
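
A minimal sketch of that invalidate-before-write rule, as a toy Python
model. It is an illustration only, not Intel's implementation: the class
and method names are made up, it tracks a single cache line, and it
assumes a plain MESI protocol in which requests arrive one at a time
(on real hardware, something has to order two "simultaneous" writes, so
one of them always comes first). The scenario at the bottom is the one
asked about: both CPUs read, both write different values, then one reads.

```python
from enum import Enum

class State(Enum):
    M = "Modified"    # only copy, newer than memory
    E = "Exclusive"   # only copy, same as memory
    S = "Shared"      # other caches may hold it too
    I = "Invalid"     # no usable copy

class Cache:
    """One CPU's view of a single cache line (hypothetical toy model)."""
    def __init__(self, bus):
        self.bus = bus
        self.state = State.I
        self.value = None

    def _others(self):
        return [c for c in self.bus.caches if c is not self]

    def read(self):
        if self.state is State.I:          # read miss
            sharers = False
            for c in self._others():
                if c.state is State.M:     # dirty copy elsewhere: write it back
                    self.bus.memory = c.value
                if c.state is not State.I: # any other holder drops to Shared
                    c.state = State.S
                    sharers = True
            self.value = self.bus.memory
            self.state = State.S if sharers else State.E
        return self.value

    def write(self, value):
        if self.state in (State.I, State.S):
            # Must gain ownership first: every other copy is invalidated.
            for c in self._others():
                if c.state is State.M:
                    self.bus.memory = c.value
                c.state = State.I
                c.value = None
        # In E or M nobody else holds the line, so the write is silent.
        self.state = State.M
        self.value = value

class Bus:
    """Shared memory word plus the caches that snoop on it."""
    def __init__(self, n_caches, initial=0):
        self.memory = initial
        self.caches = [Cache(self) for _ in range(n_caches)]

bus = Bus(n_caches=2, initial=7)
cpu0, cpu1 = bus.caches
cpu0.read(); cpu1.read()   # both now hold the line Shared
cpu0.write(1)              # cpu1's copy is invalidated
cpu1.write(2)              # cpu0's Modified copy is written back, then invalidated
final = cpu0.read()        # miss: cpu1's newer value is written back and shared
print(final, cpu0.state.name, cpu1.state.name)   # prints: 2 S S
```

The point of the model is that the final read can never return a stale
value: whichever write was ordered last owns the line, and the reader
has to fetch it from there.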

> While I am confident that Intel must have solved the problem,
> what is missing is the time penalty imposed by the solution,
> if any.


When you're doing it wrong (writing to the same memory location
from different CPUs all the time), the penalty is quite high.
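
To see why, it helps to count how often a line has to move between
caches. The sketch below is a toy count, not a benchmark: it assumes
64-byte cache lines and a simplified rule that a line has exactly one
owner after a write. The function name and the access patterns are
invented for illustration. Note that two CPUs writing to different
variables in the same line pay the same price as two CPUs writing to
the same variable.

```python
LINE_SIZE = 64  # bytes per cache line (assumption for this sketch)

def count_transfers(writes):
    """Count how often a line must change owner.

    writes: iterable of (cpu, byte_address) pairs, in the order the
    writes are performed. Each change of owner stands for one
    invalidate-and-refetch round trip between caches.
    """
    owner = {}       # line number -> CPU that wrote it last
    transfers = 0
    for cpu, addr in writes:
        line = addr // LINE_SIZE
        prev = owner.get(line)
        if prev is not None and prev != cpu:
            transfers += 1
        owner[line] = cpu
    return transfers

N = 1000  # total writes, alternating between CPU 0 and CPU 1
same_word = [(i % 2, 0) for i in range(N)]               # both hit address 0
same_line = [(i % 2, 8 * (i % 2)) for i in range(N)]     # addresses 0 and 8
separate  = [(i % 2, LINE_SIZE * (i % 2)) for i in range(N)]  # addresses 0 and 64

print(count_transfers(same_word))  # prints: 999
print(count_transfers(same_line))  # prints: 999
print(count_transfers(separate))   # prints: 0
```

In the first two patterns nearly every write forces the line to bounce
to the other CPU; in the third, each CPU keeps its own line and nothing
moves. What one bounce actually costs on a given chip is not something
this model can tell you.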


Hendrik vdH

  #3  
Old 08-20-2008, 10:41 AM
Boon
Guest
 
Default Re: L2 Cache

Jerome H. Fine wrote:

> Since the use of L1 and L2 cache seems to be much more on topic, I
> am hopeful that the following are valid questions
>
> I am about to attempt to figure out how the L1 and L2 cache are used
> on the Intel Q9450 core 2 quad package.


Yorkfield core.
http://en.wikipedia.org/wiki/Intel_Core_2

> The specifications state that there are 12 MB of L2 cache. In just
> a few places, it is acknowledged that 6 MB are allocated to each
> CPU pair of the 4 CPUs.


Right. AFAIU, Intel's quad-core CPUs are, in fact, two dual-core dies
"glued" together in one package (Multi-Chip Module). Each dual-core die
has 6 MB of L2, which is shared by both of its cores.

> What is missing is how the conflict is resolved if two CPUs from
> different pairs read and then write (differently of course) to the same
> memory location followed by a read of that same memory location.
>
> The other question concerns L1 cache. In a few places, the
> technical specifications seem to state that there is separate
> L1 instruction and data cache for each of the 4 CPUs. Again,
> nothing seems to be available as to how are conflicts resolved
> when the same memory location is read and then written (again
> differently) by different CPUs at the same time and again
> followed by a read.
>
> While I am confident that Intel must have solved the problem,
> what is missing is the time penalty imposed by the solution,
> if any.
>
> Can anyone help with a few suggestions or else specify a link
> or manual that provides some answers?


The "Memory Ordering White Paper" might be worth a read.
http://www.intel.com/products/proces...als/318147.pdf

Maybe this one too?
Software Techniques for Shared-Cache Multi-Core Systems
http://softwarecommunity.intel.com/a...s/eng/2760.htm

References:
http://en.wikipedia.org/wiki/Cache_coherency
http://en.wikipedia.org/wiki/Bus_sniffing

You may also want to check comp.arch out.

Regards.

  #4  
Old 08-20-2008, 11:10 AM
Jerry Coffin
Guest
 
Default Re: L2 Cache

In article <48AB70CD.6050705@compsys.to>, spamtrap@crayne.org says...

[ ... ]

> What is missing is how the conflict is resolved if two CPUs from
> different pairs read and then write (differently of course) to the same
> memory location followed by a read of that same memory location.


Googling for something like "MESI cache coherence protocol" should give
some relevant results.

--
Later,
Jerry.

The universe is a figment of its own imagination.

  #5  
Old 08-20-2008, 12:55 PM
malc
Guest
 
Default Re: L2 Cache

"Jerome H. Fine" <spamtrap@crayne.org> writes:

> Since the use of L1 and L2 cache seems to be much more on topic, I
> am hopeful that the following are valid questions
>
> I am about to attempt to figure out how the L1 and L2 cache are used
> on the Intel Q9450 core 2 quad package.
>
> The specifications state that there are 12 MB of L2 cache. In just
> a few places, it is acknowledged that 6 MB are allocated to each
> CPU pair of the 4 CPUs.
>
> What is missing is how the conflict is resolved if two CPUs from
> different pairs read and then write (differently of course) to the same
> memory location followed by a read of that same memory location.
>
> The other question concerns L1 cache. In a few places, the
> technical specifications seem to state that there is separate
> L1 instruction and data cache for each of the 4 CPUs. Again,
> nothing seems to be available as to how are conflicts resolved
> when the same memory location is read and then written (again
> differently) by different CPUs at the same time and again
> followed by a read.
>
> While I am confident that Intel must have solved the problem,
> what is missing is the time penalty imposed by the solution,
> if any.
>
> Can anyone help with a few suggestions or else specify a link
> or manual that provides some answers?
>


http://lwn.net/Articles/252125/

--
mailto:av1474@comtv.ru

  #6  
Old 08-20-2008, 06:54 PM
Wolfgang Kern
Guest
 
Default Re: L2 Cache


Jerome H. Fine asked:

[about multicore...]

I'm really sorry, Sir,
but this only makes sense if you write your own multithreading
code for MP.
Most applications around (besides a few games) care a fart about
MP and use just one core, so all these quad-core CPU announcements
are just there to make money out of the unaware and nothing else (yet).

Windoze NT/VISTA and Linux VS+xxx may be capable of multi-core,
but which program developer would write code that relied on this?
(Today's answer is: no one to nobody, but the future may change this.)
__
wolfgang


  #7  
Old 08-20-2008, 10:24 PM
Harold Aptroot
Guest
 
Default Re: L2 Cache

"Wolfgang Kern" <spamtrap@crayne.org> wrote in message
news:g8i7cg$sca$1@newsreader2.utanet.at...
(snipped)
> Windoze NT/VISTA and Linux VS+xxx may be capable of multi-core,
> but which program developer would write code that relied on this?
> (Today's answer is: no one to nobody, but the future may change this.)


I do (and it is quite safe to assume at least a dual core - nearly everyone
has one these days).
"Relied on" is a bit strong perhaps; if you'd said "use", there would be a lot
more devs who do so.
Almost every program that even comes close to saturating 1 core is written
to use more cores when available - not doing so is a terrible waste of the
users' time, and since there are usually many more users than there are
devs, those devs could easily spend 100 times as much time programming for
a 299% speed increase (so 399% in total) (assuming quad core) and still come
out ahead on time - and the users are likely to like it (very important)
versus being frustrated by a program that is slow because it fails to use
more than 1 core.

Examples of programs that are good in this respect: Paint.NET (also happens
to be approx 30% faster in 64bit mode), x264, 7zip, POV-Ray, DivX, most
BOINC tasks, VLC (although it doesn't really need it tbh)

  #8  
Old 08-21-2008, 01:31 AM
Alexei A. Frounze
Guest
 
Default Re: L2 Cache

On Aug 20, 3:54 pm, "Wolfgang Kern" <spamt...@crayne.org> wrote:
> Most applications around (besides a few games) care a fart about
> MP and use just one core, so all these quad-core CPU announcements
> are just there to make money out of the unaware and nothing else (yet).


True about applications, false about hardware. And actually, if you
think about it, even UP'ish code can benefit from running on MP
hardware. The apps' threads can and will be scheduled on different
CPUs, so they will take advantage of MP.

Alex

  #9  
Old 08-21-2008, 01:53 AM
Chuck Crayne
Guest
 
Default Re: L2 Cache

On Wed, 20 Aug 2008 22:31:00 -0700 (PDT)
"Alexei A. Frounze" <spamtrap@crayne.org> wrote:

> The apps' threads can and will be scheduled on different
> CPUs, so they will take advantage of MP.


And even a single-threaded app can and will benefit from multiple CPUs,
because it can have an entire CPU to itself, while the system functions
run on the other CPUs. For example, as I write this, my development
machine has 326 processes open.

--
Chuck
http://www.pacificsites.com/~ccrayne/charles.html


  #10  
Old 08-21-2008, 03:52 AM
Boon
Guest
 
Default Re: L2 Cache

malc wrote:

> http://lwn.net/Articles/252125/


Thanks for the link.

The entire paper is 114 pages long. Good stuff.

http://people.redhat.com/drepper/cpumemory.pdf
