Re: memory barrier ...

>>>>> "Jan" == Jan Hudec <bulb@ucw.cz> writes:

    Jan> The mb() thing adds a "barrier" instruction to actually stop
    Jan> CPU prefetch

Not quite, but quite close ;)

First, there are at least two established meanings for "prefetch":

  a) one related to the instruction stream, which is usually
     controlled by a different set of instructions. For example, see
     the PowerPC "context synchronizing" insns like "isync" or "rfi".

  b) one related to hints given to the memory subsystem about the
     likelihood of subsequent accesses to certain address ranges, etc.

Note that both of these meanings are completely irrelevant to the
memory ordering issues: a) because it deals with insns[1], not data,
and b) because it merely moves data between different levels of the
memory hierarchy.
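
To illustrate meaning b), here is a small sketch in C. It assumes GCC or
Clang, since __builtin_prefetch is a compiler builtin and not standard C;
the function name sum_with_prefetch and the prefetch distance of 8
elements are made up for the example. The point is only that the hint may
change how fast the loop runs, never what it computes or in what order its
memory accesses become visible to other CPUs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums v[0..n-1], hinting the memory subsystem
 * about elements it will touch a few iterations later. */
long sum_with_prefetch(const long *v, size_t n)
{
    long sum = 0;

    assert(v != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* A hint only: rw = 0 (read), locality = 1 (low reuse).
         * Removing this line must not change the result. */
        if (i + 8 < n)
            __builtin_prefetch(&v[i + 8], 0, 1);
        sum += v[i];
    }
    return sum;
}
```

Dropping the __builtin_prefetch line gives the same sum, which is why
this kind of prefetch says nothing about ordering.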

A memory barrier stops the cpu from *reordering* memory accesses
across the barrier insn.  The cpu can still perform instruction
prefetch and the memory subsystem can still bring data from memory
into closer caches.  Memory barriers are needed in scenarios such as
the following:

Initial: A = a0, B = b0

CPU A       CPU B      CPU C
-----       -----      -----
A := a1     r0 := B     r0 := B
mb()        r1 := A     mb()
B := b1                 r1 := A


CPU A asserts that: (B == b1) => (A == a1) [2]

However, if CPU B reorders its reads it can observe B == b1 && A == a0,
which violates CPU A's assertion.  Moreover, on certain CPUs,
e.g. Alpha, it can observe B == b1 && A == a0 even when the read of A
is data-dependent on the value read from B (e.g. B holds a pointer to
A) !!! No, really, I'm not kidding.

CPU C performs the correct operations and CPU A's assertion is valid
on CPU C.
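
The CPU A / CPU C pairing can be sketched in userspace C11 roughly as
follows. This is only an approximation under stated assumptions:
atomic_thread_fence(memory_order_seq_cst) stands in for the kernel's mb(),
POSIX threads stand in for the CPUs, and the names writer, reader and
run_trials are invented for the example. With both fences in place the C11
memory model guarantees that a reader which sees B == b1 also sees
A == a1, so the violation count stays at zero.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

enum { a0 = 0, a1 = 1, b0 = 0, b1 = 1 };

static atomic_int A, B;

/* "CPU A":  A := a1;  mb();  B := b1 */
static void *writer(void *arg)
{
    (void)arg;
    atomic_store_explicit(&A, a1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* stand-in for mb() */
    atomic_store_explicit(&B, b1, memory_order_relaxed);
    return NULL;
}

/* "CPU C":  r0 := B;  mb();  r1 := A */
static void *reader(void *arg)
{
    int *violated = arg;
    int r0 = atomic_load_explicit(&B, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* stand-in for mb() */
    int r1 = atomic_load_explicit(&A, memory_order_relaxed);

    /* CPU A's assertion: (B == b1) => (A == a1) */
    *violated = (r0 == b1 && r1 != a1);
    return NULL;
}

/* Runs the scenario n times and returns how often the assertion failed. */
int run_trials(int n)
{
    int violations = 0;

    for (int i = 0; i < n; i++) {
        pthread_t w, r;
        int violated = 0;
        int rc;

        atomic_store(&A, a0);
        atomic_store(&B, b0);

        rc = pthread_create(&r, NULL, reader, &violated);
        assert(rc == 0);
        rc = pthread_create(&w, NULL, writer, NULL);
        assert(rc == 0);

        pthread_join(w, NULL);
        pthread_join(r, NULL);
        violations += violated;
    }
    return violations;
}
```

Removing the fence from reader() gives the "CPU B" variant; the language
then no longer forbids r0 == b1 && r1 == a0, although whether you actually
see it depends on the compiler and the hardware.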

    Jan> across the point. No such instruction is needed on IA32, but
    Jan> it's needed on some other arch with aggressive prefetch
    Jan> (eg. sparc IIRC).

The IA32 architecture includes CPUs with very different memory
orderings: e.g. on the i486 and Pentium, read misses can go ahead of
buffered writes which are cache hits. On PII, PIII, P4 and Xeon CPUs
reads can be arbitrarily reordered.

[1] Well, actually a context synchronizing insn *can* be memory
synchronizing too, but that is more of a side-effect (albeit
intentional) of the implementation than a conceptual difference or
similarity.

[2] "=>" here means "implies".

~velco