On Sun, 16 Oct 2005 16:01:53 +0200, Aritz Bastida <aritzbastida@xxxxxxxxx> wrote:

>* I asked what would happen if I am resetting a counter (actually the
>whole struct with memset) in one CPU while the other one is updating
>it. What would be the worst thing that could happen?
>For example, if the counter is of type long (64 bits), could it
>happen that one CPU updates the upper 32 bits while the other CPU
>clears the lower 32 bits, thus corrupting the counter's value?

Yes, you could get corrupt data in that case.

All architectures on which Linux runs guarantee that "memory reads of
native-sized variables with correct alignment are atomic".  What that
gobbledegook means is that if the hardware supports a native 4-byte
integer and your integers are on a 4-byte boundary, then an integer
read will always return a valid value, even while another cpu is
updating the field.  The cpu reading the variable will see either the
old value or the new value, but never some bytes from the old value
mixed with some bytes from the new value.

But (and it's a big but) the variable must be a size that the
architecture supports natively.  i386 hardware supports 1, 2 and
4 byte loads and stores as atomic operations.  It does _NOT_ support
atomic reads of 8 byte quantities.  Reading an 8 byte field requires
two 4 byte reads, and you can read garbage if the data is being
updated from another cpu at the same time.

>* Keith talked about cache bouncing. This is what I want to avoid, and
>the reason for which I use per-cpu variables. The processing of packets
>should be kept as parallel as possible. But cache issues are very
>abstract (you don't "see" them when programming; they happen at the
>hardware level) and I didn't find any good reference to learn about
>this. Is there any book, article or whatever, and if possible,
>oriented to Linux kernel programmers?
My bible on cache matters, which you will have to pry from my cold,
dead hands, is Unix Systems for Modern Architectures (Symmetric
Multiprocessing and Caching for Kernel Programmers) by Curt Schimmel,
ISBN 0-201-63338-8.

Bottom line for your case: as long as accesses to per-cpu variables
are done from the local cpu, you will not get any cache line bounces.
Only when you set the 'clear' flag on all cpus from one cpu will you
get any bounces, and then only for a couple of cycles before the line
goes back to being local data.  The 'clear' case is rare because you
do not zero the counters very often, so it can be ignored for normal
("fast path") processing.