I am looking into some scalability issues on a 4-way Xeon machine (4 separate CPUs, not
cores). I believe I have tracked the problem down to bus contention: OProfile results
show a strong correlation between instructions that report a high number of
global_power_events and instructions that report a high number of FSB_data_activity events.
Some of these events are easily explained. What surprises me, however, is that certain
lock operations seem to cause considerable bus activity. For example, a call to
spin_unlock_irqrestore() from e1000_xmit_frame(). The strange thing is that the
experiments I am running strictly partition the NICs among the CPUs (both interrupt and
process affinity), so there is no contention on the lock (verified with lockstat).
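
For reference, the code path I am talking about has roughly this shape. The structure
and function names below are only illustrative, not the actual e1000 code; the point is
that the tx lock word is touched with spin_lock_irqsave()/spin_unlock_irqrestore() on
every transmitted packet even though only one CPU ever takes the lock:

#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <linux/spinlock.h>

/* Hypothetical driver-private state; not the real e1000 structures. */
struct my_adapter {
	spinlock_t tx_lock;	/* protects the tx descriptor ring */
	/* tx ring, head/tail indices, ... */
};

static int my_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
{
	struct my_adapter *adapter = netdev_priv(netdev);
	unsigned long flags;

	spin_lock_irqsave(&adapter->tx_lock, flags);

	/* queue skb on the tx ring and kick the hardware ... */

	spin_unlock_irqrestore(&adapter->tx_lock, flags);
	return NETDEV_TX_OK;
}
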
My understanding is that a lock variable accessed by only a single CPU should stay in
that CPU's cache in the Modified state, since no other CPU ever invalidates the line,
and so accesses to it should cause little if any FSB activity. It is also worth noting
that the number of FSB events per CPU increases considerably when moving from 3 to 4
CPUs, while the number of cache misses stays roughly the same.
Is my understanding correct? Are there any other reasons for FSB activity related to locks?
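
To make the scenario concrete, below is a minimal user-space model of the uncontended
case I am describing. It only uses GCC atomics and is not the kernel's spinlock
implementation, but the acquire path is the same kind of atomic read-modify-write
(an implicitly locked XCHG on x86), and only one CPU ever touches lock_word, mirroring
the partitioned-NIC setup:

#include <stdio.h>

static volatile int lock_word;	/* 0 = free, 1 = held */

static void my_lock(volatile int *l)
{
	/* Atomic exchange; compiles to an (implicitly locked) XCHG on x86. */
	while (__sync_lock_test_and_set(l, 1))
		;	/* never spins in the uncontended case */
}

static void my_unlock(volatile int *l)
{
	__sync_lock_release(l);	/* release the lock by storing 0 */
}

int main(void)
{
	long i;

	/* The same CPU acquires and releases repeatedly; no other CPU
	 * ever touches lock_word. */
	for (i = 0; i < 100000000L; i++) {
		my_lock(&lock_word);
		my_unlock(&lock_word);
	}
	printf("done\n");
	return 0;
}

Pinning this to one CPU (e.g. with taskset) and watching FSB_data_activity should show
whether the locked exchange alone generates bus traffic even when the line stays in the
local cache.
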
--Elad