On Wed, 17 May 2006, Steven Rostedt wrote:

> My first attempt to fix this introduced another dereference, to allow
> for modules to allocate their own memory. This was quickly shot down,
> and for good reason, because dereferences kill performance, and don't
> play nice with large SMP systems that depend on per_cpu being fast.
>
> I now place the per_cpu variables into VM, such that the pages are
> only allocated when needed. All the architecture needs to do is
> supply a VM address range, size for each CPU to use (note this
> implementation expects all the VM CPU areas to be together), and
> three functions to allow for allocating page tables at bootup.

So now instead of an explicit indirection we use an implicit one through
the page tables for this. This happens during early boot which requires
additional page table functions? And it requires the use of an additional
TLB entry?

I guess that the additional TLB pressure alone will result in a
performance drop of 3%? See
http://www.gelato.unsw.edu.au/archives/linux-ia64/0602/17311.html
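
To make the two access patterns being compared concrete, here is a rough userspace sketch. It is not taken from the patch: NR_CPUS, AREA_LONGS, and all function names are hypothetical, and a plain static array stands in for the per-CPU VM range. It only shows how the address is formed in each case. The page-table walk and TLB cost in question happen in the MMU and cannot be seen from code like this.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS    4   /* hypothetical CPU count */
#define AREA_LONGS 8   /* hypothetical per-CPU area size, in longs */

/* Explicit indirection: a table of per-CPU pointers. Every access
 * first loads the pointer, then dereferences it. */
static long explicit_storage[NR_CPUS];
static long *percpu_ptr[NR_CPUS];

static void explicit_init(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        percpu_ptr[cpu] = &explicit_storage[cpu];
}

static long *explicit_addr(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return percpu_ptr[cpu];            /* extra memory load here */
}

/* Fixed-stride areas laid out back to back: the address is pure
 * arithmetic (base + cpu * size + offset), with no pointer load.
 * With a VM-backed range, the remaining indirection would be the
 * virtual-to-physical translation done by the hardware. */
static long percpu_area[NR_CPUS * AREA_LONGS];

static long *strided_addr(int cpu, size_t offset)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    assert(offset < AREA_LONGS);
    return &percpu_area[(size_t)cpu * AREA_LONGS + offset];
}
```

Calling explicit_init() and then writing through explicit_addr(cpu) or strided_addr(cpu, offset) touches only that CPU's slot in either scheme; the difference is whether the indirection is a visible pointer load or is hidden in address translation.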