Hi Murali,

On Wed, Feb 23, 2011 at 11:26 AM, Murali N <nalajala.murali@xxxxxxxxx> wrote:
> Hi Dave,
>
> On Wed, Feb 23, 2011 at 11:15 AM, Dave Hylands <dhylands@xxxxxxxxx> wrote:
>> Hi Murali,
>>
>> On Wed, Feb 23, 2011 at 10:34 AM, Murali N <nalajala.murali@xxxxxxxxx> wrote:
>>> Hi Dave,
>>> thanks for your reply.
>> ...snip...
>>>> get_cpu_var returns the contents of a per-cpu variable.
>>>>
>>>> __get_cpu_var contains the actual machine-dependent implementation. It
>>>> looks like all of the architectures use the one in
>>>> asm-generic/percpu.h
>>>>
>>>> In general, all of the per-cpu data is gathered together into a
>>>> section. Multiple sections are allocated (one per CPU). I think that
>>>> the address of the variable is really the offset within the section,
>>>> and each allocated section is cache-line aligned. This offset is then
>>>> added to the "offset for my cpu" to come up with the final address of
>>>> the variable, which is then dereferenced as a pointer. There
>>>> are lots of extra doo-dads to get around warnings, and to prevent the
>>>> linker from producing relocation references for the variable
>>>> access (since it looks like an access of a global variable, but it's
>>>> really just doing a game of using the offset of the variable within
>>>> the section).
>>>>
>>>> So you could think of it as a very fancy offsetof macro.
>>>>
>>>> There are several other macros involved, perhaps you could be a bit
>>>> more specific about your request?
>>>>
>>>> Dave Hylands
>>>>
>>>
>>> I have one more basic question.
>>> Why would we need to maintain structures like this? Is there any
>>> advantage we get here?
>>
>> Primarily for performance reasons. For example, the kernel maintains
>> lots of stats on threads and processes (I haven't looked to see if
>> these are actually maintained on a per-cpu basis, but the concept
>> applies). These stats are updated frequently, but only accessed
>> occasionally. If you have a global "database" of stats, then each CPU
>> needs to lock the data, which creates lots of contention. By keeping
>> stuff per-cpu, the cpus don't need to acquire any locks (or at the
>> very least won't cause as much contention when acquiring per-cpu
>> locks). This becomes especially important when there are lots of cpus.
>>
>> The query functions can then amalgamate the information and present it
>> as if it were maintained in a global database.
>>
>> So if you have data which is updated frequently and only accessed
>> occasionally, or updated infrequently and accessed frequently, then
>> you might have a case for using per-cpu data. Of course you'd still
>> need to profile it and see if it makes sense.
>>
>> Also keep in mind that something which might not seem to matter
>> much on, say, a dual-core could make a considerable difference
>> with, say, 32 cores.
>>
>> Dave Hylands
>>
>
> So it makes sense to use if I am running on more cores (> 4).

It really depends on the access patterns of the data. Whether it makes
sense or not is something you'll probably need to profile (i.e. with
and without using per-cpu variables).

Dave Hylands

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
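
To make the two ideas from this thread concrete (the "offset within the
section plus offset for my cpu" addressing, and lock-free updates with a
query that amalgamates the copies), here is a minimal userspace sketch in
C. It is not the kernel's implementation: NR_CPUS, CACHE_LINE, struct
percpu_area, and the helper names are all made up for illustration, the
cpu number is passed in by hand, and a plain struct stands in for the
linker section.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS    4   /* hypothetical CPU count for this sketch */
#define CACHE_LINE 64  /* assumed cache-line size */

/*
 * Stand-in for the per-cpu "section": every per-cpu variable is a member.
 * Aligning the first member pads the whole struct to a cache line, so
 * each CPU's copy starts on its own line.
 */
struct percpu_area {
    _Alignas(CACHE_LINE) unsigned long packets;
    unsigned long errors;
};

/* One copy of the area per CPU. */
static struct percpu_area areas[NR_CPUS];

/* The "offset for my cpu": distance from copy 0 to this CPU's copy. */
static size_t cpu_offset(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return (size_t)cpu * sizeof(struct percpu_area);
}

/* Final address = base + per-cpu offset + offset of the variable. */
static unsigned long *percpu_ptr(size_t var_offset, unsigned int cpu)
{
    char *base = (char *)areas;
    return (unsigned long *)(base + cpu_offset(cpu) + var_offset);
}

/* Update path: touches only the given CPU's copy, so no lock is taken. */
static void count_packet(unsigned int cpu)
{
    (*percpu_ptr(offsetof(struct percpu_area, packets), cpu))++;
}

/* Query path: amalgamate every copy into one global-looking total. */
static unsigned long total_packets(void)
{
    unsigned long sum = 0;

    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        sum += *percpu_ptr(offsetof(struct percpu_area, packets), cpu);
    return sum;
}
```

Calling count_packet(0) twice and count_packet(3) once leaves copy 0 at
2 and copy 3 at 1, and total_packets() adds them up to 3. In the kernel
the update path would also have to keep the task from migrating to
another CPU in the middle of the access (get_cpu_var disables preemption
for this reason); the sketch leaves that out.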