Hi Murali,

On Wed, Feb 23, 2011 at 10:34 AM, Murali N <nalajala.murali@xxxxxxxxx> wrote:
> Hi Dave,
> thanks for your reply.

...snip...

>> get_cpu_var returns the contents of a per-cpu variable.
>>
>> __get_cpu_var contains the actual machine-dependent implementation. It
>> looks like all of the architectures use the one in
>> asm-generic/percpu.h
>>
>> In general, all of the per-cpu data is gathered together into a
>> section. Multiple sections are allocated (one per CPU). I think that
>> the address of the variable is really the offset within the section,
>> and each allocated section is cache-line aligned. This offset is then
>> added to the "offset for my cpu" to come up with the final address of
>> the variable, which is dereferenced as a pointer dereference. There
>> are lots of extra doo-dads to get around warnings, and to prevent the
>> linker from producing relocation references for the variable
>> access (since it looks like an access of a global variable, but it's
>> really just doing a game of using the offset of the variable within
>> the section).
>>
>> So you could think of it as a very fancy offsetof macro.
>>
>> There are several other macros involved, perhaps you could be a bit
>> more specific about your request?
>>
>> Dave Hylands
>>
>
> I have one more basic question.
> Why would we need to maintain structures like this? Is there any
> advantage we get here?

Primarily for performance reasons.

For example, the kernel maintains lots of stats on threads and
processes (I haven't looked to see if these are actually maintained on
a per-cpu basis, but the concept applies). These stats are updated
frequently, but only accessed occasionally.

If you have a global "database" of stats, then each CPU needs to lock
the data before updating it, which creates lots of contention. By
keeping stuff per-cpu, the cpus don't need to acquire any locks (or at
the very least won't cause as much contention when acquiring per-cpu
locks).
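
As a rough illustration of both points (the "fancy offsetof" address
calculation and the lock-free update path), here is a user-space C
sketch. It is not kernel code: every sketch_* name, the struct layout,
the CPU count of 4 and the 64-byte cache line are assumptions made up
for the example, and the caller is assumed to pass in its own cpu
number.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS   4    /* hypothetical number of CPUs */
#define SKETCH_CACHELINE 64   /* assumed cache-line size in bytes */

/* Stand-in for the per-cpu section: all per-cpu variables gathered together. */
struct percpu_section {
    unsigned long packets;
    unsigned long errors;
};

/* One copy of the section per CPU, padded to a full cache line. */
union percpu_area {
    struct percpu_section vars;
    char pad[SKETCH_CACHELINE];
};

/* Align the array so each CPU's copy starts on its own cache line. */
static _Alignas(SKETCH_CACHELINE) union percpu_area sketch_areas[SKETCH_NR_CPUS];

/* Final address = base of this CPU's copy + offset of the variable in the section. */
static unsigned long *sketch_per_cpu_ptr(size_t var_offset, unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    return (unsigned long *)((char *)&sketch_areas[cpu] + var_offset);
}

/* Update path: touches only the given CPU's copy, so no shared lock is taken. */
static void sketch_count_packet(unsigned int cpu)
{
    (*sketch_per_cpu_ptr(offsetof(struct percpu_section, packets), cpu))++;
}

/* Query path: add up every CPU's copy and report it as one global figure. */
static unsigned long sketch_total_packets(void)
{
    unsigned long total = 0;

    for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        total += *sketch_per_cpu_ptr(offsetof(struct percpu_section, packets), cpu);
    return total;
}
```

In the kernel itself you would not hand-roll any of this; you would
declare the variable with DEFINE_PER_CPU and access it through
get_cpu_var()/put_cpu_var() or per_cpu(), which also deal with
preemption. The sketch only shows the shape of the idea.
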
Avoiding that contention becomes especially important when there are
lots of cpus. The query functions can then amalgamate the per-cpu
information and present it as if it were maintained in a global
database.

So if you have data which is updated frequently and only accessed
occasionally, or updated infrequently and accessed frequently, then
you might have a case for using per-cpu data. Of course you'd still
need to profile it and see if it makes sense.

Also keep in mind that something which doesn't seem to matter much on,
say, a dual-core could make a considerable difference with, say, 32
cores.

Dave Hylands