David Miller wrote:
> From: Nick Piggin <nickpiggin@xxxxxxxxxxxx>
> Date: Wed, 27 Aug 2008 17:47:14 +1000
>
>> Yeah, I see. That's stupid isn't it? (Well, I guess it was completely
>> sane when cpumasks were word sized ;))
>>
>> Hopefully that accounts for a significant chunk...
>
> There is a lot of indirect costs that are hard to see as well.
>
> Two things a lot of these cross-call dispatch paths do is:
>
> 1) Clear self-cpu
>
> 2) AND with cpus_online
>
> #1 can normally be a simple bit clear, but some places can also
> implement this with something like "cpus_andn(X, cpumask_of_cpu(cpu))"
>
> It's simply easier to move those two things down to the bottom of
> the APIC programming code, they just loop over the cpumask doing
> an expensive APIC I/O operation anyways, might as well overlap it
> with these "skip self-cpu" and "skip not-online cpus" checks.
>
> And oh yeah we get the stack wastage fixed too, isn't that what we
> were talking about? :-)

Yes, the most time-consuming part was determining whether a kmalloc
could safely be used in the context of the function, and what to do
about the out-of-memory problem.

Pushing that down to something like:

	for_each_cpu_thats_online(cpu, *maskptr)

would remove the need for many of the temp masks.  A simple

	if (cpu != me)

would take care of excluding self.

It might have better interaction with cpu hotplug as well, since the
online map would be checked just before the call to that cpu is made.

Thanks,
Mike
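
P.S. To make the idea above concrete, here is a rough userspace sketch of
pushing the "skip self" and "skip not-online" checks down into the loop that
does the per-cpu send. It is only an illustration, not kernel code: every
name in it (mask_sketch_t, online_map_sketch, send_ipi_sketch,
send_ipi_mask_sketch) is hypothetical, the mask is assumed to fit in a single
64-bit word, and the expensive APIC write is replaced by a counter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these are real kernel APIs. */
#define NR_CPUS_SKETCH 64

typedef struct { uint64_t bits; } mask_sketch_t;

static mask_sketch_t online_map_sketch;        /* stand-in for the online map */
static unsigned int ipis_sent[NR_CPUS_SKETCH]; /* per-cpu count of "sends" */

/* Return nonzero if bit 'cpu' is set in the mask. */
static int mask_test(const mask_sketch_t *m, unsigned int cpu)
{
	return (int)((m->bits >> cpu) & 1u);
}

/* Stand-in for the expensive per-cpu APIC I/O operation. */
static void send_ipi_sketch(unsigned int cpu)
{
	ipis_sent[cpu]++;
}

/*
 * Walk the caller's mask once.  The self and online checks happen inside
 * the loop, so no temporary mask is built on the stack, and the online map
 * is consulted just before each send.  Returns the number of cpus signalled.
 */
static unsigned int send_ipi_mask_sketch(const mask_sketch_t *maskptr,
					 unsigned int me)
{
	unsigned int cpu, sent = 0;

	assert(me < NR_CPUS_SKETCH);

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (!mask_test(maskptr, cpu))
			continue;	/* not requested by the caller */
		if (cpu == me)
			continue;	/* skip self-cpu */
		if (!mask_test(&online_map_sketch, cpu))
			continue;	/* skip not-online cpus */
		send_ipi_sketch(cpu);
		sent++;
	}
	return sent;
}
```

A caller would pass a pointer to its existing mask plus its own cpu number.
The mask is never copied or modified, which is where the stack saving would
come from in this sketch.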