On Wed, Jul 01, 2009 at 07:21:57PM +0900, Tejun Heo wrote: > > using possible per cpu data I picked in current git: icmp.c > > I was talking about percpu allocator proper. Yeap, the major work > would be in auditing and converting for_each_possible_cpu() users. and testing. that's the hard part. cpu hotplug is normally not well tested. Any code change that requires lots of new code for it will be a problem because that code will then likely bitrot. > > kfree(net->ipv4.icmp_sk); > > net->ipv4.icmp_sk = NULL; > > } > > > > You would need to convert that to use a CPU notifier and callbacks > > setting up the sockets. Then make sure there are no races in all of > > this. And get it somehow tested (where is the user base who > > tests cpu hotplug?) > > Maybe it would be better to allocate percpu sockets as proper percpu > variables. That would be like percpu dentries. A socket is a very complex structure. Trying to pull it out of its normal allocators would be a bad idea. I actually had a patch some time ago to move this one over to callbacks, but it was already difficult, and I dropped it. > Initialization would still need callback mechanism tho. I > was thinking about adding @init callback to percpu_alloc(), which > would be much simpler than doing full cpu hotplug callback. I'm not sure it will be actually simpler. It's not just initializing the structure, but all the other set up to make it known to the world. > > > And there is lots of similar code all over the tree > > For static percpu variables, it'll be mostly about converting > for_each_possible_cpu() to for_each_used_cpu() as both allocation and > initialization can be handled by percpu proper. For dynamic areas, That's the easy part, but you would still need all the callbacks for the extension case. > allocation can be handled by percpu proper but cpus coming online > would need more work to convert. It'll take some effort but there > aren't too many alloc_percpu() users yet and I don't think it will be alloc_percpu()? It affects all DEFINE_PER_CPU users. My current tree has hundreds. > too difficult. I wouldn't know for sure before I actually try tho. I think it's clear that you haven't tried yet :) I wrote quite a few per cpu callback handlers over the years and in my experience they are all nasty code with subtle races. The problem is that instead of having a single subset init function which is just single threaded and doesn't need to worry about races you now have multi threaded init, which tends to be a can of worms. I think a far saner strategy than rewriting every user of DEFINE_PER_CPU, ending up with lots of badly tested code is to: - Fix the few large size percpu pigs that are problematic today to allocate in a callback. - Then once the per cpu data in all configurations is <200k (better <100 in the non debug builds) again just keep pre-allocating like we always did - Possibly adjust the vmalloc area on 32bit based on the possible CPU map at the cost of the direct mapping, to make sure there's always enough mapping space. -Andi -- ak@xxxxxxxxxxxxxxx -- Speaking for myself only. -- To unsubscribe from this list: send the line "unsubscribe linux-arch" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html