On Tue, 2011-12-20 at 12:58 +0530, Srivatsa S. Bhat wrote:
> On 12/20/2011 11:57 AM, Al Viro wrote:
>
> > On Tue, Dec 20, 2011 at 10:26:05AM +0530, Srivatsa S. Bhat wrote:
> >> Oh, right, that has to be handled as well...
> >>
> >> Hmmm... How about registering a CPU hotplug notifier callback during lock init
> >> time, and then for every cpu that gets onlined (after we took a copy of the
> >> cpu_online_mask to work with), we see if that cpu is different from the ones
> >> we have already locked, and if it is, we lock it in the callback handler and
> >> update the locked_cpu_mask appropriately (so that we release the locks properly
> >> during the unlock operation).
> >>
> >> Handling the newly introduced race between the callback handler and lock-unlock
> >> code must not be difficult, I believe..
> >>
> >> Any loopholes in this approach? Or is the additional complexity just not worth
> >> it here?
> >
> > To summarize the modified variant of that approach hashed out on IRC:
> >
> > * lglock grows three extra things: spinlock, cpu bitmap and cpu hotplug
> > notifier.
> > * foo_global_lock_online starts with grabbing that spinlock and
> > loops over the cpus in that bitmap.
> > * foo_global_unlock_online loops over the same bitmap and then drops
> > that spinlock
> > * callback of the notifier is going to do all bitmap updates. Under
> > that spinlock. Events that need handling definitely include the things like
> > "was going up but failed", since we need the bitmap to contain all online CPUs
> > at all time, preferably without too much junk beyond that. IOW, we need to add
> > it there _before_ low-level __cpu_up() calls set_cpu_online(). Which means
> > that we want to clean up on failed attempt to up it. Taking a CPU down is
> > probably less PITA; just clear bit on the final "the sucker's dead" event.
> > * bitmap is initialized once, at the same time we set the notifier
> > up. Just grab the spinlock and do
> > for_each_online_cpu(N)
> > add N to bitmap
> > then release the spinlock and let the callbacks handle all updates.
> >
> > I think that'll work with relatively little pain, but I'm not familiar enough
> > with the cpuhotplug notifiers, so I'd rather have the folks familiar with those
> > to supply the set of events to watch for...
> >
>
> We need not watch out for "up failed" events. It is enough if we handle
> CPU_ONLINE and CPU_DEAD events. Because, these 2 events are triggered only
> upon successful online or offline operation, and these notifications are
> more than enough for our purpose (to update our bitmaps). Also, those cpus
> which came online wont start running until these "success notifications"
> are all done, which is where we do our stuff in the callback (ie., try
> grabbing the spinlock..).
>
> Of course, by doing this (only looking out for CPU_ONLINE and CPU_DEAD
> events), our bitmap will probably be one step behind cpu_online_mask
> (which means, we'll still have to take the snapshot of cpu_online_mask and
> work with it instead of using for_each_online_cpu()).
> But that doesn't matter, as long as:
> * we don't allow the newly onlined CPU to start executing code (this
> is achieved by taking the spinlock in the callback)

I think cpu notifier callback doesn't always run on the UPing cpu.
Actually, it rarely runs on the UPing cpu.

If I was wrong about the above thought, there is still a chance that
lg-lock operations are scheduled on the UPing cpu before calling the
callback.

> * we stick to our bitmap while taking and releasing the spinlocks.
>
> Both of these have been handled in the design proposed above. So we are good
> to go I guess.
>
> I am working on translating all these to working code.. Will post the patch
> as soon as I'm done.
>
> Regards,
> Srivatsa S. Bhat
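
For anyone following along, here is a rough userspace sketch of the variant Al summarized above (extra spinlock, cpu bitmap, and a callback that does all bitmap updates under that spinlock). It is only a model under stated assumptions, not kernel code and not the patch being written: every name in it (model_lglock, model_lg_*, the MODEL_* events, MODEL_NR_CPUS) is made up for illustration, the locks are plain C11 atomic_flag spinlocks, and the three events stand in for what I would guess map to CPU_UP_PREPARE, CPU_UP_CANCELED and CPU_DEAD.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 8  /* hypothetical fixed CPU count for this model */

/* Hypothetical stand-ins for the hotplug notifications discussed above. */
enum model_event { MODEL_UP_PREPARE, MODEL_UP_CANCELED, MODEL_DEAD };

struct model_lglock {
    atomic_flag percpu[MODEL_NR_CPUS]; /* one lock per possible cpu */
    atomic_flag mask_lock;             /* the extra spinlock */
    bool cpu_mask[MODEL_NR_CPUS];      /* the extra cpu bitmap */
};

void model_spin_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

void model_spin_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Initialize once: clear all locks, then copy the online set under the lock. */
void model_lg_init(struct model_lglock *lg, const bool *online)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        atomic_flag_clear(&lg->percpu[cpu]);
    atomic_flag_clear(&lg->mask_lock);

    model_spin_lock(&lg->mask_lock);
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        lg->cpu_mask[cpu] = online[cpu];
    model_spin_unlock(&lg->mask_lock);
}

/*
 * Notifier callback: all bitmap updates happen here, under mask_lock.
 * The bit is set before the cpu would be marked online, and cleared
 * again if bringing it up fails or once the cpu is dead.
 */
void model_lg_hotplug_cb(struct model_lglock *lg, enum model_event ev, int cpu)
{
    model_spin_lock(&lg->mask_lock);
    lg->cpu_mask[cpu] = (ev == MODEL_UP_PREPARE);
    model_spin_unlock(&lg->mask_lock);
}

/*
 * Take mask_lock, then every per-cpu lock named in the bitmap.
 * mask_lock stays held on return, so the bitmap cannot change until
 * the matching unlock. Returns the number of per-cpu locks taken.
 */
int model_lg_global_lock_online(struct model_lglock *lg)
{
    int taken = 0;

    model_spin_lock(&lg->mask_lock);
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
        if (lg->cpu_mask[cpu]) {
            model_spin_lock(&lg->percpu[cpu]);
            taken++;
        }
    }
    return taken;
}

/* Walk the same (unchanged) bitmap, then drop mask_lock last. */
void model_lg_global_unlock_online(struct model_lglock *lg)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
        if (lg->cpu_mask[cpu])
            model_spin_unlock(&lg->percpu[cpu]);
    }
    model_spin_unlock(&lg->mask_lock);
}
```

Since mask_lock is held from the global lock to the global unlock, a callback arriving in between simply spins until the unlock, so lock and unlock always walk the same bitmap. Note that the model does not depend on which cpu runs the callback.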