On Tue, Oct 23, 2012 at 10:46 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > On Tue, 23 Oct 2012, Ming Lei wrote: > >> On Mon, Oct 22, 2012 at 10:33 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: >> > >> > Tail recursion should be implemented as a loop, not as an explicit >> > recursion. That is, the function should be: >> > >> > void pm_runtime_set_memalloc_noio(struct device *dev, bool enable) >> > { >> > do { >> > dev->power.memalloc_noio_resume = enable; >> > >> > if (!enable) { >> > /* >> > * Don't clear the parent's flag if any of the >> > * parent's children have their flag set. >> > */ >> > if (device_for_each_child(dev->parent, NULL, >> > dev_memalloc_noio)) >> > return; >> > } >> > dev = dev->parent; >> > } while (dev); >> > } >> >> OK, will take the non-recursion implementation for saving kernel >> stack space. >> >> > >> > except that you need to add locking, for two reasons: >> > >> > There's a race. What happens if another child sets the flag >> > between the time device_for_each_child() runs and the next loop >> > iteration? >> >> Yes, I know the race, and not adding a lock because the function >> is mostly called in .probe() or .remove() callback and its parent's device >> lock is held to avoid this race. >> >> Considered that it may be called in async probe() (scsi disk), one lock >> is needed, the simplest way is to add a global lock. Any suggestion? > > No. Because of where you put the new flag, it must be protected by > dev->power.lock. And this means the iterative implementation shown > above can't be used as is. It will have to be more like this: > > void pm_runtime_set_memalloc_noio(struct device *dev, bool enable) > { > spin_lock_irq(&dev->power.lock); > dev->power.memalloc_noio_resume = enable; > > while (dev->parent) { > spin_unlock_irq(&dev->power.lock); > dev = dev->parent; > > spin_lock_irq(&dev->power.lock); > /* > * Don't clear the parent's flag if any of the > * parent's children have their flag set. > */ > if (!enable && device_for_each_child(dev->parent, NULL, s/dev->parent/dev > dev_memalloc_noio)) > break; > dev->power.memalloc_noio_resume = enable; > } > spin_unlock_irq(&dev->power.lock); > } With the problem of non-SMP-safe bitfields access, the power.lock should be held, but that is not enough to prevent children from being probed or disconnected. Looks another lock is still needed. I think a global lock is OK in the infrequent path. > >> > Even without a race, access to bitfields is not SMP-safe >> > without locking. >> >> You mean one ancestor device might not be in active when >> one of its descendants is being probed or removed? > > No. Consider this example: > > struct foo { > int a:1; > int b:1; > } x; > > Consider what happens if CPU 0 does "x.a = 1" at the same time as > another CPU 1 does "x.b = 1". The compiler might produce object code > looking like this for CPU 0: > > move x, reg1 > or 0x1, reg1 > move reg1, x > > and this for CPU 1: > > move x, reg2 > or 0x2, reg2 > move reg2, x > > With no locking, the two "or" instructions could execute > simultaneously. What will the final value of x be? > > The two CPUs will interfere, even though they are touching different > bitfields. Got it, thanks for your detailed explanation. Looks the problem is worse than above, not only bitfields are affected, the adjacent fields might be involved too, see: http://lwn.net/Articles/478657/ Thanks, -- Ming Lei -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>