On Tuesday 08 December 2009, Rafael J. Wysocki wrote: > On Tuesday 08 December 2009, Rafael J. Wysocki wrote: > > On Tuesday 08 December 2009, Rafael J. Wysocki wrote: > > > On Tuesday 08 December 2009, Linus Torvalds wrote: > > > > > > > > On Tue, 8 Dec 2009, Rafael J. Wysocki wrote: > > > > > > > > > > Anyway, if we use an rwsem, it won't be checkable from interrupt context just > > > > > as well. > > > > > > > > You can't do a lock() from an interrupt, but the unlocks should be > > > > irq-safe. > > > > > > > > > Suppose we use rwsem and during suspend each child uses a down_read() on a > > > > > parent and then the parent uses down_write() on itself. What if, whatever the > > > > > reason, the parent is a bit early and does the down_write() before one of the > > > > > children has a chance to do the down_read()? Aren't we toast? > > > > > > > > We're toast, but we're toast for a totally unrealted reason: it means that > > > > you tried to resume a child before a parent, which would be a major bug to > > > > begin with. > > > > > > > > Look, I even wrote out the comments, so let me repeat the code one more > > > > time. > > > > > > > > - suspend time calling: > > > > // This won't block, because we suspend nodes before parents > > > > down_read(node->parent->lock); > > > > // Do the part that may block asynchronously > > > > async_schedule(do_usb_node_suspend, node); > > > > > > > > - resume time calling: > > > > // This won't block, because we resume parents before children, > > > > // and the children will take the read lock. > > > > down_write(leaf->lock); > > > > // Do the blocking part asynchronously > > > > async_schedule(usb_node_resume, leaf); > > > > > > > > See? So when we take the parent lock for suspend, we are guaranteed to do > > > > so _before_ the parent node itself suspends. And conversely, when we take > > > > the parent lock (asynchronously) for resume, we're guaranteed to do that > > > > _after_ the parent node has done its own down_write. > > > > > > > > And that all depends on just one trivial thing; that the suspend and > > > > resume is called in the right order (children first vs parent first > > > > respectively). And that is such a _major_ correctness issue that if that > > > > isn't correct, your suspend isn't going to work _anyway_. > > > > > > Understood (I think). > > > > > > Let's try it, then. Below is the resume patch based on my previous one in this > > > thread (I have only verified that it builds). > > > > Ah, I need to check if dev->parent is not NULL before trying to lock it, but > > apart from this it doesn't break things at least. > > For completness, below is the full async suspend/resume patch with rwlocks, > that has been (very slightly) tested and doesn't seem to break things. > > [Note to Alan: lockdep doesn't seem to complain about the not annotated nested > locks.] BTW, I can easily change it so that it uses completions for synchronization, but I'm not sure if that's worth spending time on, so please let me know. Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html