Hello, Alan. On Wed, Dec 25, 2013 at 10:29:30PM -0500, Alan Stern wrote: > Thanks. As I understand it (correct me if this is wrong), the problem > is that some subsystems wait for a freezable work queue or kthread to > carry out some job, and they do this as part of their resume pathway. > Obviously this leads to deadlock. It's not even that. Freezer is too big and ends up causing deadlock which doesn't have much to do with resuming in itself. The observed deadlock is between driver core resume and block layer removal paths. While the removal operation is kicked off by bus rescan during resume, it isn't an inherent part of libata resume. ie. any block device removal can race against driver core freezer path. > But I don't see how swinging to the other extreme (i.e., making no > kernel threads or work queues freezable) really solves anything. > Those things are freezable for real reasons; they do stuff that must > not happen while the system is in the middle of a sleep transition. Kernel freezer has a lot of similarities with BKL. I wouldn't call its eventual removal an extreme choice. > Thus... If a subsystem's resume pathway depends on something happening > which must not happen during a sleep transition, then something is > fundamentally broken. Again, the deadlock doesn't have much to do with "a subsystem's resume pathway". It's just freezer behaving as too big a lock used by too many subsystems which don't even need them eventually leading to messy deadlock. > Perhaps the problem could be solved by a finer-grained approach. For Yeah, we seem to agree on the fact that it needs to be finer grained > example, maybe some of these work queues or kthreads need to be frozen > only while the system is suspending, so they can safely be thawed when > the resume begins. Would that fix the problem that began this > discussion? but not how to achieve it. Kernel freezer's problem is that it's too big a mechanism for the given problem. What we need is a mechanism to plug a handful (from what I've seen, there aren't too many legit users) of kthreads / workqueues, but what we have is a mechanism which is integral part of the whole task and workqueue machineries. >From the beginning and still, it's used as a voodoo mechanism which somehow makes the machine "mostly stopped" before performing PM operations. That "mostly stopped" has never been well defined and always been a source of confusions causing people to just believe it's somehow all quiet and good without actually pondering what should happen - freezing of kthreads isn't atomic and doesn't have any inherent ordering among different kthreads / wqs && freezing kthread / wq doesn't mean the layer is actually empty or inactive. irq / bh paths aren't blocked at all. A common misconception seems to be that by marking the workers in the upper layers freezable, somehow things are blocked from all the sources and thus the actual device, which needs to be plugged across suspend, is safe. It's damaging because it'd work in many, if not most, cases but can never be made fully reliable. > I know that in the case of khubd, we really do want it to remain frozen > throughout the entire sleep transition. I could be wrong but the only legitimate use cases of kernel freezer that I've seen are drivers using it as a fast way to implement suspend/resume callbacks and khubd seems to fall in that category. I think it'd be actually healthier to implement it as a part of the usual PM notification / callback mechanism - a natural benefit would be helping consolidate system and runtime PM paths. The conversion should be fairly simple and we definitely can add helpers and whatnot if we end up having to convert more than a handful. The danger is that because we've been plugging so many sources, we're likely to have lower level queues which aren't properly plugged. Those are broken but could have been working most of the time. Removing freezing from the upper layers could expose those issues, so the conversion as a whole may have some challanges in that department. It has been quite a while since I went through PM paths of many different drivers so I'd defer the judgement to Rafael. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html