On Tuesday, 25 of December 2007, Rafael J. Wysocki wrote: > On Monday, 24 of December 2007, Alan Stern wrote: > > On Mon, 24 Dec 2007, Rafael J. Wysocki wrote: > > > > > Hi, > > > > > > Some device drivers register CPU hotplug notifiers and use them to destroy > > > device objects when removing the corresponding CPUs and to create these objects > > > when adding the CPUs back. > > > > > > Unfortunately, this is not the right thing to do during suspend/hibernation, > > > since in that cases the CPU hotplug notifiers are called after suspending > > > devices and before resuming them, so the operations in question are carried > > > out on the objects representing suspended devices which shouldn't be > > > unregistered behing the PM core's back. Although right now it usually doesn't > > > lead to any practical complications, it will predictably deadlock if > > > gregkh-driver-pm-acquire-device-locks-prior-to-suspending.patch is applied. > > > > > > The solution is to prevent drivers from removing/adding devices from within > > > CPU hotplug notifiers during suspend/hibernation using the FROZEN bit > > > in the notifier's action argument. The following three patches modify the > > > MSR, x86-64 MCE and cpuid drivers along these lines. > > > > Do we need to worry about the possibility that when the system wakes up > > from hibernation, the set of usable CPUs might be smaller than it was > > beforehand? > > This is possible in error conditions. > > > Is any special handling needed for this, or is it already accounted for? > > Hm, well. The cleanest thing would be to allow the drivers to remove the > device objects on CPU_UP_CANCELED_FROZEN, which means that we weren't able to > bring the CPU up during a resume, but still that will deadlock with > gregkh-driver-pm-acquire-device-locks-prior-to-suspending.patch. Hmm. In principle, device objects may be destroyed on CPU_UP_CANCELED_FROZEN without acquiring the device locks, since in fact we know these objects won't be accessed concurrently at that time (the locks are already held by the PM core, but the PM core is not going to actually access the devices before the subsequent resume). Comments? Thanks, Rafael _______________________________________________ linux-pm mailing list linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/linux-pm