On Wed, Dec 07, 2016 at 07:04:33PM -0500, Keith Busch wrote: > On Wed, Dec 07, 2016 at 05:40:54PM -0600, Bjorn Helgaas wrote: > > On Sat, Nov 19, 2016 at 12:32:47AM -0800, Ashok Raj wrote: > > > --- a/drivers/pci/hotplug/pciehp_ctrl.c > > > +++ b/drivers/pci/hotplug/pciehp_ctrl.c > > > @@ -182,6 +182,7 @@ static void pciehp_power_thread(struct work_struct *work) > > > switch (info->req) { > > > case DISABLE_REQ: > > > mutex_lock(&p_slot->hotplug_lock); > > > + p_slot->state = POWEROFF_STATE; > > > > It sounds right that p_slot->state should be protected. > > > > It looks like handle_button_press_event() and > > pciehp_sysfs_enable_slot() hold p_slot->lock while updating > > p_slot->state. > > > > You're setting "state = POWEROFF_STATE" while holding > > p_slot->hotplug_lock (not p_slot->lock). Four lines down, we set > > "state = STATIC_STATE", but this time we're holding p_slot->lock. > > > > What is the difference between the p_slot->lock and > > p_slot->hotplug_lock? Do we need both? How do we know which one to > > use? > > > > I'm very confused. > > This is _very_ confusing. :) > > The p_slot->hotplug_lock serializes a power off and a power on event. We > only want to set the state when one of those events gets to execute > rather than when the event is queued. Changing the state when the event > is queued can trigger a never ending up-down sequence. > > It currently looks safe to nest the p_slot->lock under the > p_slot->hotplug_lock if that is you recommendation. I'm not making a recommendation; that would take a lot more thought than I've given this. There are at least three locks in pciehp: struct controller.ctrl_lock struct slot.lock struct slot.hotplug_lock There shouldn't really be any performance paths in pciehp, so I'm pretty doubtful that we need such complicated locking. > Alternatively we could fix this if we used an ordered work queue for > the slot's work, but that is a significantly more complex change. You mean we can currently execute things from the work queue in a different order than they were enqueued? That sounds ... difficult to analyze, to say the least. I don't know much about work queues, and Documentation/workqueue.txt doesn't mention alloc_ordered_workqueue(). Is that what you are referring to? Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html