On Thu, Sep 22, 2011 at 5:16 PM, Prarit Bhargava <prarit@xxxxxxxxxx> wrote: > This patch resolves a process hang in which a PCI card with a PCI-to-PCI > bridge is removed via the acpiphp driver. > > The issue is that during the remove of the card an ACPI event occurs > which is erroneously added the kacpi_notify_wq queue. This add is done in > acpi_ev_queue_notify_request() via a call to acpi_os_execute(). > > Eventually, the event runs on the kacpi_notify_wq and the code attempts to > remove the card. During the hotplug remove of the device, the following > call sequence happens > > cleanup_p2p_bridge() > -> cleanup_bridge() > -> acpi_remove_notify_handler() > -> acpi_os_wait_events_complete() > -> flush_workqueue(kacpi_notify_wq) > > which is the queue we are currently executing on and the process will hang. > > The problem is that the event is placed on the wrong queue. The code already > contains a kacpi_hotplug_wq queue for hotplug events to be added to in order > to prevent hangs like this. Ugh. I see the issue, and I see how this patch works around it. But I don't really like it because we're adding complexity and (I think) extending knowledge of the special hotplug queue into the CA, when I don't think we should have added the hotplug queue in the first place. I think the main reason we need the hotplug queue (added in c02256be79a) is that we have driver .remove() methods calling acpi_os_wait_events_complete() via acpi_remove_notify_handler(). I would argue that drivers should not be using acpi_remove_notify_handler() in the first place. Adding/removing notify handlers should be done by the ACPI core itself. The acpiphp notify handlers (handle_hotplug_event_bridge() and handle_hotplug_event_func()) are for bus-level, device-independent events that should be handled not by the driver, but by the ACPI core. Obviously, the ACPI core doesn't have hotplug support today (it has nice "TBDs" in the EJECT_REQUEST/DEVICE_CHECK cases), so I guess all I can do is complain without proposing a good alternative. I hate that, sorry :) I think this is a very important issue and merits a Bugzilla reference so we can revisit it in the future. > Cc: mjg@xxxxxxxxxx > Signed-off-by: Prarit Bhargava <prarit@xxxxxxxxxx> > --- > drivers/acpi/acpica/evmisc.c | 13 ++++++++++--- > 1 files changed, 10 insertions(+), 3 deletions(-) > > diff --git a/drivers/acpi/acpica/evmisc.c b/drivers/acpi/acpica/evmisc.c > index d0b3318..a78aecc 100644 > --- a/drivers/acpi/acpica/evmisc.c > +++ b/drivers/acpi/acpica/evmisc.c > @@ -181,9 +181,16 @@ acpi_ev_queue_notify_request(struct acpi_namespace_node * node, > notify_info->notify.value = (u16) notify_value; > notify_info->notify.handler_obj = handler_obj; > > - status = > - acpi_os_execute(OSL_NOTIFY_HANDLER, acpi_ev_notify_dispatch, > - notify_info); > + /* Is this an Ejection Request? */ > + if (notify_value == 0x03) Don't you want ACPI_NOTIFY_EJECT_REQUEST here instead of "0x03"? I'm not confident that EJECT_REQUEST is the only value that might be a problem. For example, ACPI_NOTIFY_DEVICE_CHECK in handle_hotplug_event_{bridge,func} () leads to acpiphp_check_bridge(), which can also wind up calling acpi_remove_notify_handler(). > + status = > + acpi_os_hotplug_execute(acpi_ev_notify_dispatch, > + notify_info); > + else > + status = acpi_os_execute(OSL_NOTIFY_HANDLER, > + acpi_ev_notify_dispatch, > + notify_info); > + > if (ACPI_FAILURE(status)) { > acpi_ut_delete_generic_state(notify_info); > } > -- > 1.6.5.2 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-acpi" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html