On Wed, Sep 28, 2011 at 9:41 PM, canquan.shen <shencanquan@xxxxxxxxxx> wrote: > > On 2011/9/29 11:14, Chen Gong wrote: >> >> 于 2011/9/24 14:10, canquan.shen 写道: >>> >>> We run linux as a guest in Xen environment. When we used the xen tools >>> (xm vcpu-set <n>) to hot add and remove vcpu to and from the guest, we >>> encountered the failure on vcpu removal. We found the reason is that it >>> did't go to really remove cpu in the cpu removal code path. >>> >>> This patch adds acpi_bus_hot_remove_device in >>> acpi_process_hotplug_notify to >>> fix this issue. With this patch, it works fine for us. >>> >>> Signed-off-by: Canquan Shen <shencanquan@xxxxxxxxxx> >>> --- >>> drivers/acpi/processor_driver.c | 13 +------------ >>> drivers/acpi/scan.c | 4 ++-- >>> include/acpi/acpi_bus.h | 1 + >>> 3 files changed, 4 insertions(+), 14 deletions(-) >>> >>> diff --git a/drivers/acpi/processor_driver.c >>> b/drivers/acpi/processor_driver.c >>> index a4e0f1b..8429688 100644 >>> --- a/drivers/acpi/processor_driver.c >>> +++ b/drivers/acpi/processor_driver.c >>> @@ -665,18 +665,7 @@ static void >>> acpi_processor_hotplug_notify(acpi_handle handle, >>> case ACPI_NOTIFY_EJECT_REQUEST: >>> ACPI_DEBUG_PRINT((ACPI_DB_INFO, >>> "received ACPI_NOTIFY_EJECT_REQUEST\n")); >>> - >>> - if (acpi_bus_get_device(handle, &device)) { >>> - printk(KERN_ERR PREFIX >>> - "Device don't exist, dropping EJECT\n"); >>> - break; >>> - } >>> - pr = acpi_driver_data(device); >>> - if (!pr) { >>> - printk(KERN_ERR PREFIX >>> - "Driver data is NULL, dropping EJECT\n"); >>> - return; >>> - } >>> + acpi_bus_hot_remove_device(handle); >> >> As the description in __acpi_os_execute(in acpi_os_hotplug_execute), >> /* >> * We can't run hotplug code in keventd_wq/kacpid_wq/kacpid_notify_wq >> * because the hotplug code may call driver .remove() functions, >> * which invoke flush_scheduled_work/acpi_os_wait_events_complete >> * to flush these workqueues. >> */ >> If so, why not using following call: >> >> acpi_os_hotplug_execute(acpi_bus_hot_remove_device, acpi_device->handle); >> >> > > It is ok if using the acpi_os_hotplug_execute. but it is complex and more time for removal cpu because it is add to queue and some time the work will be called. > I think that it is clear to call directly acpi_bus_hot_remove_device function in acpi_processor_hotplug_notify. Chen, you're right that the CPU hot-remove notifier is running in a workqueue, we are proposing that the notifier call acpi_bus_hot_remove_device(), which calls the .remove() method, and that can cause a deadlock if .remove() waits for the workqueue to be flushed. The usual way this deadlock happens is when the .remove() method uses acpi_remove_notify_handler(). But the processor_driver .remove() method doesn't do anything with notify handler registration; the handler is registered/unregistered by the module init/exit functions. So I don't think we need to use acpi_os_hotplug_execute() in this case because I don't think there's a risk of deadlock. However, if new CPU devices appear in the namespace after module-init, I don't think processor_driver will handle them correctly. This looks like just another artifact of our screwed-up ACPI hotplug handling. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html