On Tuesday, June 18, 2013 11:36:50 PM Jiang Liu wrote:
> On 06/17/2013 07:39 PM, Rafael J. Wysocki wrote:
> > On Monday, June 17, 2013 01:01:51 AM Jiang Liu wrote:
> >> On 06/16/2013 05:20 AM, Rafael J. Wysocki wrote:
> >>> On Saturday, June 15, 2013 10:17:42 PM Rafael J. Wysocki wrote:
> >>>> On Saturday, June 15, 2013 09:44:28 AM Jiang Liu wrote:
> >> [...]
> >>>> When it returns from unregister_hotplug_dock_device(), nothing prevents it
> >>>> from accessing whatever it wants, because ds->hp_lock is not used outside
> >>>> of the add/del and hotplug_dock_devices(). So, the actual role of
> >>>> ds->hp_lock (not the one that it is supposed to play, but the real one)
> >>>> is to prevent addition/deletion from happening when hotplug_dock_devices()
> >>>> is running. [Yes, it does protect the list, but since the list is in fact
> >>>> unnecessary, that doesn't matter.]
> >>>>
> >>>>> If we simply use a flag to mark presence of registered callback, we
> >>>>> can't achieve the second goal.
> >>>>
> >>>> I don't mean using the flag *alone*.
> >>>>
> >>>>> Take the sony laptop as an example.
> >>>>> It has several PCI hotplug slots associated with the dock station:
> >>>>> [ 28.829316] acpiphp_glue: _handle_hotplug_event_func: Bus check notify on \_SB_.PCI0.RP07.LPMB
> >>>>> [ 30.174964] acpiphp_glue: _handle_hotplug_event_func: Bus check notify on \_SB_.PCI0.RP07.LPMB.LPM0
> >>>>> [ 30.174973] acpiphp_glue: _handle_hotplug_event_func: Bus check notify on \_SB_.PCI0.RP07.LPMB.LPM1
> >>>>> [ 30.174979] acpiphp_glue: _handle_hotplug_event_func: Bus check notify on \_SB_.PCI0.RP07.LPMB.LPM2
> >>>>> [ 30.174985] acpiphp_glue: _handle_hotplug_event_func: Bus check notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR0.GFXA
> >>>>> [ 30.175020] acpiphp_glue: _handle_hotplug_event_func: Bus check notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR0.GHDA
> >>>>> [ 30.175040] acpiphp_glue: _handle_hotplug_event_func: Bus check notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR1.LPCI.LPC0.DLAN
> >>>>> [ 30.175050] acpiphp_glue: _handle_hotplug_event_func: Bus check notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR1.LPCI.LPC1.DODD
> >>>>> [ 30.175060] acpiphp_glue: _handle_hotplug_event_func: Bus check notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR1.LPCI.LPC2.DUSB
> >>>>>
> >>>>> So it still has some race windows if we undock the station while
> >>>>> repeatedly rescanning/removing
> >>>>> the PCI bus for \_SB_.PCI0.RP07.LPMB.LPM0 through sysfs interfaces.
> >>>
> >>> Which sysfs interfaces do you mean, by the way?
> >>>
> >>> If you mean "eject", then it takes acpi_scan_lock and hotplug_dock_devices()
> >>> should always be run under acpi_scan_lock too. It isn't at the moment,
> >>> because write_undock() doesn't take acpi_scan_lock, but this is an obvious
> >>> bug (so I'm going to send a patch to fix it in a while).
> >>>
> >>> With that bug fixed, the possible race between acpi_eject_store() and
> >>> hotplug_dock_devices() should be prevented from happening, so perhaps we're
> >>> worrying about something that cannot happen?
> >> Hi Rafael,
> >> I mean the "remove" method of each PCI device, and the "power" method
> >> of PCI hotplug slot here.
> >> These methods may be used to remove P2P bridges with associated ACPIPHP
> >> hotplug slots, which in turn will cause invoking of
> >> unregister_hotplug_dock_device().
> >> So theoretically we may trigger the bug by undocking while repeatedly
> >> adding/removing P2P bridges with ACPIPHP hotplug slot through PCI
> >> "rescan" and "remove" sysfs interface,
> >
> > Why don't we make these things take acpi_scan_lock upfront, then?
> Hi Rafael,
> Seems we can't rely on acpi_scan_lock here, it may cause another
> deadlock scenario:
> 1) thread 1 acquired the acpi_scan_lock and tries to destroy all sysfs
> interfaces for PCI devices.
> 2) thread 2 opens a PCI sysfs which then tries to acquire the
> acpi_scan_lock.

Well, maybe, but you didn't explain how this was going to happen. What code
paths are involved, etc.

Quite frankly, I've already run out of patience, sorry about that. It looks
like I need to go through the code and understand all of these problems myself.
Yes, it will take time.

Thanks,
Rafael

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
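
The two-step deadlock scenario quoted above is given without code paths. One
way it could arise is a circular wait, assuming (the thread does not confirm
this) that destroying a sysfs file waits for handlers already running in that
file to finish. Below is a minimal userspace C model of that assumption. It is
not kernel code, and every name in it (struct model, take_scan_lock(),
remove_sysfs_files(), and so on) is hypothetical.

```c
#include <stdbool.h>

/* What a modelled thread is currently blocked on (hypothetical model). */
enum wait_for { WAIT_NONE = 0, WAIT_SCAN_LOCK, WAIT_SYSFS_DRAIN };

struct model {
    int scan_lock_owner;      /* 0 = free, otherwise id of the holder */
    int sysfs_active_user;    /* 0 = none, otherwise id of the thread
                                 running inside a sysfs handler */
    enum wait_for waiting[3]; /* indexed by thread id 1..2 */
};

/* Take the modelled acpi_scan_lock, or record that the thread blocks. */
static bool take_scan_lock(struct model *m, int tid)
{
    if (m->scan_lock_owner == 0) {
        m->scan_lock_owner = tid;
        return true;
    }
    m->waiting[tid] = WAIT_SCAN_LOCK;
    return false;
}

/* The thread starts executing a sysfs attribute handler. */
static void enter_sysfs_handler(struct model *m, int tid)
{
    m->sysfs_active_user = tid;
}

/* ASSUMPTION: removal blocks until no handler is still running. */
static bool remove_sysfs_files(struct model *m, int tid)
{
    if (m->sysfs_active_user == 0)
        return true;
    m->waiting[tid] = WAIT_SYSFS_DRAIN;
    return false;
}

/* Circular wait: the lock holder waits for the handler to finish,
 * while the thread inside the handler waits for the lock. */
static bool is_deadlocked(const struct model *m)
{
    int owner = m->scan_lock_owner;
    int user = m->sysfs_active_user;

    return owner != 0 && user != 0 && owner != user &&
           m->waiting[owner] == WAIT_SYSFS_DRAIN &&
           m->waiting[user] == WAIT_SCAN_LOCK;
}

/* Steps 1) and 2) from the quoted scenario, interleaved. */
bool scenario_lock_in_handler(void)
{
    struct model m = {0};

    (void)take_scan_lock(&m, 1);     /* thread 1 holds the lock       */
    enter_sysfs_handler(&m, 2);      /* thread 2 opens a PCI sysfs    */
    (void)take_scan_lock(&m, 2);     /* thread 2 blocks on the lock   */
    (void)remove_sysfs_files(&m, 1); /* thread 1 blocks on the drain  */
    return is_deadlocked(&m);
}

/* Same removal with no handler in flight: nothing blocks. */
bool scenario_no_active_handler(void)
{
    struct model m = {0};

    (void)take_scan_lock(&m, 1);
    (void)remove_sysfs_files(&m, 1);
    return is_deadlocked(&m);
}
```

In this model scenario_lock_in_handler() reports a circular wait and
scenario_no_active_handler() does not. Whether the real code paths behave this
way is exactly the open question raised in the reply above.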