On Fri, 2013-02-15 at 13:49 +0100, Rafael J. Wysocki wrote: > On Thursday, February 14, 2013 05:28:02 PM Toshi Kani wrote: > > On Fri, 2013-02-15 at 01:23 +0100, Rafael J. Wysocki wrote: > > > On Thursday, February 14, 2013 11:45:27 PM Moore, Robert wrote: > > > > > > > > > -----Original Message----- > > > > > From: Rafael J. Wysocki [mailto:rjw@xxxxxxx] > > > > > Sent: Thursday, February 14, 2013 12:59 PM > > > > > To: Moore, Robert > > > > > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang Liu; > > > > > Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe; linux-pci@xxxxxxxxxxxxxxx > > > > > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and > > > > > memory leaks > > > > > > > > > > On Thursday, February 14, 2013 08:45:14 PM Moore, Robert wrote: > > > > > > > > > > > > > -----Original Message----- > > > > > > > From: Rafael J. Wysocki [mailto:rjw@xxxxxxx] > > > > > > > Sent: Thursday, February 14, 2013 4:04 AM > > > > > > > To: Moore, Robert > > > > > > > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang > > > > > > > Liu; Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe; > > > > > > > linux-pci@xxxxxxxxxxxxxxx > > > > > > > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues > > > > > > > and memory leaks > > > > > > > > > > > > > > On Thursday, February 14, 2013 02:31:22 AM Moore, Robert wrote: > > > > > > > > > > > I thought about that, but actually there's no guarantee that > > > > > > > > > > > the handle will be valid after _EJ0 as far as I can say. So > > > > > > > > > > > the race condition is going to be there anyway and using > > > > > > > > > > > struct acpi_device just makes it easier to avoid it. > > > > > > > > > > > > > > > > > > > > In theory, yes, a stale handle could be a problem, if _EJ0 > > > > > > > > > > performs unload table and if ACPICA frees up its internal data > > > > > > > > > > structure pointed by the handle as a result. But we should > > > > > > > > > > not see such issue now since we do not support dynamic ACPI > > > > > > > > > > namespace > > > > > > > yet. > > > > > > > > > > > > > > > > > > I'm waiting for information from Bob about that. If we can > > > > > > > > > assume ACPI handles to be always valid, that will simplify > > > > > > > > > things quite a > > > > > > > bit. > > > > > > > > > > > > > > > > If a table is unloaded, all the namespace nodes for that table are > > > > > > > > removed from the namespace, and thus any ACPI_HANDLE pointers go > > > > > > > > stale > > > > > > > and invalid. > > > > > > > > > > > > > > OK, thanks! > > > > > > > > > > > > > > To me this means that we cannot assume a handle to stay valid > > > > > > > between a notify handler and acpi_bus_hot_remove_device() run from a > > > > > workqueue. > > > > > > > > > > > > > > Is there a mechanism in ACPICA to ensure that a handle won't become > > > > > > > stale while a notify handler is running for it or is the OS > > > > > > > responsible for ensuring that > > > > > > > _EJ0 won't be run in parallel with notify handlers for device > > > > > > > objects being ejected? > > > > > > > > > > > > > > > > > > > It is up to the host. > > > > > > > > > > I was afraid that that might be the case. :-) > > > > > > > > > > So far the (Linux) host has been happily ignoring that potential problem, > > > > > so I guess it can still be ignored for a while, although we'll need to > > > > > address it eventually at one point. > > > > > > > > I would think it should be fairly simple to setup a mechanism to either tell > > > > the driver or for the driver to figure it out -- such that the driver knows > > > > that all handles associated with the device are now invalid. Another way > > > > to look at it is that when the device is re-installed, the driver should > > > > reinitialize such that it obtains new handles for the devices and subobjects > > > > in question. > > > > > > Unfortunately, there is quite strong assumption in our code that ACPI handles > > > will not become stale before the device objects associated with them are > > > removed. For this reason, we need to know in advance which handles will > > > become stale as a result of a table unload and remove their device objects > > > beforehand. > > > > > > Moreover, when there's a notify handler installed for a given ACPI handle > > > and that handle becomes stale while the notify handler is running, we'll be > > > in trouble. To avoid that we need to ensure that table unloads and notifies > > > will always be mutually exclusive. > > > > I wonder if we can make acpi_ns_validate_handle() to actually be able to > > verify if a given handle is valid. This way, ACPICA can fail gracefully > > (AE_BAD_PARAMETER) when a stable handle is passed to the interfaces. > > That'd be good, but to implement it, I think, it would be necessary to > introduce some reference counting of namespace objects such that the given > object would only be deleted after the last reference to it had been dropped. > On table unload it would just be marked as invalid, but it would stay in > memory as long as there were any references to it. > > So, for example, a notify handler would start from something like > acpi_add_reference(handle), which would guarantee that the object pointed to by > handle would stay in memory, and it would finish by doing > acpi_drop_reference(handle) or a work item scheduled by it would do that. > > We do that for objects based on struct device and it works well. There is other way to implement it. Since acpi_handle is defined as an opaque value, this could be changed to an index to an array of pointers, instead of a direct pointer. Then we can safely invalidate an index by invalidating the pointer associated with the index. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html