Hi again,

>> Hey, I did not want to trigger an earthquake in the basement of the kernel ;-)
>> My intention was to prevent some crashes, and help developers to find their bugs.
>> I think my patch exactly does this.
>
> Hehe, actually drivers not being able to block unbind has been bugging me
> for a while now, because there are cases where this would be really helpful.

>>> 1) make resources refcounted, have child resources take a ref on the parent
>>> 2) Disallow unbind on devices with bound child-devices?
>>>
>> Exactly what I was thinking of in first attempts.
>> But I fear that would break even more use cases.
>>
>> Hans, directly regarding the driver:
>> The problem I see is that the xhci_intel_unregister_pdev which is added
>> as an action with devm_add_action_or_reset() is called late by the framework,
>> later than the usb_hcd_pci_remove() in xhci_pci_remove.
>> Is there any chance to trigger this before?
>> This is what Greg meant with "right order".
>
> Ah, I missed that part, sure that should be easy, just stop using
> devm_add_action_or_reset() and do the xhci_intel_unregister_pdev()
> manually at the right time. The downside of this is that you also
> need to make sure it happens at the right time from probe error-paths
> but given the bug you are hitting, I guess that is probably
> already a problem.
>

@Hans: Sure, will have a look at this. I think I have found where to do
that, but I need to check how to get the pdev pointer there ....

@Greg: I am still confident that my patch in __release_region should be
taken in.

Situation now without my patch:
If a device driver (or whatever) releases a resource, the owner of a
child resource gets no notification that the parent is gone. The child
will still access the parent (at the latest when it tries to free its own
resource), but the memory at the parent's location may have changed by
then. The result might be an access to unmapped memory or something else:
an oops, and we don't know why.
That is what I experienced and am hunting.

Situation with the patch applied:
The owner gets a notification (parent=NULL) and we have an indication in
the kernel log. If the owner of a resource whose parent is gone checks
for the parent, we are fine. If the owner doesn't check, we get a NULL
pointer dereference together with a warning message pointing to the root
cause.

Isn't it better to have a crash that points to its cause than unreliable,
racy crashes in such a case?

Have a nice weekend.

Best regards
Carsten
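
P.S. To make the two situations above concrete, here is a small userspace
sketch of the idea. It is not the actual patch and not kernel code:
struct res, res_add_child(), res_release() and res_release_checked() are
made-up stand-ins for struct resource and the __release_region() path, and
the tree handling is heavily simplified (no locking, no address ranges).
The only point is the behaviour: releasing a parent logs a warning and
sets parent = NULL in every child that is still attached.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the kernel's struct resource. */
struct res {
	const char *name;
	struct res *parent;
	struct res *child;	/* first child */
	struct res *sibling;	/* next child of the same parent */
};

/* Link child below parent (prepends to the parent's child list). */
static void res_add_child(struct res *parent, struct res *child)
{
	assert(parent && child);
	child->parent = parent;
	child->sibling = parent->child;
	parent->child = child;
}

/*
 * Release r. Children that are still attached are orphaned:
 * their parent pointer is set to NULL and a warning is logged.
 * Returns the number of children that were orphaned.
 */
static int res_release(struct res *r)
{
	struct res *c = r->child;
	int orphans = 0;

	while (c) {
		struct res *next = c->sibling;

		fprintf(stderr, "warning: releasing %s while child %s is still attached\n",
			r->name, c->name);
		c->parent = NULL;	/* the "notification" to the child's owner */
		c->sibling = NULL;
		orphans++;
		c = next;
	}
	r->child = NULL;

	/* Unlink r from its own parent's child list, if it has a parent. */
	if (r->parent) {
		struct res **p = &r->parent->child;

		while (*p && *p != r)
			p = &(*p)->sibling;
		if (*p)
			*p = r->sibling;
		r->parent = NULL;
	}
	r->sibling = NULL;
	return orphans;
}

/*
 * What a careful owner of a child resource could do: check the parent
 * before touching it. Returns -1 if the parent is already gone.
 */
static int res_release_checked(struct res *r)
{
	if (!r->parent)
		return -1;
	res_release(r);
	return 0;
}
```

With a parent "bus" and a child "dev", calling res_release() on the parent
first leaves dev.parent == NULL, so a later res_release_checked() on the
child returns -1 instead of walking into freed memory. An owner that does
not check would dereference NULL right there, which is the deterministic
crash I am arguing for.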