On Thu, 2016-11-03 at 16:27 -0600, Bart Van Assche wrote: > On 10/28/2016 08:08 PM, James Bottomley wrote: > > This is a deadlock caused by an inversion issue in kernfs (suicide > > vs > > non-suicide removes); so fixing it in SCSI alone really isn't > > appropriate. I count at least five other subsystems all using this > > mechanism, so they'll all be similarly affected. It looks to be > > fairly > > simply fixable inside kernfs, so please fix it that way. > > Hello James, > > How about fixing this deadlock with the below patch? Without a description file, it's a bit hard to follow, but reading the patch it's somewhat cumbersome and also incomplete: you change the API for self removal, so every driver that uses the device_remove_file_self() scheme (about five of them) has to be changed. I was actually thinking of something much simpler: You know after if (device_remove_file_self(dev, attr)) returns true that s_active is held and also that KERNFS_SUICIDAL is set on the node, so the non-self remove paths can simply check for this flag and return without descending into __kernfs_remove(), which would mean they never take s_active. That means we never get the inversion. This should be about a 10 line patch because there are only two places this check needs to be done: kernfs_remove() and kernfs_remove_by_name_ns(). It has the advantage that the api isn't altered so all the other driver callers simply pick up the benefit. There may be a bit of back end cleanup to do on the eventual parent directory removal because I think it needs to wait for all the suicidal nodes to signal they've finally suicided. James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html