On Fri, 2016-11-04 at 12:17 -0600, Bart Van Assche wrote: > On 11/04/2016 07:47 AM, James Bottomley wrote: > > You know after > > > > if (device_remove_file_self(dev, attr)) > > > > returns true that s_active is held and also that KERNFS_SUICIDAL is > > set on the node, so the non-self remove paths can simply check for > > this flag and return without descending into __kernfs_remove(), > > which would mean they never take s_active. That means we never get > > the inversion. > > Hello James, > > The lock inversion is not triggered by the non-self remove paths but > by the self-remove path. I think we should agree first what the problem is. The inversion occurs between the sysfs delete path and the device node delete caused by a remove host. When both are happening the inversion is that when if (device_remove_file_self(dev, attr)) scsi_remove_device(to_scsi_device(dev)); Happens, after the if, the s_active lock is held then scsi_remove_device goes on to take the scan_mutex. Conversely in scsi_remove_host, the mutex is taken first then scsi_forget_host iterates removing the devices, but sysfs file removal eventually takes s_active in kernfs_drain, which is called from kernfs_remove via kernfs_remove_by_name_ns, hence the inversion. This is therefore a conflict between the self and non-self remove paths. > Anyway, can you have a look at the two attached > patches? Well, they're still overly complex, but perhaps due to the misunderstanding above? If you look through that trigger case, you'll see that the fix is simply to check KERNFS_SUICIDAL in kernfs_remove_by_name_ns and not descend into __kernfs_remove() if it's set. I think kernfs_mutex mediates this, but probably only if it's moved lower down in kernfs_drain. James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html