On 10/28/16 19:08, James Bottomley wrote:
> This is a deadlock caused by an inversion issue in kernfs (suicide vs
> non-suicide removes); so fixing it in SCSI alone really isn't
> appropriate. I count at least five other subsystems all using this
> mechanism, so they'll all be similarly affected. It looks to be fairly
> simply fixable inside kernfs, so please fix it that way.

Hello James,

Can you clarify this further? To me this looks like the result of how
the SCSI core works rather than an issue in the kernfs layer. My
interpretation of the deadlock report produced by the lockdep code is as
follows:
* The SCSI scanning code holds scan_mutex while creating sysfs
  attributes for a SCSI device. In this case scan_mutex is the outer
  mutex and s_active the inner locking object.
* scsi_remove_host() holds scan_mutex while removing sysfs attributes.
  Also in this case scan_mutex is the outer mutex and s_active the inner
  locking object.
* During self-removal (sysfs_remove_file_self() being called indirectly
  by kernfs_fop_write()), kernfs_fop_write() holds s_active while
  scsi_remove_device() is being called. In this case s_active is the
  outer locking object and scan_mutex the inner locking object.

I think that it is essential that kernfs_fop_write() holds s_active. So
to me this looks like a lock inversion issue that cannot be fixed by
modifying kernfs only. In other words, the SCSI core has to be modified
to fix this. Do you agree with this?

Thanks,

Bart.
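
For illustration only, the three acquisition orders listed above can be
put into a toy lock-order graph. This is a minimal Python sketch, not
kernel code, and it does not reproduce how lockdep is implemented. The
helper names record_orders() and find_inversion() and the PATHS table
are hypothetical; only the lock names and the call paths come from the
analysis above.

```python
from collections import defaultdict

def record_orders(paths):
    """Build a graph with an edge outer -> inner for each nested acquisition."""
    graph = defaultdict(set)
    for locks in paths.values():
        for outer, inner in zip(locks, locks[1:]):
            graph[outer].add(inner)
    return graph

def find_inversion(graph):
    """Return a list of locks forming a cycle, or None if the order is consistent."""
    visiting, done = [], set()

    def visit(lock):
        if lock in visiting:
            # Reached a lock that is already on the current path: a cycle.
            return visiting[visiting.index(lock):] + [lock]
        if lock in done:
            return None
        visiting.append(lock)
        for nxt in sorted(graph.get(lock, ())):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(lock)
        return None

    for lock in sorted(graph):
        cycle = visit(lock)
        if cycle:
            return cycle
    return None

# Each entry lists the locks in the order the path takes them (outer first).
PATHS = {
    "scan (create sysfs attributes)": ["scan_mutex", "s_active"],
    "scsi_remove_host()": ["scan_mutex", "s_active"],
    "self-removal via kernfs_fop_write()": ["s_active", "scan_mutex"],
}

cycle = find_inversion(record_orders(PATHS))
print(cycle)
```

With only the first two paths the model finds no cycle; adding the
self-removal path introduces the s_active -> scan_mutex edge that closes
the loop, which is the inversion described above.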