On Wed, 2016-12-07 at 08:55 -0800, Bart Van Assche wrote:
> On 12/07/2016 08:48 AM, Bart Van Assche wrote:
> > It's a known bug. Some time ago I posted a patch that serializes all
> > scsi_device_set_state() calls but I have not yet found it in the list
> > archives. However, that patch has not yet been merged.
>
> See also https://www.spinics.net/lists/linux-scsi/msg66966.html.
>
> Bart.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html

Yes, however that patch does not fix Wei Fang's issue. In fact I just
received a crash dump that appears to be the same thing.

It looks like the rport went away right after the initial INQUIRY, so
we set the state to SDEV_BLOCK and stop the queue, and then the scan
code continues and sets the state back to SDEV_RUNNING. Then, when the
devloss timer expires, we call scsi_target_unblock w/SDEV_TRANSPORT_OFFLINE,
but the SDEV_RUNNING state prevents the queue from being restarted, so a
subsequent command (i.e. the ALUA page 83 inquiry command) is stuck on
the stopped queue.

(The dump shows 3 devices on the target with queues running in
SDEV_TRANSPORT_OFFLINE, and 1 device currently being scanned with the
queue stopped in SDEV_RUNNING.)

It seems to me the problem is that scsi_device_set_state() is allowing
the caller to transition SDEV_BLOCK -> SDEV_RUNNING without actually
restarting the queue and that should be an illegal transition.

-Ewan
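
To make the sequence above concrete, here is a minimal toy model of it.
This is only a sketch under stated assumptions, not kernel code: the
toy_* names, the struct layout and the "strict" flag are all
hypothetical, and the unblock path is simplified to "restart the queue
only if the device is still blocked", which is how the behaviour is
described above. It replays block, set-state-to-running, unblock, and
reports whether the device ends up in the running state with a stopped
queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: state names mirror the thread, nothing here is kernel code. */
enum toy_state { TOY_RUNNING, TOY_BLOCK, TOY_TRANSPORT_OFFLINE };

struct toy_sdev {
    enum toy_state state;
    bool queue_stopped;
};

/*
 * Hypothetical stand-in for a set-state helper. It never touches the queue.
 * With strict == false it accepts BLOCK -> RUNNING (the behaviour complained
 * about above); with strict == true it rejects that one transition.
 */
static int toy_set_state(struct toy_sdev *d, enum toy_state next, bool strict)
{
    if (strict && d->state == TOY_BLOCK && next == TOY_RUNNING)
        return -1;
    d->state = next;
    return 0;
}

/* The rport goes away: mark the device blocked and stop its queue. */
static void toy_block(struct toy_sdev *d)
{
    d->state = TOY_BLOCK;
    d->queue_stopped = true;
}

/*
 * The devloss timer fires: move to the new state and restart the queue,
 * but only if the device is still blocked (simplifying assumption).
 */
static void toy_unblock(struct toy_sdev *d, enum toy_state next)
{
    if (d->state != TOY_BLOCK)
        return;
    d->state = next;
    d->queue_stopped = false;
}

/* Replays the sequence; returns 1 if the device ends RUNNING with a stopped queue. */
int replay_scan_race(bool strict)
{
    struct toy_sdev d = { TOY_RUNNING, false };

    toy_block(&d);                                /* rport lost after INQUIRY */
    assert(d.state == TOY_BLOCK && d.queue_stopped);

    (void)toy_set_state(&d, TOY_RUNNING, strict); /* scan code continues */
    toy_unblock(&d, TOY_TRANSPORT_OFFLINE);       /* devloss timer expires */

    return d.state == TOY_RUNNING && d.queue_stopped;
}
```

In this model replay_scan_race(false) returns 1 (the stuck case seen in
the dump), while replay_scan_race(true) returns 0 because the rejected
transition leaves the device blocked, so the later unblock restarts the
queue and lands in the transport-offline state.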