On 09/23/2010 12:17 AM, michaelc@xxxxxxxxxxx wrote:
From: Mike Christie<michaelc@xxxxxxxxxxx> If a lld does: ret = fc_block_scsi_eh(cmnd); if (ret) return ret; in the eh callbacks, then it could cause the following race: 1 the LLD will call fc_block_scsi_eh from the scsi eh thread. 2 From the FC class thread, the fast io fail tmo will fire and set FC_RPORT_FAST_FAIL_TIMEDOUT, then begin to call terminate_rport_io. 3 The scsi eh thread and the LLD will then break from the fc_block_scsi_eh block and will return FAST_IO_FAIL. 4 The scsi eh will then assume it owns the command and will start to process it. It will call scsi_eh_flush_done_q which might fail it or retry it. 5 But then in the FC class thread, the LLD terminate_rport_io callback could be processing the IO and possibly accessing a scsi_cmnd struct that the scsi eh thread has now started to retry or failed and reallocated to a new request in #4. This patch has fc_block_scsi_eh wait until the terminate_rport_io callback has completed before returning. This allows LLDs to not have to worry about the race.
I think this is not going to work. It looks like for drivers like lpfc and even qla2xxx in the ISP_ABORT case, because even after terminate_rport_io has completed the driver can still touch the scsi_cmnd.
-- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html