On Tue, 2016-05-31 at 16:38 +0800, Wei Fang wrote: > sas_ata_strategy_handler() adds the works of the ata error handler > to system_unbound_wq. This workqueue asynchronously runs work items, > so the ata error handler will be performed concurrently on different > CPUs. In this case, ->host_failed will be decreased simultaneously in > scsi_eh_finish_cmd() on different CPUs, and become abnormal. > > It will lead to permanently inequal between ->host_failed and > ->host_busy, and scsi error handler thread won't become running. > IO errors after that won't be handled forever. > > Use atomic type for ->host_failed to fix this race. As I said previously, you don't need atomics to do this, could you just remove the decrement in scsi_eh_finish_command() and zero the counter after the strategy handler completes. Thanks, James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html