From: Li Nan <linan122@xxxxxxxxxx>

If a disk is removed and quickly reinserted while an I/O error is being
processed, the disk may not be able to be re-added. The function call
timeline is as follows:

  interrupt                               scsi_eh

  ahci_error_intr
   ata_port_freeze
    __ata_port_freeze
     =>ahci_freeze (turn IRQ off)
    ata_port_abort
     ata_do_link_abort
      ata_port_schedule_eh
       =>ata_std_sched_eh
        ata_eh_set_pending
         set EH_PENDING
        scsi_schedule_eh
         shost->host_eh_scheduled++ (=1)
                                          scsi_error_handler
                                           =>ata_scsi_error
                                            ata_scsi_port_error_handler
                                             clear EH_PENDING
                                             =>ahci_error_handler
                                             . sata_pmp_error_handler
                                             .  ata_eh_reset
                                             .   ata_eh_thaw_port
                                             .   . =>ahci_thaw (turn IRQ on)
  ahci_error_intr                            .   .
   ata_port_freeze                           .   .
    __ata_port_freeze                        .   .
     =>ahci_freeze (turn IRQ off)            .   .
    ...                                      .   .
        ata_eh_set_pending                   .   .
         set EH_PENDING                      .   .
        scsi_schedule_eh                     .   .
         shost->host_eh_scheduled++ (=2)     .   .
                                             .   clear EH_PENDING
                                             check EH_PENDING
                                             =>ata_std_end_eh
                                              host->host_eh_scheduled = 0;

'host_eh_scheduled' is now 0, so the scsi eh thread will not be scheduled
again. The ata port remains frozen and will never be re-enabled.

To fix this issue, decrement 'host_eh_scheduled' instead of setting it to 0,
so that EH is scheduled again to re-enable the port. Also move the reset of
'nr_active_links' to 0 into ata_scsi_port_error_handler(), and perform it
only when 'host_eh_scheduled' is 0.

Reported-by: luojian <luojian5@xxxxxxxxxx>
Signed-off-by: Li Nan <linan122@xxxxxxxxxx>
---
Changes in v3:
 - change patch title, previously it was: "scsi: ata: Fix a race condition
   between scsi error handler and ahci interrupt".
 - drop the variable 'host' in ata_std_end_eh().
 - improve commit message.
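
Note for reviewers: the lost-request problem can be illustrated with a
minimal userspace sketch. Everything named model_* below is hypothetical and
exists only for illustration; it is not libata code. The sketch assumes, as
the commit message does, that a nonzero 'host_eh_scheduled' causes the scsi
eh thread to run another pass.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one Scsi_Host field involved in the race. */
struct model_host {
	int host_eh_scheduled;
};

/* Models scsi_schedule_eh(): every EH request bumps the counter. */
static void model_schedule_eh(struct model_host *h)
{
	h->host_eh_scheduled++;
}

/* Models ata_std_end_eh() before the patch: all pending requests are wiped. */
static void model_end_eh_old(struct model_host *h)
{
	h->host_eh_scheduled = 0;
}

/* Models ata_std_end_eh() after the patch: only one request is consumed. */
static void model_end_eh_new(struct model_host *h)
{
	assert(h->host_eh_scheduled > 0);	/* guard the model against underflow */
	h->host_eh_scheduled--;
}

/*
 * Replays the timeline from the commit message and reports whether an EH
 * request is still pending once the first EH pass has finished.
 */
static bool model_replay(void (*end_eh)(struct model_host *))
{
	struct model_host h = { 0 };

	model_schedule_eh(&h);	/* first error interrupt: counter = 1 */
	/* EH thread runs, thaws the port, and a second interrupt arrives */
	model_schedule_eh(&h);	/* second error interrupt: counter = 2 */
	end_eh(&h);		/* first EH pass ends */

	return h.host_eh_scheduled > 0;
}
```

In this model, model_replay(model_end_eh_old) reports that nothing is pending,
which corresponds to the port staying frozen, while
model_replay(model_end_eh_new) still sees one outstanding request.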

 drivers/ata/libata-eh.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 159ba6ba19eb..2d5ecd68b7e0 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -735,6 +735,12 @@ void ata_scsi_port_error_handler(struct Scsi_Host *host, struct ata_port *ap)
 	 */
 	ap->ops->end_eh(ap);
 
+	if (!ap->scsi_host->host_eh_scheduled) {
+		/* make sure nr_active_links is zero after EH */
+		WARN_ON(ap->nr_active_links);
+		ap->nr_active_links = 0;
+	}
+
 	spin_unlock_irqrestore(ap->lock, flags);
 	ata_eh_release(ap);
 
@@ -946,9 +952,7 @@ EXPORT_SYMBOL_GPL(ata_std_sched_eh);
  */
 void ata_std_end_eh(struct ata_port *ap)
 {
-	struct Scsi_Host *host = ap->scsi_host;
-
-	host->host_eh_scheduled = 0;
+	ap->scsi_host->host_eh_scheduled--;
 }
 EXPORT_SYMBOL(ata_std_end_eh);
 
@@ -3922,10 +3926,6 @@ void ata_eh_finish(struct ata_port *ap)
 			}
 		}
 	}
-
-	/* make sure nr_active_links is zero after EH */
-	WARN_ON(ap->nr_active_links);
-	ap->nr_active_links = 0;
 }
 
 /**
-- 
2.39.2