On Thu, Aug 10, 2023 at 09:48:48AM +0800, linan666@xxxxxxxxxxxxxxx wrote:
> From: Li Nan <linan122@xxxxxxxxxx>
>
> interrupt                            scsi_eh
>
> ahci_error_intr
>   =>ata_port_freeze
>     =>__ata_port_freeze
>       =>ahci_freeze (turn IRQ off)
>     =>ata_port_abort
>       =>ata_port_schedule_eh
>         =>shost->host_eh_scheduled++;
>         host_eh_scheduled = 1
>                                      scsi_error_handler
>                                        =>ata_scsi_error
>                                          =>ata_scsi_port_error_handler
>                                            =>ahci_error_handler
>                                            . =>sata_pmp_error_handler
>                                            .   =>ata_eh_thaw_port
>                                            .     =>ahci_thaw (turn IRQ on)
> ahci_error_intr                            .
>   =>ata_port_freeze                        .
>     =>__ata_port_freeze                    .
>       =>ahci_freeze (turn IRQ off)         .
>     =>ata_port_abort                       .
>       =>ata_port_schedule_eh               .
>         =>shost->host_eh_scheduled++;      .
>         host_eh_scheduled = 2              .
>                                            =>ata_std_end_eh
>                                              =>host->host_eh_scheduled = 0;

Hello Li Nan,

I do not understand why the code in:
https://github.com/torvalds/linux/blob/v6.5-rc7/drivers/ata/libata-eh.c#L722-L731

does not kick in, and repeats EH.

EH_PENDING is cleared before ->error_handler() is called:
https://github.com/torvalds/linux/blob/v6.5-rc7/drivers/ata/libata-eh.c#L697

So ahci_error_intr() from the second error interrupt, which is called after
thawing the port, should have called ata_std_sched_eh(), which calls
ata_eh_set_pending(), which should have set EH_PENDING:
https://github.com/torvalds/linux/blob/v6.5-rc7/drivers/ata/libata-eh.c#L884

My only guess is that after thawing the port:
https://github.com/torvalds/linux/blob/v6.5-rc7/drivers/ata/libata-eh.c#L2807

The second error irq comes, and sets EH_PENDING,
but then this silly code might clear it:
https://github.com/torvalds/linux/blob/v6.5-rc7/drivers/ata/libata-eh.c#L2825-L2837

I think the best way would be if we could improve this "spurious error
condition check"... because if this is indeed the code that clears EH_PENDING
for you, then this code basically makes the "goto repeat" code in
ata_scsi_port_error_handler() useless...
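
The suspected sequence can be sketched as a toy model in plain C. This is
not kernel code: struct toy_port and the toy_* helpers are hypothetical
names made up for illustration, and the single boolean parameter stands in
for whether the "spurious error condition check" clears EH_PENDING after
the second interrupt has already set it.

```c
#include <stdbool.h>

/* Toy stand-in for the port state discussed above; not a kernel structure. */
struct toy_port {
	bool eh_pending;        /* models the EH_PENDING flag */
	int host_eh_scheduled;  /* models shost->host_eh_scheduled */
};

/* Models an error interrupt: EH is requested and the counter is bumped. */
static void toy_error_irq(struct toy_port *p)
{
	p->eh_pending = true;
	p->host_eh_scheduled++;
}

/*
 * Models one EH pass during which a second error interrupt arrives right
 * after the port is thawed. Returns true if the pass would "goto repeat".
 */
static bool toy_eh_pass(struct toy_port *p, bool spurious_check_clears)
{
	p->eh_pending = false;          /* cleared before ->error_handler() */
	toy_error_irq(p);               /* second error irq, just after thaw */
	if (spurious_check_clears)
		p->eh_pending = false;  /* the check wipes the new request */
	return p->eh_pending;           /* checked by the repeat logic */
}
```

In this model, toy_eh_pass() only reports a repeat when the check does not
clear the flag. With the clearing enabled, the second request is lost even
though host_eh_scheduled has already reached 2, which matches the trace
quoted above.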

An alternative to improving the "spurious error condition check" might be for
you to try something like:

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 35e03679b0bf..82f032934ae1 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -962,7 +962,7 @@ void ata_std_end_eh(struct ata_port *ap)
 {
 	struct Scsi_Host *host = ap->scsi_host;
 
-	host->host_eh_scheduled = 0;
+	host->host_eh_scheduled--;
 }
 EXPORT_SYMBOL(ata_std_end_eh);
 

...and see if that improves things for you.


Kind regards,
Niklas