On 6 September 2024 20:39:03 CEST, Igor Pylypiv <ipylypiv@xxxxxxxxxx> wrote: >On Fri, Sep 06, 2024 at 04:27:21PM +0200, Niklas Cassel wrote: >> When ata_qc_complete() schedules a command for EH using >> ata_qc_schedule_eh(), blk_abort_request() will be called, which leads to >> req->q->mq_ops->timeout() / scsi_timeout() being called. >> >> scsi_timeout(), if the LLDD has no abort handler (libata has no abort >> handler), will set host byte to DID_TIME_OUT, and then call >> scsi_eh_scmd_add() to add the command to EH. >> >> Thus, when commands first enter libata's EH strategy_handler, all the >> commands that have been added to EH will have DID_TIME_OUT set. >> >> libata has its own flag (AC_ERR_TIMEOUT), that it sets for commands that >> have not received a completion at the time of entering EH. >> >> Thus, we don't really care about DID_TIME_OUT at all, and currently clear >> the host byte at the end of EH, in ata_scsi_qc_complete(), before >> scsi_eh_finish_cmd() is called. >> >> ata_scsi_qc_complete() will be called both for commands that are completed >> normally (without going via EH), and for commands that went via EH. >> >> It seems more appropriate to clear DID_TIME_OUT at the start of EH instead >> of at the end of EH. That way, someone dumping the host byte at the middle >> of EH will not see DID_TIME_OUT as being set. No functional change >> intended. >> >> This has the additional advantage that we will not needlessly clear the >> host byte for commands that did not go via EH. >> >> Signed-off-by: Niklas Cassel <cassel@xxxxxxxxxx> > >Thank you for coming up with this generic solution, Niklas! > >I tested the patch with sg_sat_read_gplog. Without the fix we can see that >DID_TIME_OUT is being reported to user space. The issue is gone with the fix >applied. > >Without the fix: >---------------- >$ sg_sat_read_gplog -a 30 -p 9 --dma -vv /dev/sda >open /dev/sda with flags=0x802 >Building ATA READ LOG DMA EXT command; la=0x1e, pn=0x9, this_count=1 > ATA pass-through(16) cdb: [85 0d 0e 00 00 00 01 00 1e 00 09 00 00 00 47 00] >ATA pass-through(16): transport error: Host_status=0x03 [DID_TIME_OUT] >Driver_status=0x08 [DRIVER_SENSE] ^^^^^^^^^^^^ > >ATA pass-through(16): > Fixed format, current; Sense key: Illegal Request > ASC=24, ASCQ=00 (hex) > Info fld=0x4530001 [72548353] > Raw sense data (in hex), sb_len=64 > f0 00 05 04 53 00 01 0a 80 1e 09 00 24 00 00 00 > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > >ATA PASS-THROUGH (16) failed > >$ echo $? >99 > >With the fix: >------------- >$ sg_sat_read_gplog -a 30 -p 9 --dma -vv /dev/sda >open /dev/sda with flags=0x802 >Building ATA READ LOG DMA EXT command; la=0x1e, pn=0x9, this_count=1 > ATA pass-through(16) cdb: [85 0d 0e 00 00 00 01 00 1e 00 09 00 00 00 47 00] >ATA pass through: >Fixed format, current; Sense key: Illegal Request >ASC=24, ASCQ=00 (hex) > Info fld=0x4530001 [72548353] > >$ echo $? >5 > > >Tested-by: Igor Pylypiv <ipylypiv@xxxxxxxxxx> Thank you for testing! Since you point out that this is actually solving a real issue (for ATA PT commands), we should add a Fixes-tag, and improve the commit message to mention that it fixes ATA PT commands. Will send a V2 on Monday, with your Reported-by tag. Have a nice weekend! Kind regards, Niklas