Re: [sata_sil] kernel 2.6.17(-mm2) test - timeout issue


 



Hello,

Martin Ammermüller wrote:
I tried the patch, but I couldn't see any changes in the kernel output. I
also noticed that there are actually two slightly different
error messages.

#1 (shorter one, without HSM violation):
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.00: (BMDMA stat 0x21)
ata1.00: tag 0 cmd 0xc8 Emask 0x4 stat 0x40 err 0x0 (timeout)

DRDY (device ready) is set and the DMA engine is active, but DRQ (data request) is not set during READ DMA. This looks to me like packet loss or corruption during the data transfer, but DRDY being set is a bit weird.
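For reference, the stat values in these logs (0x40 here, 0x58 and 0xD8 further down) can be decoded with something like the following. This is only a small userspace sketch, not libata code; the ST_* names and ata_status_str() are made up for illustration, and bit 4 is labelled with its legacy DSC meaning.

```c
#include <stddef.h>
#include <string.h>

/* ATA status register bits. The ST_* names are local to this sketch. */
enum {
	ST_BSY  = 0x80,	/* device busy */
	ST_DRDY = 0x40,	/* device ready */
	ST_DF   = 0x20,	/* device fault */
	ST_DSC  = 0x10,	/* seek complete (legacy meaning of bit 4) */
	ST_DRQ  = 0x08,	/* data request */
	ST_CORR = 0x04,	/* corrected data (obsolete) */
	ST_IDX  = 0x02,	/* index (obsolete) */
	ST_ERR  = 0x01,	/* error */
};

/* Write the names of all set bits into buf, separated by '|'. */
static const char *ata_status_str(unsigned char stat, char *buf, size_t len)
{
	static const struct { unsigned char bit; const char *name; } tbl[] = {
		{ ST_BSY, "BSY" }, { ST_DRDY, "DRDY" }, { ST_DF, "DF" },
		{ ST_DSC, "DSC" }, { ST_DRQ, "DRQ" }, { ST_CORR, "CORR" },
		{ ST_IDX, "IDX" }, { ST_ERR, "ERR" },
	};
	size_t i;

	if (len == 0)
		return buf;
	buf[0] = '\0';
	for (i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
		if (!(stat & tbl[i].bit))
			continue;
		if (buf[0])
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, tbl[i].name, len - strlen(buf) - 1);
	}
	return buf;
}
```

With a 64-byte buffer, 0x40 decodes to just DRDY, while 0xD8 additionally has BSY and DRQ set.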

ata1: port is slow to respond, please be patient
ata1: port failed to respond (30 secs)

Again, weird. libata times out waiting for DRDY. DRDY was set when the timeout occurred (as reported above), but when the EH reset code runs (which should have followed immediately), it sees !DRDY and times out waiting for it.

ata1: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/100
ata1: EH complete

SRST successfully recovers the device in this case.
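As a side note, the "SStatus 113 SControl 300" part of the link-up line can be split into its DET/SPD/IPM fields as below. This is a sketch that assumes the values are printed in hex and uses the standard SATA SCR field layout; the scr_*() helper names are hypothetical, not libata functions.

```c
#include <stdint.h>

/* Field layout shared by the SATA SStatus and SControl registers. */
static unsigned int scr_det(uint32_t v) { return v & 0xf; }        /* bits 3:0  */
static unsigned int scr_spd(uint32_t v) { return (v >> 4) & 0xf; } /* bits 7:4  */
static unsigned int scr_ipm(uint32_t v) { return (v >> 8) & 0xf; } /* bits 11:8 */

/* In SStatus, DET == 3 means device present and PHY communication established. */
static int sstatus_link_up(uint32_t sstatus)
{
	return scr_det(sstatus) == 3;
}
```

For SStatus 0x113 this gives DET=3 (link up), SPD=1 (1.5 Gbps) and IPM=1 (interface active). For SControl 0x300, IPM=3 means transitions to partial and slumber are disabled.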

#2 (longer, with HSM violation):
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x400000 action 0x2 frozen

SErr reports the handshake error (R_ERR seen) diagnostic bit but no error bit.
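To make that concrete: SError keeps its error bits in the low 16 bits and its diagnostic bits in the high 16 bits, and 0x400000 is bit 22, the handshake error diagnostic. A minimal sketch, with macro and function names invented here for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define SERR_ERR_MASK		0x0000ffffu	/* ERR field, bits 15:0 */
#define SERR_DIAG_MASK		0xffff0000u	/* DIAG field, bits 31:16 */
#define SERR_DIAG_HANDSHAKE	(1u << 22)	/* DIAG.H: handshake error (R_ERR seen) */

/* Error half of SError. */
static uint32_t serr_err_bits(uint32_t serr)
{
	return serr & SERR_ERR_MASK;
}

/* Diagnostic half of SError. */
static uint32_t serr_diag_bits(uint32_t serr)
{
	return serr & SERR_DIAG_MASK;
}

/* True if the handshake error diagnostic bit is set. */
static bool serr_handshake(uint32_t serr)
{
	return (serr & SERR_DIAG_HANDSHAKE) != 0;
}
```

For 0x400000 the error half is zero and only the handshake diagnostic is set.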

ata1.00: (BMDMA stat 0x20)
ata1.00: tag 0 cmd 0xc8 Emask 0x2 stat 0x58 err 0x0 (HSM violation)

The DMA engine is off and DRDY && DRQ are set. Again, this looks like a data transmission error, but considering that the transfer direction is from device to host (READ DMA), the error status is confusing.
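The difference between BMDMA stat 0x21 in the first case and 0x20 here is just the active bit. A rough sketch, assuming the standard SFF-8038i bus master status layout (sata_sil may define additional controller-specific bits that are not modelled here; the BM_* names are local to this example):

```c
#include <stdbool.h>

/* Bus master IDE status register bits, per the standard SFF-8038i layout. */
enum {
	BM_ACTIVE   = 0x01,	/* DMA engine active */
	BM_ERR      = 0x02,	/* DMA error */
	BM_INTR     = 0x04,	/* interrupt raised */
	BM_DRV0_CAP = 0x20,	/* drive 0 is DMA capable */
	BM_DRV1_CAP = 0x40,	/* drive 1 is DMA capable */
	BM_SIMPLEX  = 0x80,	/* simplex operation only */
};

struct bmdma_state {
	bool active;
	bool error;
	bool intr;
};

/* Pick out the bits that matter when reading the log lines. */
static struct bmdma_state bmdma_decode(unsigned char stat)
{
	struct bmdma_state s = {
		.active = (stat & BM_ACTIVE) != 0,
		.error  = (stat & BM_ERR) != 0,
		.intr   = (stat & BM_INTR) != 0,
	};
	return s;
}
```

Neither value has the error or interrupt bit set; 0x21 has the engine still running, 0x20 has it stopped.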

ata1: soft resetting port
ata1: port is slow to respond, please be patient
ata1: port failed to respond (30 secs)

prereset saw DRDY set this time, but after SRST, BSY is stuck at 1.

ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ATA: abnormal status 0xD8 on port 0xDCA44087
ATA: abnormal status 0xD8 on port 0xDCA44087
ATA: abnormal status 0xD8 on port 0xDCA44087
ATA: abnormal status 0xD8 on port 0xDCA44087
ATA: abnormal status 0xD8 on port 0xDCA44087
ata1.00: qc timeout (cmd 0xec)

We should really fail softreset if ata_busy_sleep() fails in ata_bus_post_reset(). In this case, softreset reports success after the timeout, causing the following revalidation to time out too.
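The idea, modelled in plain userspace C rather than as a real patch: the post-reset busy wait returns a status, and softreset propagates a failure instead of swallowing it. Everything below (softreset_model(), the busy_wait_fn callback and the two mock waits) is hypothetical and only stands in for the libata functions named above.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a "wait until BSY clears" helper:
 * returns 0 once BSY clears, nonzero if the wait timed out.
 */
typedef int (*busy_wait_fn)(void *port);

/* Model of a softreset that reports failure when the device stays busy. */
static int softreset_model(void *port, busy_wait_fn wait_not_busy)
{
	/* A real driver would issue SRST here. */

	if (wait_not_busy(port) != 0)
		return -EIO;	/* tell EH the reset failed so it can escalate */

	return 0;
}

/* Mock waits for demonstration only. */
static int wait_ok(void *port)      { (void)port; return 0; }
static int wait_timeout(void *port) { (void)port; return 1; }
```

With wait_timeout() the model returns -EIO, so the caller could go on to hardreset instead of attempting IDENTIFY on a device that is still busy.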

ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1: failed to recover some devices, retrying in 5 secs
ata1: hard resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/100
ata1: EH complete

hardreset successfully recovers the device.

Hmmm.. with the patch, sata_sil should have tried hardreset first in the second case. That's our third oddity. I think the problem can be worked around by...

1. using a shorter timeout for READ/WRITE commands. 30s is *way* too long.

2. making the reset procedure more intelligent. There's no reason to wait the full 30s before and after softreset unless it's for hotplug. It should switch to hardreset if the device doesn't respond within several seconds. Being responsive while still giving the device enough time eventually shouldn't be too difficult.

Item 1 shouldn't be difficult, but we need to be careful. Item 2 might take some time to implement.
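A rough model of what the second item could look like: a table of reset attempts, each with its own timeout, tried in order until one succeeds. This is a userspace sketch only. The types, the mock reset functions and the timeout values (5s, 5s, 30s) are assumptions for illustration, not existing libata code or agreed numbers.

```c
#include <stddef.h>

/* A reset method returns 0 on success, nonzero on failure or timeout. */
typedef int (*reset_fn)(void *port, unsigned int timeout_ms);

struct reset_step {
	const char	*name;
	reset_fn	fn;
	unsigned int	timeout_ms;
};

/*
 * Try each step in order and stop at the first one that succeeds.
 * Returns the index of the successful step, or -1 if all of them failed.
 */
static int reset_escalate(void *port, const struct reset_step *steps, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (steps[i].fn(port, steps[i].timeout_ms) == 0)
			return (int)i;
	return -1;
}

/* Mocks: softreset leaves the device stuck busy, hardreset recovers it. */
static int mock_soft_stuck(void *port, unsigned int timeout_ms)
{
	(void)port; (void)timeout_ms;
	return -1;
}

static int mock_hard_ok(void *port, unsigned int timeout_ms)
{
	(void)port; (void)timeout_ms;
	return 0;
}

/* Short waits first; the last step still gives the device the full 30s. */
static const struct reset_step demo_steps[] = {
	{ "softreset", mock_soft_stuck,  5000 },
	{ "hardreset", mock_hard_ok,     5000 },
	{ "hardreset", mock_hard_ok,    30000 },
};
```

Calling reset_escalate(NULL, demo_steps, 3) with these mocks returns 1: softreset gives up after its short timeout and the first hardreset recovers the device, which matches what happened in the second log above, minus the two 30 second waits.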

I'm not sure why the previous patch didn't kick in. The condition should have been caught and EH_HARDRESET requested. Can you please double-check that the patched kernel is running? You can put a little printk() after the freeze: label in sil_interrupt() to be sure. That's the only place where sata_sil freezes the port, other than on timeout.

Thanks.

--
tejun