It is a good fix. We experienced a similar issue [a bad drive was passing
hard reset but would not answer a soft reset]. Without the patch the
machine would hang. With it, the error is found.

This problem was introduced a long time ago, in commit
2cbb79ebbd4be07041368da5379a64f89f8ad518; it is there in 2.6.33, in
drivers/ata/ahci.c. I can send you the trivial patch for it if you want,
for the stable kernel.

Gwendal.

On Tue, Aug 24, 2010 at 9:27 AM, Tejun Heo <tj@xxxxxxxxxx> wrote:
> On 08/23/2010 05:11 PM, Anssi Hannula wrote:
>> On Monday 23 August 2010 12:31:32 Tejun Heo wrote:
>>> Hello,
>>>
>>> On 08/22/2010 11:10 PM, Anssi Hannula wrote:
>>>> 22:52:18 : ata6: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xe frozen
>>>> 22:52:18 : ata6: irq_stat 0x00400040, connection status changed
>>>> 22:52:18 : ata6: SError: { RecovComm PHYRdyChg CommWake DevExch }
>>>> 22:52:18 : ata6: hard resetting link
>>>> 22:52:28 : ata6: softreset failed (device not ready)
>>>> 22:52:28 : ata6: hard resetting link
>>>> 22:52:38 : ata6: softreset failed (device not ready)
>>>> 22:52:38 : ata6: hard resetting link
>>>> 22:52:49 : ata6: link is slow to respond, please be patient (ready=0)
>>>> 22:53:13 : ata6: softreset failed (device not ready)
>>>> 22:53:13 : ata6: limiting SATA link speed to 1.5 Gbps
>>>> 22:53:13 : ata6: hard resetting link
>>>> =====================
>>>> I disconnect the drive for a few moments, but nothing is output by
>>>> kernel. I reconnect it again, but again, nothing is output by the
>>>> kernel. I run: echo "- - -" >
>>>> /sys/devices/pci0000:00/0000:00:1f.2/host5/scsi_host/host5/scan
>>>> However, it appeared stuck and still no messages in the kernel log, so
>>>> I disconnected the device again. Still nothing is output, and the
>>>> following messages started to be output, indicating that the process
>>>> had become stuck:
>>>
>>> Looks like EH got stuck somehow. Maybe the timeout calculation is
>>> wrong?
>>> Can you please trigger sysrq-t while the system is stuck and
>>> post the result?
>>
>> Ok, here's the output. And the system is not stuck, just the bash process that
>> is writing to 'scan' file.
>>
>> In this occasion, the hard reset had been stuck for some 16 hours, and it is
>> on ata5 (scsi4):
>
> Does the following patch fix the problem?
>
> Thanks.
>
> diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> index 666850d..68dc678 100644
> --- a/drivers/ata/libahci.c
> +++ b/drivers/ata/libahci.c
> @@ -1326,7 +1326,7 @@ int ahci_do_softreset(struct ata_link *link, unsigned int *class,
>  	/* issue the first D2H Register FIS */
>  	msecs = 0;
>  	now = jiffies;
> -	if (time_after(now, deadline))
> +	if (time_after(deadline, now))
>  		msecs = jiffies_to_msecs(deadline - now);
>
>  	tf.ctl |= ATA_SRST;
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ide" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
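
For anyone reading the one-line patch without the kernel tree at hand, here is a rough userspace sketch of why the argument order matters. It is not the libahci code: time_after() below is a simplified stand-in for the kernel macro (without its type checking), plain tick counts replace jiffies, and remaining_buggy() and remaining_fixed() are hypothetical helper names made up for this illustration.

```c
#include <assert.h>
#include <limits.h>

/* Simplified stand-in for the kernel macro: true when a is later than b.
 * The signed cast keeps the comparison valid across counter wraparound. */
#define time_after(a, b) ((long)((b) - (a)) < 0)

/* Hypothetical helper using the pre-patch condition: the subtraction only
 * runs once the deadline has already passed, so the unsigned result wraps
 * to a huge value; before the deadline the timeout stays 0. */
static unsigned long remaining_buggy(unsigned long now, unsigned long deadline)
{
	unsigned long left = 0;

	if (time_after(now, deadline))
		left = deadline - now;
	return left;
}

/* Hypothetical helper using the patched condition: the subtraction only
 * runs while the deadline is still in the future. */
static unsigned long remaining_fixed(unsigned long now, unsigned long deadline)
{
	unsigned long left = 0;

	if (time_after(deadline, now))
		left = deadline - now;
	return left;
}

/* Small self-check covering both sides of the deadline. */
static void check_examples(void)
{
	/* Deadline 150 ticks ahead. */
	assert(remaining_buggy(100, 250) == 0);
	assert(remaining_fixed(100, 250) == 150);

	/* Deadline passed 50 ticks ago. */
	assert(remaining_buggy(300, 250) == ULONG_MAX - 49);
	assert(remaining_fixed(300, 250) == 0);
}
```

With the old condition the computed timeout is either 0 or a wrapped value, never the time actually left until the deadline, which fits the "timeout calculation is wrong" guess quoted above.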