Re: ahci port hangs while hard resetting link

It is a good fix. We experienced a similar issue (a bad drive passed
the hard reset but would not answer a soft reset). Without the patch
the machine would hang; with it, the error is detected. The problem
was introduced a long time ago, in commit
2cbb79ebbd4be07041368da5379a64f89f8ad518; it is present in 2.6.33, in
drivers/ata/ahci.c. I can send you the trivial patch for the stable
kernel if you want.
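
To illustrate why the inverted test matters, here is a minimal
userspace sketch. It assumes time_after(a, b) is true when a is later
than b, computed as a signed difference of unsigned long values, as the
kernel macro does. The two helper functions and selfcheck() are
hypothetical names made up for this example; they are not part of
libahci.c, and the sketch only models the tick arithmetic, not the
reset path itself.

```c
#include <assert.h>
#include <limits.h>

/* Userspace stand-in for the kernel macro: true when a is later than b. */
#define time_after(a, b) ((long)((b) - (a)) < 0)

/* Hypothetical helper: remaining ticks as computed before the patch. */
unsigned long remaining_before_patch(unsigned long now, unsigned long deadline)
{
	unsigned long ticks = 0;

	if (time_after(now, deadline))	/* true only once the deadline has passed */
		ticks = deadline - now;	/* unsigned underflow: enormous value */
	return ticks;
}

/* Hypothetical helper: remaining ticks as computed after the patch. */
unsigned long remaining_after_patch(unsigned long now, unsigned long deadline)
{
	unsigned long ticks = 0;

	if (time_after(deadline, now))	/* true while the deadline is still ahead */
		ticks = deadline - now;	/* ordinary positive remainder */
	return ticks;
}

/* Small self-check covering both cases. */
void selfcheck(void)
{
	/* Deadline 500 ticks ahead: old test yields 0, new test yields 500. */
	assert(remaining_before_patch(1000, 1500) == 0);
	assert(remaining_after_patch(1000, 1500) == 500);

	/* Deadline 100 ticks in the past: old test underflows, new test yields 0. */
	assert(remaining_before_patch(1000, 900) == ULONG_MAX - 99);
	assert(remaining_after_patch(1000, 900) == 0);
}
```

With the old comparison, a deadline that has already expired produces a
near-maximal remaining time instead of zero, which is consistent with a
reset that appears to wait forever.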

Gwendal.

On Tue, Aug 24, 2010 at 9:27 AM, Tejun Heo <tj@xxxxxxxxxx> wrote:
> On 08/23/2010 05:11 PM, Anssi Hannula wrote:
>> On Monday 23 August 2010 12:31:32 Tejun Heo wrote:
>>> Hello,
>>>
>>> On 08/22/2010 11:10 PM, Anssi Hannula wrote:
>>>> 22:52:18 : ata6: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xe frozen
>>>> 22:52:18 : ata6: irq_stat 0x00400040, connection status changed
>>>> 22:52:18 : ata6: SError: { RecovComm PHYRdyChg CommWake DevExch }
>>>> 22:52:18 : ata6: hard resetting link
>>>> 22:52:28 : ata6: softreset failed (device not ready)
>>>> 22:52:28 : ata6: hard resetting link
>>>> 22:52:38 : ata6: softreset failed (device not ready)
>>>> 22:52:38 : ata6: hard resetting link
>>>> 22:52:49 : ata6: link is slow to respond, please be patient (ready=0)
>>>> 22:53:13 : ata6: softreset failed (device not ready)
>>>> 22:53:13 : ata6: limiting SATA link speed to 1.5 Gbps
>>>> 22:53:13 : ata6: hard resetting link
>>>> =====================
>>>> I disconnect the drive for a few moments, but nothing is output by
>>>> the kernel. I reconnect it, but again nothing is output by the
>>>> kernel. I run: echo "- - -" >
>>>> /sys/devices/pci0000:00/0000:00:1f.2/host5/scsi_host/host5/scan
>>>> However, it appeared stuck and there were still no messages in the
>>>> kernel log, so I disconnected the device again. Still nothing was
>>>> output, and then the following messages started to appear,
>>>> indicating that the process had become stuck:
>>>
>>> Looks like EH got stuck somehow.  Maybe the timeout calculation is
>>> wrong?  Can you please trigger sysrq-t while the system is stuck and
>>> post the result?
>>
>> Ok, here's the output. And the system is not stuck, just the bash process that
>> is writing to the 'scan' file.
>>
>> On this occasion, the hard reset had been stuck for some 16 hours, and it is
>> on ata5 (scsi4):
>
> Does the following patch fix the problem?
>
> Thanks.
>
> diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> index 666850d..68dc678 100644
> --- a/drivers/ata/libahci.c
> +++ b/drivers/ata/libahci.c
> @@ -1326,7 +1326,7 @@ int ahci_do_softreset(struct ata_link *link, unsigned int *class,
>        /* issue the first D2H Register FIS */
>        msecs = 0;
>        now = jiffies;
> -       if (time_after(now, deadline))
> +       if (time_after(deadline, now))
>                msecs = jiffies_to_msecs(deadline - now);
>
>        tf.ctl |= ATA_SRST;
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ide" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

