Re: Problem w/ hotplug on sata_sil24 w/ PMP (sil3726)

Forgot to make it plain text, sorry for the spam.

Gwendal.

On Wed, Oct 5, 2011 at 11:05 PM, Gwendal Grignou <gwendal@xxxxxxxxxx> wrote:
>
> I forgot that Gmail is not great for sending patches; I used git send-email instead.
> Gwendal.
>
> On Wed, Oct 5, 2011 at 10:48 PM, Gwendal Grignou <gwendal@xxxxxxxxxx> wrote:
>>
>> I think I know what is going on. At least one of your disks is slow to
>> spin up. Due to a bug/feature in the Silicon Image disk controller and PMP,
>> at bring-up we cannot issue a SOFT_RESET, wait for the disk to
>> spin up, and then continue.
>> That is why we set ATA_LFLAG_NO_SRST in sata_pmp_quirks().
>> So what happens is that we go into a function that issues IDENTIFY, but it
>> fails because the disk is not ready [it is spinning up], so we retry,
>> 3 times.
>>
>> From the first hard reset (12888.470385) to the time you got the final
>> error (12901.397305): ~12.9s
>> In the second case, your controller can send SOFT_RESET and wait for
>> the device to respond.
>> Time for the disk to spin up:
>> 28010.630028 - 27997.097116 ~ 13.5s
>> As you can see, you are borderline with the PMP, but the controller
>> did not "wait" long enough in the first case.
>> Given that the spin-up time varies with the drive, its age, the time since
>> the last spin-up..., it may work one day and fail the next.
>> To work around the problem, I have a patch that consists of allowing
>> the Silicon Image controller to send a reset, but if it fails, we wait
>> for a fixed amount of time and retry. This is not very nice; it is
>> better design to wait for an event than to wait a fixed amount of time.
>> You may have to change the value of ATA_LFLAG_WAIT_SRST to the first bit
>> available in your kernel version.
>>
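The two durations above come straight from the bracketed timestamps in the kernel log. A minimal sketch of that arithmetic in C, assuming each log line starts with a "[seconds.microseconds]" prefix as in the logs quoted in this thread; the function names log_timestamp() and log_elapsed() are made up for illustration:

```c
#include <stdio.h>

/* Parse the "[seconds.micros]" prefix of a kernel log line.
 * Returns the timestamp in seconds, or -1.0 if the prefix is missing. */
double log_timestamp(const char *line)
{
    double ts;

    if (sscanf(line, " [%lf]", &ts) != 1)
        return -1.0;
    return ts;
}

/* Seconds elapsed between two log lines, or -1.0 if either fails to parse. */
double log_elapsed(const char *first, const char *last)
{
    double a = log_timestamp(first);
    double b = log_timestamp(last);

    if (a < 0.0 || b < 0.0)
        return -1.0;
    return b - a;
}
```

Feeding it the "hard resetting link" line at 12888.470385 and the "failed to recover link" line at 12901.397305 gives roughly 12.9 seconds, which matches the figure worked out by hand.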
>> Can you try with the following patch?
>>
>> Thanks,
>> Gwendal.
>>
>> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
>> index 228740f..b98b02d 100644
>> --- a/drivers/ata/libata-eh.c
>> +++ b/drivers/ata/libata-eh.c
>> @@ -2798,7 +2798,14 @@ int ata_eh_reset(struct ata_link *link, int classify,
>>      sata_scr_read(link, SCR_STATUS, &sstatus))
>>   rc = -ERESTART;
>>
>> - if (rc == -ERESTART || try >= max_tries)
>> + if (try >= max_tries)
>> + goto out;
>> +
>> + /* Some PMP will not serve SRST until the disk is spunup,
>> + * if the controller can not wait for the PMP to acknowledge the frame,
>> + * wait here */
>> + if (rc == -ERESTART &&
>> +    !((lflags & ATA_LFLAG_WAIT_SRST) && (reset == softreset)))
>>   goto out;
>>
>>   now = jiffies;
>> @@ -2813,6 +2820,8 @@ int ata_eh_reset(struct ata_link *link, int classify,
>>   delta = schedule_timeout_uninterruptible(delta);
>>   }
>>
>> + if (rc == -ERESTART)
>> + goto out;
>>   if (try == max_tries - 1) {
>>   sata_down_spd_limit(link, 0);
>>   if (slave)
>> diff --git a/drivers/ata/libata-pmp.c b/drivers/ata/libata-pmp.c
>> index 00305f4..d21ad7d 100644
>> --- a/drivers/ata/libata-pmp.c
>> +++ b/drivers/ata/libata-pmp.c
>> @@ -325,13 +351,11 @@ static void sata_pmp_quirks(struct ata_port *ap)
>>   if (vendor == 0x1095 && devid == 0x3726) {
>>   /* sil3726 quirks */
>>   ata_for_each_link(link, ap, EDGE) {
>> - /* Class code report is unreliable and SRST
>> - * times out under certain configurations.
>> - */
>> + /* Class code report is unreliable */
>> + /* PMP does not forward SRST until the drive spins up */
>>   if (link->pmp < 5)
>> - link->flags |= ATA_LFLAG_NO_SRST |
>> -       ATA_LFLAG_ASSUME_ATA;
>> -
>> + link->flags |= ATA_LFLAG_ASSUME_ATA |
>> +       ATA_LFLAG_WAIT_SRST;
>>   /* port 5 is for SEMB device and it doesn't like SRST */
>>   if (link->pmp == 5)
>>   link->flags |= ATA_LFLAG_NO_SRST |
>> diff --git a/include/linux/libata.h b/include/linux/libata.h
>> index b2f2003..3a18caa 100644
>> --- a/include/linux/libata.h
>> +++ b/include/linux/libata.h
>> @@ -172,6 +172,7 @@ enum {
>>   ATA_LFLAG_NO_RETRY = (1 << 5), /* don't retry this link */
>>   ATA_LFLAG_DISABLED = (1 << 6), /* link is disabled */
>>   ATA_LFLAG_SW_ACTIVITY = (1 << 7), /* keep activity stats */
>> + ATA_LFLAG_WAIT_SRST = (1 << 8), /* add delay when SRST fails */
>>
>>   /* struct ata_port flags */
>>   ATA_FLAG_SLAVE_POSS = (1 << 0), /* host supports slave dev */
>>
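For readers who do not want to trace the diff, the idea of "if the soft reset fails, wait a fixed time and retry" can be modelled in a few lines of userspace C. This is only a toy sketch of the idea, not the control flow of ata_eh_reset(): struct fake_disk, fake_softreset(), reset_with_fixed_wait() and LFLAG_WAIT_SRST are hypothetical names, the clock is simulated, and the no-flag branch is simplified to "give up on the first failure".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the link flag added by the patch. */
#define LFLAG_WAIT_SRST (1u << 8)

/* Toy model of a drive behind the PMP: it answers a soft reset only
 * once the simulated clock has reached its spin-up time. */
struct fake_disk {
    double spinup_s;  /* time the drive needs to spin up, in seconds */
    double now_s;     /* simulated clock, in seconds */
};

static bool fake_softreset(const struct fake_disk *d)
{
    return d->now_s >= d->spinup_s;
}

/* Returns the zero-based attempt on which the reset succeeded, or -1. */
int reset_with_fixed_wait(struct fake_disk *d, unsigned int lflags,
                          int max_tries, double wait_s)
{
    assert(max_tries > 0);

    for (int attempt = 0; attempt < max_tries; attempt++) {
        if (fake_softreset(d))
            return attempt;
        /* Without the flag, give up on the first failure. */
        if (!(lflags & LFLAG_WAIT_SRST))
            return -1;
        /* Fixed delay before the next attempt; real code would sleep here. */
        d->now_s += wait_s;
    }
    return -1;
}
```

With a drive that needs 13.5 s and an illustrative 5 s wait, four attempts are enough but three are not. That is the weakness of a fixed wait: a slightly slower drive, or a slightly shorter delay, and the link is given up on again.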
>>
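On the "first bit available" remark: if bit 8 is already taken by another ATA_LFLAG_* value in the tree you are patching, the new flag has to move. A small sketch of finding the lowest unused bit, assuming you have OR-ed together the link flags already defined in your include/linux/libata.h; first_free_bit() is a hypothetical helper, and the mask in the note below is an example, not the real contents of any kernel version:

```c
#include <limits.h>

/* Return the index of the lowest bit not set in `used`,
 * or -1 if every bit of an unsigned int is already taken. */
int first_free_bit(unsigned int used)
{
    for (int bit = 0; bit < (int)(sizeof(used) * CHAR_BIT); bit++) {
        if (!(used & (1u << bit)))
            return bit;
    }
    return -1;
}
```

If bits 0 through 7 were all in use (mask 0xFF), this returns 8, which is the value the patch picks.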
>> On Fri, Sep 30, 2011 at 2:54 PM, Mike I <mihrcke@xxxxxxxxx> wrote:
>> >
>> > tj <at> kernel.org <tj <at> kernel.org> writes:
>> >
>> > >
>> > > Hello,
>> > >
>> > > How did that go?
>> > >
>> > > Thanks.
>> > >
>> >
>> > Like Derry who started this thread, I too had seen an old thread from
>> > October/November 2008 with what appeared to be no resolution to this problem.
>> > Now I have found this thread, again with no apparent resolution.
>> >
>> > I'm currently running Ubuntu 10.04 (lucid), kernel 2.6.32-33-generic.  I have no
>> > experience with applying these git patches, and my searching to figure out how
>> > it works has not helped.
>> >
>> > I'm using an Addonics eSATA PCI-X controller with the SiI3124 chipset, and I
>> > have an Addonics PM in an external enclosure, with a 5 bay/tray DAS.  Some of
>> > my drives give me this problem: (this occurs for me with pretty much ALL
>> > Samsung hard drives)
>> > [12888.470308] ata9.01: exception Emask 0x10 SAct 0x0 SErr 0x4050000 action 0xf
>> > [12888.470313] ata9.01: SError: { PHYRdyChg CommWake DevExch }
>> > [12888.470385] ata9.01: hard resetting link
>> > [12889.211597] ata9.01: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
>> > [12889.211686] ata9.01: failed to IDENTIFY (I/O error, err_mask=0x11)
>> > [12889.211692] ata9.15: hard resetting link
>> > [12891.430086] ata9.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
>> > [12891.430397] ata9.00: hard resetting link
>> > [12891.780786] ata9.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
>> > [12894.211103] ata9.01: hard resetting link
>> > [12894.560424] ata9.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> > [12894.560466] ata9.02: hard resetting link
>> > [12894.914176] ata9.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> > [12894.914222] ata9.03: hard resetting link
>> > [12895.264141] ata9.03: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> > [12895.264169] ata9.04: hard resetting link
>> > [12895.612930] ata9.04: SATA link down (SStatus 0 SControl 320)
>> > [12895.613007] ata9.05: hard resetting link
>> > [12895.964143] ata9.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
>> > [12896.065908] ata9.00: configured for UDMA/100
>> > [12896.065970] ata9.01: failed to IDENTIFY (I/O error, err_mask=0x11)
>> > [12896.065977] ata9.15: hard resetting link
>> > [12898.283804] ata9.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
>> > [12898.284128] ata9.00: hard resetting link
>> > [12898.634174] ata9.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
>> > [12899.562524] ata9.01: hard resetting link
>> > [12899.914147] ata9.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> > [12899.914180] ata9.02: hard resetting link
>> > [12900.261682] ata9.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> > [12900.261724] ata9.03: hard resetting link
>> > [12900.610413] ata9.03: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> > [12900.961283] ata9.05: hard resetting link
>> > [12901.310385] ata9.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
>> > [12901.397241] ata9.00: configured for UDMA/100
>> > [12901.397300] ata9.01: failed to IDENTIFY (I/O error, err_mask=0x11)
>> > [12901.397305] ata9.01: failed to recover link after 3 tries, disabling
>> > [12901.397311] ata9.15: hard resetting link
>> > [12903.613694] ata9.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
>> > [12903.960564] ata9.00: hard resetting link
>> > [12904.311125] ata9.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
>> > [12905.260154] ata9.02: hard resetting link
>> > [12905.602929] ata9.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> > [12905.611319] ata9.03: hard resetting link
>> > [12905.962555] ata9.03: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> > [12905.962592] ata9.04: hard resetting link
>> > [12906.312931] ata9.04: SATA link down (SStatus 0 SControl 320)
>> > [12906.313004] ata9.05: hard resetting link
>> > [12906.660409] ata9.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
>> > [12906.753619] ata9.00: configured for UDMA/100
>> > [12906.766586] ata9.02: configured for UDMA/100
>> > [12906.771917] ata9.03: configured for UDMA/100
>> > [12907.121462] ata9: EH complete
>> >
>> > If I hot plug the same drive using a port directly off my mobo (no PM in the
>> > mix), I get this result (drive connects/mounts/works):
>> > [27997.097104] ata5: exception Emask 0x10 SAct 0x0 SErr 0x4050000 action 0xe frozen
>> > [27997.097108] ata5: irq_stat 0x00400040, connection status changed
>> > [27997.097111] ata5: SError: { PHYRdyChg CommWake DevExch }
>> > [27997.097116] ata5: hard resetting link
>> > [28007.147622] ata5: softreset failed (device not ready)
>> > [28007.147627] ata5: hard resetting link
>> > [28010.630028] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> > [28010.748595] ata5.00: ATA-7: SAMSUNG HD154UI, 1AG01118, max UDMA7
>> > [28010.748599] ata5.00: 2930277168 sectors, multi 0: LBA48 NCQ (depth 31/32)
>> > [28010.755227] ata5.00: configured for UDMA/133
>> > [28010.755237] ata5: EH complete
>> > [28010.756338] scsi 4:0:0:0: Direct-Access     ATA      SAMSUNG HD154UI  1AG0 PQ: 0 ANSI: 5
>> > [28010.756475] sd 4:0:0:0: Attached scsi generic sg10 type 0
>> > [28010.756572] sd 4:0:0:0: [sdj] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
>> > [28010.756613] sd 4:0:0:0: [sdj] Write Protect is off
>> > [28010.756616] sd 4:0:0:0: [sdj] Mode Sense: 00 3a 00 00
>> > [28010.756636] sd 4:0:0:0: [sdj] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> > [28010.756760]  sdj: sdj1
>> > [28010.816161] sd 4:0:0:0: [sdj] Attached SCSI disk
>> >
>> > I've been using Ubuntu for a few years now, and have been living with the
>> > problem...working around it with USB docking stations and such.  But, I'd
>> > really hope to see/find this problem worked out.
>> >
>> > Thoughts/tips/suggestions?  Since I'm pretty much a novice when it comes to
>> > patching, a link to a guide for git patching would be appreciated too.
>> >
>> > Thank You,
>> > Mike
>> >
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe linux-ide" in
>> > the body of a message to majordomo@xxxxxxxxxxxxxxx
>> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>