Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)

On Tue, Jan 30, 2007 at 04:32:34PM +0900, Tejun Heo wrote:
> Hello, Gary.
> 
> Gary Hade wrote:
> >>> If they verify your fix (ie,
> >>> GoVault sometimes take more than 150ms to transmit the first D2H Reg FIs
> >>> after SRST), I'll push similar patch upstream.
> >> Thanks.  If you think that changes to increase the delays are
> >> the way to go (at least until we can find a better solution)
> >> I can provide patches.
> > 
> > Tejun, 
> > I haven't heard anything from you on this so I'm including a delay
> > increase patch against 2.6.20-rc6 for the 'ata-piix' case below.  
> > I hope that you, Jeff, and others find this acceptable.
> 
> Sorry about being unresponsive.  The thing is that the change adds
> unnecessary 2 secs of delay to a lot of other normal device-not-present
> cases, so I was hesitant to ack the patch.  I'll give it more thoughts
> (and respond timely this time :-)

Thanks!  My followup was untimely so we're even. :-)

Some of my random thoughts:
There appears to be an invalid assumption that 0xFF status always
implies device-not-present.  The status register access restrictions
in ATA/ATAPI-7 V1 5.14.2 include the statement "The contents of this
register, except for BSY, shall be ignored when BSY is set to one."
which the code does not honor: 0xFF has BSY set, so its other bits
carry no information.  There is apparently past experience that 0xFF
status implies device-not-present for some controllers (the odd
clowns :) but I have no idea how common these are.  We obviously can't
get rid of the check, but since we cannot clear the read-only status
register and there appears to be no specification-dictated upper limit
on how long a software reset may take to complete, it seems we need to
wait long enough to support the slowest known device, which may be the
GoVault.
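
As a rough illustration of the idea in the paragraph above, here is a
minimal userspace sketch, not libata code.  It treats 0xFF as "still
busy" (BSY is set) while polling and only falls back to
device-not-present once the deadline has passed.  The names
wait_not_busy, read_status_fn, fake_dev and fake_read, and the enum
values, are all made up for this example; only the ATA_BUSY and ATA_DRQ
bit values match the real status register layout.

```c
#include <assert.h>
#include <stdint.h>

#define ATA_BUSY 0x80	/* BSY bit of the status register */
#define ATA_DRQ  0x08	/* DRQ bit of the status register */

enum wait_result { DEV_READY, DEV_TIMEOUT, DEV_NOT_PRESENT };

/* Hypothetical status reader; real code would read the port's status register. */
typedef uint8_t (*read_status_fn)(void *ctx);

/*
 * Poll every step_ms for up to timeout_ms.  While BSY is set the other
 * bits are ignored, so 0xFF is handled like any other busy status.  Only
 * after the deadline is a persistent 0xFF reported as not-present.
 */
static enum wait_result wait_not_busy(read_status_fn rd, void *ctx,
				      unsigned int timeout_ms,
				      unsigned int step_ms)
{
	unsigned int waited;
	uint8_t status;

	assert(step_ms > 0);
	for (waited = 0; ; waited += step_ms) {
		status = rd(ctx);
		if (!(status & ATA_BUSY))
			return DEV_READY;
		if (waited >= timeout_ms)
			break;
		/* real code would sleep for step_ms here */
	}
	return status == 0xFF ? DEV_NOT_PRESENT : DEV_TIMEOUT;
}

/* Simulated device: returns busy_val for busy_polls reads, then 0x50. */
struct fake_dev {
	unsigned int busy_polls;
	uint8_t busy_val;
};

static uint8_t fake_read(void *ctx)
{
	struct fake_dev *d = ctx;

	if (d->busy_polls > 0) {
		d->busy_polls--;
		return d->busy_val;
	}
	return 0x50;	/* DRDY | DSC with BSY clear */
}
```

With a simulated device that reports 0xFF for its first twelve reads, a
2000 ms limit polled every 100 ms ends in DEV_READY, while a 150 ms
limit polled every 10 ms against a device that stays at 0xFF ends in
DEV_NOT_PRESENT.  The cost is the one Tejun pointed out: a port with
nothing attached is only reported absent after the full timeout.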

> 
> > With respect to the 'ahci' case w/2.6.20-rc6 the GoVault device is 
> > useable following boot although the below messages are being logged 
> > during initialization.  Please let me know if you have any thoughts 
> > on this.  
> >   scsi1 : ahci
> >   ata2: softreset failed (port busy but CLO unavailable)
> >   ata2: softreset failed, retrying in 5 secs
> >   ata2: port is slow to respond, please be patient (Status 0x80)
> >   ata2: port failed to respond (30 secs, Status 0x80)
> >   ata2: COMRESET failed (device not ready)
> >   ata2: hardreset failed, retrying in 5 secs
> >   ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> >   ata2.00: ATAPI, max UDMA/66
> >   ata2.00: configured for UDMA/66
> 
> The above should have been fixed in 2.6.20-rc6.  Please test it.  It was
> caused by the ahci driver incorrectly clearing ahci CAP register and
> fixed recently.

I'm clearly seeing this with 2.6.20-rc6 but unlike the ata-piix
issue it does not appear to be dependent on the port to which the
device is attached.  I've been playing around with this today and
found that it could be solved by inserting a delay between the 
ahci_stop_engine() call and the BSY/DRQ check.

This change:
--- linux-2.6.20-rc6/drivers/ata/ahci.c.orig	2007-01-30 11:01:20.000000000 -0800
+++ linux-2.6.20-rc6/drivers/ata/ahci.c	2007-01-30 12:59:38.000000000 -0800
@@ -804,6 +804,19 @@ static int ahci_softreset(struct ata_por
 		goto fail_restart;
 	}
 
+	{
+		int delay;
+		u8 stat;
+		for (delay = 0; delay < 2000; delay+=100) {
+			if (!(ahci_check_status(ap) & (ATA_BUSY | ATA_DRQ)))
+				break;
+			msleep(100);
+			stat = ahci_check_status(ap);
+			ata_port_printk(ap, KERN_INFO, "delay=%d BSY=%d DRQ=%d\n",
+				delay, (stat & ATA_BUSY)?1:0, (stat & ATA_DRQ)?1:0);
+		}
+	}
+
 	/* check BUSY/DRQ, perform Command List Override if necessary */
 	if (ahci_check_status(ap) & (ATA_BUSY | ATA_DRQ)) {
 		rc = ahci_clo(ap);

Yielded this output both with and without the RDC inserted:
scsi1 : ahci
ata2: delay=0 BSY=1 DRQ=0
ata2: delay=100 BSY=1 DRQ=0
ata2: delay=200 BSY=1 DRQ=0
ata2: delay=300 BSY=1 DRQ=0
ata2: delay=400 BSY=1 DRQ=0
ata2: delay=500 BSY=1 DRQ=0
ata2: delay=600 BSY=1 DRQ=0
ata2: delay=700 BSY=1 DRQ=0
ata2: delay=800 BSY=1 DRQ=0
ata2: delay=900 BSY=1 DRQ=0
ata2: delay=1000 BSY=1 DRQ=0
ata2: delay=1100 BSY=1 DRQ=0
ata2: delay=1200 BSY=0 DRQ=0
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATAPI, max UDMA/66
ata2.00: configured for UDMA/66

So it appears that this driver may have a similar device slowness
issue: BSY did not clear until more than a second into the polling
loop.

Gary

-- 
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503  IBM T/L: 775-4503
garyhade@xxxxxxxxxx
http://www.ibm.com/linux/ltc
