So it is reasonable to add extra check in ahci_start_engine() without returning status of ST bit. If so, here is my patch --- libahci.c.orig 2010-12-08 10:42:48.383976763 -0800 +++ libahci.c 2010-12-08 10:45:17.495156944 -0800 @@ -542,6 +542,13 @@ { void __iomem *port_mmio = ahci_port_base(ap); u32 tmp; + u8 status = readl(port_mmio + PORT_TFDATA) & 0xFF; + + /* avoid race condition per spec (end of section 10.1.2) */ + if (status & (ATA_BUSY | ATA_DRQ) || + ahci_scr_read(&ap->link, SCR_STATUS, &tmp) || + (tmp & 0x0f) != 0x03) + return; /* start DMA */ tmp = readl(port_mmio + PORT_CMD); Thanks, Jian ________________________________________ From: Tejun Heo [tj@xxxxxxxxxx] Sent: Wednesday, December 08, 2010 10:24 AM To: Jian Peng Cc: Robert Hancock; linux-kernel@xxxxxxxxxxxxxxx; jgarzik@xxxxxxxxx; ide Subject: Re: questions regarding possible violation of AHCI spec in AHCI driver Hello, On 12/08/2010 07:14 PM, Jian Peng wrote: > The problem happened as follow: > > After power up, inside ahci_init_one(), it will call ahci_power_up() > to toggle PxCMD.SUD bit first, then HBA will send COMRESET to > device, and device will send first D2H FIS back. Here it will call > ahci_start_engine() to turn on PxCMD.ST to process command. In this > case, it may run into race condition that transaction triggered by > toggling PxCMD.SUD is not completed yet, and that is the reason why > extra check is required by spec to guarantee that HBA already > received FIS and in sane state. > > In most HBA, either staggered spin-up feature was not supported, or > time required for transaction is less than that between two function > calls, it may work. IMHO, this is a clear violation of spec, and not > robust against all HBA design. > > The major concern is that ahci_start_engine() is used widely in EH > and it does not return result to reflect whether ST bit was set or > not, this may cause trouble in some cases. I am working on verifying > those cases with different HBAs now. I see, so it's not that the controller actually failed but you observed the race condition. During EH, ahci_start_engine() is used rather liberally when the driver wants the controller in working condition. The assumption is that even if the driver tries to set ST with the incorrect condition, the controller wouldn't go crazy where it can't be restarted later, which _must_ be true as there's no atomic way to check the condition and set ST. The driver at the same time guarantees that if the whole initialization protocol succeeds, the last ahci_start_engine() is called after hardreset is sucessfully completed and thus all the prerequisites are met. So, yeap, the driver might set ST when the conditions are not met but that can't be avoided completely anyway so the controller shouldn't go completely bonkers for that (it's okay to fail in recoverable way) and the driver will do the last ST setting after all the conditions are met on the success path. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html