On 9/22/22 00:57, Niklas Cassel wrote: > Performing a revalidation on a AHCI controller supporting LPM, > while using a lpm mode of e.g. med_power_with_dip (hipm + dipm) or > medium_power (hipm), will currently always lead to a hard reset. > > The expected behavior is that a hard reset is only performed when > revalidate fails, because the properties of the drive has changed. > > A revalidate performed after e.g. a NCQ error, or such a simple thing > as disabling write-caching (hdparm -W 0 /dev/sda), should succeed on > the first try (and should therefore not cause the link to be reset). > > This unwarranted hard reset happens because ata_phys_link_offline() > returns true for a link that is in deep sleep. Thus the call to > ata_phys_link_offline() in ata_eh_revalidate_and_attach() will cause > the revalidation to fail, which causes ata_eh_handle_dev_fail() to be > called, which will set ehc->i.action |= ATA_EH_RESET, such that the > link is reset before retrying revalidation. > > When the link is reset, the link is reestablished, so when > ata_eh_revalidate_and_attach() is called the second time, directly > after the link has been reset, ata_phys_link_offline() will return > false, and the revalidation will succeed. > > Looking at "8.3.1.3 HBA Initiated" in the AHCI 1.3.1 specification, > it is clear the when host software writes a new command to memory, > by setting a bit in the PxCI/PxSACT HBA port registers, the HBA will > automatically bring back the link before sending out the Command FIS. > > However, simply reading a SCR (like ata_phys_link_offline() does), > will not cause the HBA to automatically bring back the link. > > As long as hipm is enabled, the HBA will put an idle link into deep > sleep. Avoid this needless hard reset on revalidation by temporarily > disabling hipm, by setting the LPM mode to ATA_LPM_MAX_POWER. > > After revalidation is complete, ata_eh_recover() will restore the link > policy by setting the LPM mode to ap->target_lpm_policy. > > Signed-off-by: Niklas Cassel <niklas.cassel@xxxxxxx> Applied to for-6.1. Thanks ! > --- > drivers/ata/libata-eh.c | 19 +++++++++++++++++++ > 1 file changed, 19 insertions(+) > > diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c > index 7c128c89b454..1f83ae8690ee 100644 > --- a/drivers/ata/libata-eh.c > +++ b/drivers/ata/libata-eh.c > @@ -151,6 +151,8 @@ ata_eh_cmd_timeout_table[ATA_EH_CMD_TIMEOUT_TABLE_SIZE] = { > #undef CMDS > > static void __ata_port_freeze(struct ata_port *ap); > +static int ata_eh_set_lpm(struct ata_link *link, enum ata_lpm_policy policy, > + struct ata_device **r_failed_dev); > #ifdef CONFIG_PM > static void ata_eh_handle_port_suspend(struct ata_port *ap); > static void ata_eh_handle_port_resume(struct ata_port *ap); > @@ -2940,6 +2942,23 @@ static int ata_eh_revalidate_and_attach(struct ata_link *link, > if ((action & ATA_EH_REVALIDATE) && ata_dev_enabled(dev)) { > WARN_ON(dev->class == ATA_DEV_PMP); > > + /* > + * The link may be in a deep sleep, wake it up. > + * > + * If the link is in deep sleep, ata_phys_link_offline() > + * will return true, causing the revalidation to fail, > + * which leads to a (potentially) needless hard reset. > + * > + * ata_eh_recover() will later restore the link policy > + * to ap->target_lpm_policy after revalidation is done. > + */ > + if (link->lpm_policy > ATA_LPM_MAX_POWER) { > + rc = ata_eh_set_lpm(link, ATA_LPM_MAX_POWER, > + r_failed_dev); > + if (rc) > + goto err; > + } > + > if (ata_phys_link_offline(ata_dev_phys_link(dev))) { > rc = -EIO; > goto err; -- Damien Le Moal Western Digital Research