Hi Niklas, Eric,

On 11-Mar-25 3:14 PM, Niklas Cassel wrote:
> Hello Hans, Eric,
>
> On Mon, Mar 10, 2025 at 09:12:13PM +0100, Hans de Goede wrote:
>>
>> I agree with you that this is a BIOS bug of the motherboard in question
>> and/or a bad interaction between the ATI SATA controller and Samsung SSD
>> 870* models. Note that given the age of the motherboard there are likely
>> not going to be any BIOS updates fixing this though.
>
> Looking at the number of quirks for some of the ATI SB7x0/SB8x0/SB9x0 SATA
> controllers, they really look like something special (not in a good way):
> https://github.com/torvalds/linux/blob/v6.14-rc6/drivers/ata/ahci.c#L236-L244
>
> -Ignore SError internal
> -No MSI
> -Max 255 sectors
> -Broken 64-bit DMA
> -Retry SRST (software reset)
>
> And that is even without the weird "disable NCQ but only for Samsung SSD
> 8xx drives" quirk when using these ATI controllers.
>
>
> What does bother me is that we don't know if it is this specific mobo/BIOS:
> Manufacturer: ASUSTeK COMPUTER INC.
> Product Name: M5A99X EVO R2.0
> Version: Rev 1.xx
>
> M5A99X EVO R2.0 BIOS 2501
> Version 2501
> 3.06 MB
> 2014/05/14
>
>
> that should have a NOLPM quirk, like we do for specific BIOSes:
> https://github.com/torvalds/linux/blob/v6.14-rc6/drivers/ata/ahci.c#L1402-L1439

That seems to be a Lenovo-only thing though, and with Intel chipsets.

> Or if it this ATI SATA controller that is always broken when it comes
> to LPM, regardless of the drive, or if it is only Samsung drives.

I'm pretty sure we can assume this will happen on all ATI SATA controllers.
The new LPM default is pretty recent and these boards are getting old,
so they likely do not have many users running distros which ship
cutting-edge kernels.

I do agree with you that it is a question whether this is another bad
interaction with Samsung SATA SSDs, or a general ATI SATA controller
problem, but see below.
> Considering the dmesg comparing cold boot, the Maxtor drive and the
> ASUS ATAPI device seems to be recognized correctly.
>
> Eric, could you please run:
> $ sudo hdparm -I /dev/sdX | grep "interface power management"
>
> on both your Samsung and Maxtor drive?
> (A star to the left of feature means that the feature is enabled)
>
>
>
> One guess... perhaps it could be Device Initiated PM that is broken with
> these controllers? (Even though the controller does claim to support it.)
>
> Eric, could you please try this patch:
>
> diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
> index f813dbdc2346..ca690fde8842 100644
> --- a/drivers/ata/ahci.c
> +++ b/drivers/ata/ahci.c
> @@ -244,7 +244,7 @@ static const struct ata_port_info ahci_port_info[] = {
>         },
>         [board_ahci_sb700] = {  /* for SB700 and SB800 */
>                 AHCI_HFLAGS     (AHCI_HFLAG_IGN_SERR_INTERNAL),
> -               .flags          = AHCI_FLAG_COMMON,
> +               .flags          = AHCI_FLAG_COMMON | ATA_FLAG_NO_DIPM,
>                 .pio_mask       = ATA_PIO4,
>                 .udma_mask      = ATA_UDMA6,
>                 .port_ops       = &ahci_pmp_retry_srst_ops,
>
>
>
> Normally, I do think that we need more reports, to see if it is just
> this specific BIOS, or all the ATI SB7x0/SB8x0/SB9x0 SATA controllers
> that are broken...
>
> ...but, considering how many quirks these ATI controllers have already...

Right, in the meantime Eric has reported back that the above patch fixes this.
Thank you for testing this, Eric.

One reason why the narrow ATA_QUIRK_NO_NCQ_ON_ATI quirk was introduced is
that disabling NCQ has a severe performance impact on SSDs, so we did not
want to do this for all ATI controllers, or for all Samsung drives.

Disabling DIPM is different: given that until the recent LPM default change
we did not use DIPM on ATI chipsets, the above fix IMHO is a good one, and
it even keeps the rest of the LPM power savings.
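To illustrate why a NO_DIPM-style port flag is less drastic than a NO_LPM-style one, here is a minimal, self-contained sketch in C. It is only a simplified model and not the libata implementation: the `demo_*` names, the `DEMO_FLAG_*` bit values and the decision function are all hypothetical stand-ins, and the real driver logic has more inputs (policy levels, device quirks, controller capabilities).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins for per-port flags. The names and bit values are
 * illustrative only; they are NOT the kernel's definitions.
 */
enum demo_port_flags {
    DEMO_FLAG_NO_LPM  = 1u << 0,  /* no link power management at all */
    DEMO_FLAG_NO_DIPM = 1u << 1,  /* no device-initiated power management */
};

struct demo_lpm_result {
    bool hipm;  /* host-initiated link power transitions allowed */
    bool dipm;  /* device-initiated link power transitions allowed */
};

/*
 * Decide which power management mechanisms remain usable on a port.
 * Assumption: a NO_LPM-style flag disables everything, while a
 * NO_DIPM-style flag only removes the device-initiated part.
 */
static struct demo_lpm_result demo_lpm_decide(unsigned int port_flags,
                                              bool policy_wants_dipm,
                                              bool device_supports_dipm)
{
    struct demo_lpm_result r = { false, false };

    if (port_flags & DEMO_FLAG_NO_LPM)
        return r;  /* nothing is allowed */

    r.hipm = true;
    r.dipm = policy_wants_dipm && device_supports_dipm &&
             !(port_flags & DEMO_FLAG_NO_DIPM);
    return r;
}
```

In this model, calling `demo_lpm_decide(DEMO_FLAG_NO_DIPM, true, true)` leaves `hipm` set and clears only `dipm`, whereas passing `DEMO_FLAG_NO_LPM` clears both, which is the trade-off discussed above.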
> ...and the fact that the one (Dieter) who reported that his Samsung SSD 870
> QVO could enter deeper sleep states just fine was running an Intel AHCI
> controller (with the same FW version as Eric), I would be open to a patch
> that sets ATA_FLAG_NO_LPM for all these ATI controllers.

Right, I think it is safe to assume that this is not a Samsung drive problem;
it is an ATI controller problem. The only question is whether this only
impacts ATI <-> Samsung SSD combinations or whether it is a general issue
with ATI controllers.

But given that DIPM was not enabled on these controllers by default anyway,
combined with the age of these motherboards (*), I believe that the above
patch is a good compromise to fix the regression without needing to wait
for more data.

Regards,

Hans


*) And there thus being fewer users, which makes getting more data hard.
It also means that not having DIPM will impact only the relatively few
remaining users.