Hi,

Thank you for reporting this.

On 23-04-18 03:03, Kevin Shanahan wrote:
> Hi,
>
> After upgrading kernel from 4.15.9-1 to 4.16.3-1 (Arch Linux) my
> router started responding very slowly. These messages were repeatedly
> showing up in the logs:
>
> Apr 23 10:21:43 link kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x50000 action 0x6 frozen
> Apr 23 10:21:43 link kernel: ata1: SError: { PHYRdyChg CommWake }
> Apr 23 10:21:43 link kernel: ata1.00: failed command: WRITE DMA
> Apr 23 10:21:43 link kernel: ata1.00: cmd ca/00:08:60:5d:cd/00:00:00:00:00/e1 tag 9 dma 4096 out
>                                       res 50/01:01:01:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> Apr 23 10:21:43 link kernel: ata1.00: status: { DRDY }
> Apr 23 10:21:43 link kernel: ata1.00: error: { AMNF }
> Apr 23 10:21:43 link kernel: ata1: hard resetting link
> Apr 23 10:21:43 link kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> Apr 23 10:21:43 link kernel: ata1.00: configured for UDMA/133
> Apr 23 10:21:43 link kernel: ata1: EH complete
>
> I noticed that the SATA LPM states had now been enabled, so I tried
> changing from 'med_power_with_dipm' to 'medium_power' and the problem
> went away:
>
>   echo medium_power > /sys/class/scsi_host/host0/link_power_management_policy
>
> Perhaps there is something about my combination of controller/drive
> that is not compatible?

I guess so. I'm somewhat surprised about this, because Samsung SSDs
tend to be well behaved. But this is an mSATA version, which may have
some firmware differences from the regular 2.5" models, and the PM830
SSD has many OEM firmware customizations.

So based on that I think a narrow quirk targeting your specific
firmware version is the best solution for now.

I've attached a patch for this. Can you build an Arch kernel with that
patch added and see if that fixes things without you needing to
manually change anything?

/sys/class/scsi_host/host0/link_power_management_policy
should now default to max_performance for your SSD. Note that there
are almost no power savings when going from max_performance to
medium_power, so we simply disable LPM altogether on models which
have issues with med_power_with_dipm.

May I ask what motherboard your router is using?

Regards,

Hans

> # hdparm -i /dev/sda
>
> /dev/sda:
>
>  Model=SAMSUNG MZMPC128HBFU-000MV, FwRev=CXM14M1Q, SerialNo=S19FNYAD394414
>  Config={ Fixed }
>  RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=0
>  BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=16
>  CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=250069680
>  IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
>  PIO modes:  pio0 pio1 pio2 pio3 pio4
>  DMA modes:  mdma0 mdma1 mdma2
>  UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
>  AdvancedPM=no WriteCache=enabled
>  Drive conforms to: unknown:  ATA/ATAPI-2,3,4,5,6,7
>
> # cat /proc/cpuinfo | grep "model name"
> model name : Intel(R) Core(TM) i5-4300U CPU @ 1.90GHz
> model name : Intel(R) Core(TM) i5-4300U CPU @ 1.90GHz
> model name : Intel(R) Core(TM) i5-4300U CPU @ 1.90GHz
> model name : Intel(R) Core(TM) i5-4300U CPU @ 1.90GHz
>
> # cat /sys/class/scsi_device/0\:0\:0\:0/device/model
> SAMSUNG MZMPC128
>
> Regards,
> Kevin Shanahan.

From 7f63aa54bf722b0585c5521d4728279d3d8fa40f Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede@xxxxxxxxxx>
Date: Mon, 23 Apr 2018 09:27:28 +0200
Subject: [PATCH] libata: Apply NOLPM quirk for SAMSUNG MZMPC128HBFU-000MV SSD

Kevin Shanahan reports the following repeating errors when using LPM,
causing long delays accessing the disk:

  Apr 23 10:21:43 link kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x50000 action 0x6 frozen
  Apr 23 10:21:43 link kernel: ata1: SError: { PHYRdyChg CommWake }
  Apr 23 10:21:43 link kernel: ata1.00: failed command: WRITE DMA
  Apr 23 10:21:43 link kernel: ata1.00: cmd ca/00:08:60:5d:cd/00:00:00:00:00/e1 tag 9 dma 4096 out
                                        res 50/01:01:01:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
  Apr 23 10:21:43 link kernel: ata1.00: status: { DRDY }
  Apr 23 10:21:43 link kernel: ata1.00: error: { AMNF }
  Apr 23 10:21:43 link kernel: ata1: hard resetting link
  Apr 23 10:21:43 link kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
  Apr 23 10:21:43 link kernel: ata1.00: configured for UDMA/133
  Apr 23 10:21:43 link kernel: ata1: EH complete

These go away when switching from med_power_with_dipm to medium_power.

This is somewhat weird as the PM830 datasheet explicitly mentions DIPM
being supported and the idle power-consumption is specified with DIPM
enabled.

There are many OEM customized firmware versions for the PM830, so for now
let's assume this is firmware version specific and blacklist LPM based on
the firmware version.

Cc: Kevin Shanahan <kevin@xxxxxxxxxxxxxx>
Reported-by: Kevin Shanahan <kevin@xxxxxxxxxxxxxx>
Signed-off-by: Hans de Goede <hdegoede@xxxxxxxxxx>
---
 drivers/ata/libata-core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 8bc71ca61e7f..6e400ff2b5db 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4549,6 +4549,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
 						ATA_HORKAGE_ZERO_AFTER_TRIM |
 						ATA_HORKAGE_NOLPM, },
 
+	/* This specific Samsung model/firmware-rev does not handle LPM well */
+	{ "SAMSUNG MZMPC128HBFU-000MV", "CXM14M1Q", ATA_HORKAGE_NOLPM, },
+
 	/* devices that don't properly handle queued TRIM commands */
 	{ "Micron_M500_*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
 						ATA_HORKAGE_ZERO_AFTER_TRIM, },
-- 
2.17.0
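
To illustrate why the entry in the patch above is a narrow quirk, here
is a small user-space sketch of a model/firmware lookup table. This is
not the libata code: every sketch_* name and flag value is hypothetical,
and POSIX fnmatch() is used as a stand-in for the kernel's own pattern
matching. It assumes, as the "Micron_M500_*" / NULL entry in the diff
context suggests, that the model field is a pattern and that a NULL
revision means "any firmware".

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical flag values for illustration only; these are NOT the
 * kernel's ATA_HORKAGE_* constants. */
#define SKETCH_HORKAGE_NOLPM        (1u << 0)
#define SKETCH_HORKAGE_NO_NCQ_TRIM  (1u << 1)

struct sketch_blacklist_entry {
	const char *model_num;   /* pattern for the drive's model string */
	const char *model_rev;   /* pattern for firmware rev, NULL = any */
	unsigned int horkage;    /* quirk flags applied on a match */
};

static const struct sketch_blacklist_entry sketch_blacklist[] = {
	/* Narrow quirk: one model, one firmware revision */
	{ "SAMSUNG MZMPC128HBFU-000MV", "CXM14M1Q", SKETCH_HORKAGE_NOLPM },
	/* Broad quirk: a model family, any firmware revision */
	{ "Micron_M500_*", NULL, SKETCH_HORKAGE_NO_NCQ_TRIM },
	{ NULL, NULL, 0 }        /* table terminator */
};

/* Return the quirk flags of the first matching entry, or 0 if none. */
static unsigned int sketch_lookup_horkage(const char *model,
					  const char *fw_rev)
{
	const struct sketch_blacklist_entry *e;

	assert(model != NULL && fw_rev != NULL);

	for (e = sketch_blacklist; e->model_num != NULL; e++) {
		if (fnmatch(e->model_num, model, 0) != 0)
			continue;        /* model does not match */
		if (e->model_rev == NULL ||
		    fnmatch(e->model_rev, fw_rev, 0) == 0)
			return e->horkage;
	}
	return 0;
}
```

With a table like this, the same Samsung model reporting any firmware
revision other than CXM14M1Q gets no quirk and keeps the default LPM
policy, which is the intent of matching on the firmware version.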