On 03 February, 2022 - Damien Le Moal wrote: > On 2/3/22 17:32, Anton Lundin wrote: > > On 03 February, 2022 - Damien Le Moal wrote: > > > >> On 2022/02/02 22:54, Anton Lundin wrote: > >>> On 02 February, 2022 - Anton Lundin wrote: > >>> > >>>> On 02 February, 2022 - Damien Le Moal wrote: > >>>> > >>>>> On 2022/02/02 21:25, Anton Lundin wrote: > >>>>>> On 02 February, 2022 - Damien Le Moal wrote: > >>>>>> > >>>>>>> On 2/2/22 19:05, Anton Lundin wrote: > >>>>>>>> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls") > >>>>>>>> a read of ATA_LOG_DIRECTORY page was added. This caused the > >>>>>>>> SATADOM-ML 3ME to lock up. > >>>>>>>> > >>>>>>>> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log") > >>>>>>>> a flag was added to cache if a device supports this or not. > >>>>>>>> > >>>>>>>> This adds a blacklist entry which flags that these devices doesn't > >>>>>>>> support that call and shouldn't be issued that call. > >>>>>>>> > >>>>>>>> Cc: stable@xxxxxxxxxxxxxxx # v5.10+ > >>>>>>>> Signed-off-by: Anton Lundin <glance@xxxxxxxxxx> > >>>>>>>> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log") > >>>>>>> > >>>>>>> I do not think so. See below. > >>>>>>> > >>>>>>>> --- > >>>>>>>> drivers/ata/libata-core.c | 7 +++++++ > >>>>>>>> 1 file changed, 7 insertions(+) > >>>>>>>> > >>>>>>>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c > >>>>>>>> index 87d36b29ca5f..e024af9f33d0 100644 > >>>>>>>> --- a/drivers/ata/libata-core.c > >>>>>>>> +++ b/drivers/ata/libata-core.c > >>>>>>>> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = { > >>>>>>>> { "WDC WD3000JD-*", NULL, ATA_HORKAGE_WD_BROKEN_LPM }, > >>>>>>>> { "WDC WD3200JD-*", NULL, ATA_HORKAGE_WD_BROKEN_LPM }, > >>>>>>>> > >>>>>>>> + /* > >>>>>>>> + * This sata dom goes on a walkabout when it sees the > >>>>>>>> + * ATA_LOG_DIRECTORY read request so ensure we don't issue such a > >>>>>>>> + * request to these devices. > >>>>>>>> + */ > >>>>>>>> + { "SATADOM-ML 3ME", NULL, ATA_HORKAGE_NO_ID_DEV_LOG }, > >>>>>>> > >>>>>>> This flag only disables trying to access the identify device log page, > >>>>>>> it does *not* avoid access to the log directory log page in general. The > >>>>>>> log directory will still be consulted for other log pages beside the > >>>>>>> identify device log page, from any function that calls > >>>>>>> ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and > >>>>>>> ata_dev_config_ncq_non_data()) > >>>>>> > >>>>>> Non of those code paths are called for this device, probably due to some > >>>>>> other flag disqualifying them. > >>>>>> > >>>>>>> So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag > >>>>>>> and test for it in ata_log_supported(), completely preventing any access > >>>>>>> to the log directory page for this drive type. > >>>>>> > >>>>>> That was my first thought but then I found ATA_HORKAGE_NO_ID_DEV_LOG > >>>>>> which was in the calling path that actually triggered this issue. > >>>>>> > >>>>>> But, yes, I totally agree that's a more solid solution preventing this > >>>>>> kind of issue to crop up again. > >>>>>> > >>>>>>>> + > >>>>>>>> /* End Marker */ > >>>>>>>> { } > >>>>>>>> }; > >>>>>>> > >>>>>>> Note: if you need this fix sent to linux-stable, add "Cc: stable@..." > >>>>>>> and a Fixes tag. > >>>>>> > >>>>>> I'd think it's fitting to send it to linux-stable, because it prevents > >>>>>> those DOM's from working in v5.15.5+. > >>>>>> > >>>>>> Ok. I must have missed that part when I read submitting-patches doc. > >>>>>> > >>>>>> I'll rework and re-submit the patch. > >>>>> > >>>>> I sent you a draft patch. Please try it. > >>>> > >>>> Works like a charm. > >>>> > >>>>> Also, to confirm if the log directory log page is indeed the page that locks up > >>>>> the drive, can you try this command: > >>>>> > >>>>> sg_sat_read_gplog --dma --log=0 --page=0 --readonly > >>>> > >>>> # sg_sat_read_gplog --dma --log=0 --page=0 --readonly /dev/sda > >>>> ATA PASS-THROUGH (16), bad field in cdb > >>>> sg_sat_read_gplog failed: Illegal request > >>>> > >>>>> and > >>>>> > >>>>> sg_sat_read_gplog --log=0 --page=0 --readonly > >>>> > >>>> # sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda > >>>> 00 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 08 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 10 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 18 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 20 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 28 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 30 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 38 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 40 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 48 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 50 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 58 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 60 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 68 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 70 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 78 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 80 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 88 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 90 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> 98 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> a0 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> a8 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> b0 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> b8 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> c0 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> c8 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> d0 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> d8 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> e0 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> e8 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> f0 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> f8 0000 0000 0000 0000 0000 0000 0000 0000 .. .. .. .. .. .. .. .. > >>>> > >>>> Mind you, this with a patched kernel, if that affects anything. > >>> > >>> Without a patched kernel, (grabbed os with a 4.19.94 kernel) the command > >>> hangs for a while: > >>> > >>> # time sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda > >>> ATA PASS-THROUGH (16), bad field in cdb > >>> sg_sat_read_gplog failed: Illegal request > >>> > >>> real 0m28.337s > >>> user 0m0.000s > >>> sys 0m0.001s > >>> > >>> and logs the following in the kernel log: > >>> > >>> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) > >>> ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80) > >>> ata1.00: revalidation failed (errno=-5) > >>> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) > >>> ata1.00: configured for UDMA/133 > >>> > >>> > >>> But after that 30s walkabout and whatever the kernel does the device > >>> starts functioning again. > >> > >> The 30s "hang" is the default command timeout: your drive is not responding to > >> the DMA version of READ LOG EXT command. There are some drives out there like > >> that. So instead of completely disabling access to the log directory, we should > >> simply force the use of READ LOG EXT. And for that, there is the horkage flag > >> ATA_HORKAGE_NO_DMA_LOG. > >> > >> Can you try using that one without the patch I sent ? > > > > I think I tested that before submitting the patch, but re-tested it now > > and that doesn't work: > > Your original patch used the ATA_HORKAGE_NO_ID_DEV_LOG flag. What > ATA_HORKAGE_NO_DMA_LOG does is prevent the use of the READ LOG DMA EXT > command. READ LOG EXT command (PIO) will be used. And the sg commands I > sent you and that you tried show that it works: > > sg_sat_read_gplog --log=0 --page=0 --readonly > > worked perfectly, while: > > sg_sat_read_gplog --dma --log=0 --page=0 --readonly > > failed. > > So by simply preventing the use of READ LOG DMA EXT for your drive, the > kernel will be able to safely get the log directory and test for other > page support. > > In your original patch, try replacing ATA_HORKAGE_NO_ID_DEV_LOG with > ATA_HORKAGE_NO_DMA_LOG. I could have bin clearer. This log snippet below is a 5.15.10 with ATA_HORKAGE_NO_DMA_LOG set for this device. Shure, sg_sat_read_gplog --log=0 --page=0 --readonly worked when booted, but as it looks to me, it doesn't work during boot. //Anton - It's a mystery wrapped in a enigma > > > > This is on a 5.15 kernel: > > > > ata1.00: qc timeout (cmd 0x2f) > > ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4 > > ata1.00: ATA Identify Device Log not supported > > ata1.00: failed to set xfermode (err_mask=0x40) > > ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) > > ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80) > > ata1.00: revalidation failed (errno=-5) > > ata1: limiting SATA link speed to 3.0 Gbps > > ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320) > > ata1.00: qc timeout (cmd 0x2f) > > ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4 > > ata1.00: ATA Identify Device Log not supported > > ata1.00: failed to set xfermode (err_mask=0x40) > > ata1.00: disabled > > > > On a 5.16+ with the "libata: add horkage for missing Identify Device > > log" patch, the machine will boot because it won't re-try to access the > > ata log directory after the reset to 3.0 Gbps sata. > > > > > > //Anton > > > -- > Damien Le Moal > Western Digital Research -- Anton Lundin +46702-161604