Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 02 February, 2022 - Anton Lundin wrote:

> On 02 February, 2022 - Damien Le Moal wrote:
> 
> > On 2022/02/02 21:25, Anton Lundin wrote:
> > > On 02 February, 2022 - Damien Le Moal wrote:
> > > 
> > >> On 2/2/22 19:05, Anton Lundin wrote:
> > >>> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
> > >>> a read of ATA_LOG_DIRECTORY page was added. This caused the
> > >>> SATADOM-ML 3ME to lock up.
> > >>>
> > >>> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> > >>> a flag was added to cache if a device supports this or not.
> > >>>
> > >>> This adds a blacklist entry which flags that these devices doesn't
> > >>> support that call and shouldn't be issued that call.
> > >>>
> > >>> Cc: stable@xxxxxxxxxxxxxxx # v5.10+
> > >>> Signed-off-by: Anton Lundin <glance@xxxxxxxxxx>
> > >>> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> > >>
> > >> I do not think so. See below.
> > >>
> > >>> ---
> > >>>  drivers/ata/libata-core.c | 7 +++++++
> > >>>  1 file changed, 7 insertions(+)
> > >>>
> > >>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> > >>> index 87d36b29ca5f..e024af9f33d0 100644
> > >>> --- a/drivers/ata/libata-core.c
> > >>> +++ b/drivers/ata/libata-core.c
> > >>> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
> > >>>  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> > >>>  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> > >>>  
> > >>> +	/*
> > >>> +	 * This sata dom goes on a walkabout when it sees the
> > >>> +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
> > >>> +	 * request to these devices.
> > >>> +	 */
> > >>> +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
> > >>
> > >> This flag only disables trying to access the identify device log page,
> > >> it does *not* avoid access to the log directory log page in general. The
> > >> log directory will still be consulted for other log pages beside the
> > >> identify device log page, from any function that calls
> > >> ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and
> > >> ata_dev_config_ncq_non_data())
> > > 
> > > Non of those code paths are called for this device, probably due to some
> > > other flag disqualifying them.
> > >  
> > >> So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag
> > >> and test for it in ata_log_supported(), completely preventing any access
> > >> to the log directory page for this drive type.
> > > 
> > > That was my first thought but then I found ATA_HORKAGE_NO_ID_DEV_LOG
> > > which was in the calling path that actually triggered this issue.
> > > 
> > > But, yes, I totally agree that's a more solid solution preventing this
> > > kind of issue to crop up again.
> > > 
> > >>> +
> > >>>  	/* End Marker */
> > >>>  	{ }
> > >>>  };
> > >>
> > >> Note: if you need this fix sent to linux-stable, add "Cc: stable@..."
> > >> and a Fixes tag.
> > > 
> > > I'd think it's fitting to send it to linux-stable, because it prevents
> > > those DOM's from working in v5.15.5+.
> > > 
> > > Ok. I must have missed that part when I read submitting-patches doc.
> > > 
> > > I'll rework and re-submit the patch.
> > 
> > I sent you a draft patch. Please try it.
> 
> Works like a charm.
> 
> > Also, to confirm if the log directory log page is indeed the page that locks up
> > the drive, can you try this command:
> > 
> > sg_sat_read_gplog --dma --log=0 --page=0 --readonly
> 
> # sg_sat_read_gplog --dma --log=0 --page=0 --readonly /dev/sda
> ATA PASS-THROUGH (16), bad field in cdb
> sg_sat_read_gplog failed: Illegal request
> 
> > and
> > 
> > sg_sat_read_gplog --log=0 --page=0 --readonly
> 
> # sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
>  00     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  08     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  10     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  18     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  20     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  28     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  30     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  38     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  40     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  48     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  50     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  58     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  60     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  68     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  70     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  78     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  80     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  88     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  90     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  98     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  a0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  a8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  b0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  b8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  c0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  c8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  d0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  d8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  e0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  e8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  f0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  f8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> 
> Mind you, this with a patched kernel, if that affects anything.

Without a patched kernel, (grabbed os with a 4.19.94 kernel) the command
hangs for a while:

# time sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
ATA PASS-THROUGH (16), bad field in cdb
sg_sat_read_gplog failed: Illegal request

real	0m28.337s
user	0m0.000s
sys	0m0.001s

and logs the following in the kernel log:

ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
ata1.00: revalidation failed (errno=-5)
ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata1.00: configured for UDMA/133


But after that 30s walkabout and whatever the kernel does the device
starts functioning again.

//Anton

  
> > Beware that if the log drectory is the problem, this will lockup your drive !
> > If that is OK and works, then please try:
> > 
> > sg_sat_read_gplog --dma --log=48 --page=0 --readonly
> 
> # sg_sat_read_gplog --dma --log=48 --page=0 --readonly /dev/sda
> ATA PASS-THROUGH (16), bad field in cdb
> sg_sat_read_gplog failed: Illegal request
> 
> > and
> > 
> > sg_sat_read_gplog --log=48 --page=0 --readonly
> 
> sg_sat_read_gplog --log=48 --page=0 --readonly /dev/sda
> ATA PASS-THROUGH (16), bad field in cdb
> sg_sat_read_gplog failed: Illegal request
>  
> > log=48 (0x30) is the identify device log page. log=0 is the log directory log page.
>  
> So, does this means that it might be where in the init of the device the
> directory log page is accessed that causes the device to fault?
> 
> A snippet from a failed boot before the patch looks like:
> ata1.00: qc timeout (cmd 0x2f)
> ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
> ata1.00: ATA Identify Device Log not supported
> ata1.00: failed to set xfermode (err_mask=0x40)
> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
> ata1.00: revalidation failed (errno=-5)
> ata1: limiting SATA link speed to 3.0 Gbps
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
> ata1.00: qc timeout (cmd 0x2f)
> ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
> ata1.00: ATA Identify Device Log not supported
> ata1.00: failed to set xfermode (err_mask=0x40)
> ata1.00: disabled
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
> 
> 
> When I messed around trying to figure out what the fault was I saw a
> couple of different Emask, 0x4, 0x40 and 0x80.
> 
> I also blindly played around with adding a ata_msleep(ap, 500); after
> the failed attempt to access the log directory log page, but that didn't
> help.
> 
> //Anton

-- 
Anton Lundin	+46702-161604



[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux