Re: [PATCH] libata: don't request sense data on !ZAC ATA devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2019/06/25 1:33, Tejun Heo wrote:
> ZAC support added sense data requesting on error for both ZAC and ATA
> devices. This seems to cause erratic error handling behaviors on some
> SSDs where the device reports sense data availability and then
> delivers the wrong content making EH take the wrong actions.  The
> failure mode was sporadic on a LITE-ON ssd and couldn't be reliably
> reproduced.
> 
> There is no value in requesting sense data from non-ZAC ATA devices
> while there's a significant risk of introducing EH misbehaviors which
> are difficult to reproduce and fix.  Let's do the sense data dancing
> only for ZAC devices.
> 
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> Cc: Hannes Reinecke <hare@xxxxxxxxxx>
> ---
>  drivers/ata/libata-eh.c |    8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index 9d687e1d4325..3bfd9da58473 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -1469,7 +1469,7 @@ static int ata_eh_read_log_10h(struct ata_device *dev,
>  	tf->hob_lbah = buf[10];
>  	tf->nsect = buf[12];
>  	tf->hob_nsect = buf[13];
> -	if (ata_id_has_ncq_autosense(dev->id))
> +	if (dev->class == ATA_DEV_ZAC && ata_id_has_ncq_autosense(dev->id))
>  		tf->auxiliary = buf[14] << 16 | buf[15] << 8 | buf[16];
>  
>  	return 0;
> @@ -1716,7 +1716,8 @@ void ata_eh_analyze_ncq_error(struct ata_link *link)
>  	memcpy(&qc->result_tf, &tf, sizeof(tf));
>  	qc->result_tf.flags = ATA_TFLAG_ISADDR | ATA_TFLAG_LBA | ATA_TFLAG_LBA48;
>  	qc->err_mask |= AC_ERR_DEV | AC_ERR_NCQ;
> -	if ((qc->result_tf.command & ATA_SENSE) || qc->result_tf.auxiliary) {
> +	if (dev->class == ATA_DEV_ZAC &&
> +	    ((qc->result_tf.command & ATA_SENSE) || qc->result_tf.auxiliary)) {
>  		char sense_key, asc, ascq;
>  
>  		sense_key = (qc->result_tf.auxiliary >> 16) & 0xff;
> @@ -1770,10 +1771,11 @@ static unsigned int ata_eh_analyze_tf(struct ata_queued_cmd *qc,
>  	}
>  
>  	switch (qc->dev->class) {
> -	case ATA_DEV_ATA:
>  	case ATA_DEV_ZAC:
>  		if (stat & ATA_SENSE)
>  			ata_eh_request_sense(qc, qc->scsicmd);
> +		/* fall through */
> +	case ATA_DEV_ATA:
>  		if (err & ATA_ICRC)
>  			qc->err_mask |= AC_ERR_ATA_BUS;
>  		if (err & (ATA_UNC | ATA_AMNF))
> 

For NCQ commands, I believe it is mandatory to request sense data for the failed
command to get the device out of error mode. So isn't this approach breaking
anything for well behaving drives ? Wouldn't it be better to blacklist the
misbehaving SSD you observed the problem with ?


-- 
Damien Le Moal
Western Digital Research




[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux