Re: libata new EH issue - ATAPI device error not properly reported

Tejun Heo wrote:
> Hello, Albert.
> 
> Albert Lee wrote:
> 
>> Hi Tejun,
>>
>> Unicorn is running some tests with the current upstream branch and
>> something looks strange in the log.
>>
>> With the new EH, when the ATAPI device reports dev_status 0x51,
>> the err_mask is reported as 0x0. This does not look right.
>> Maybe we should report AC_ERR_DEV error back to the upper layers?
>> (Test log attached for your review.)
>>
> [--snip--]
> 
>> Jun 27 18:10:57 xlinux19 kernel: atapi_qc_complete: ENTER, err_mask
>> 0x0  <== Doesn't look right. Should report error.
> 
> 
> The following code block follows the above VPRINTK.
> 
>     /* handle completion from new EH */
>     if (unlikely(qc->ap->ops->error_handler &&
>              (err_mask || qc->flags & ATA_QCFLAG_SENSE_VALID))){
> 
>         if (!(qc->flags & ATA_QCFLAG_SENSE_VALID)) {
>             /* FIXME: not quite right; we don't want the
>              * translation of taskfile registers into a
>              * sense descriptors, since that's only
>              * correct for ATA, not ATAPI
>              */
>             ata_gen_ata_desc_sense(qc);
>         }
> 
>         qc->scsicmd->result = SAM_STAT_CHECK_CONDITION;
>         qc->scsidone(cmd);
>         ata_qc_free(qc);
>         return;
>     }
> 
> As EH sets ATA_QCFLAG_SENSE_VALID after reading sense data, the above if
> block is executed, thus reporting CHECK_CONDITION to the upper layer.
> Overall, the control flow on ATAPI CC (CHECK CONDITION) is as follows.
> 
> 1. ATAPI CC occurs
> 
> 2. AC_ERR_DEV set and EH invoked
> 
> 3. EH requests sense.  The sense data is stored in the sense buffer and
> AC_ERR_DEV is cleared.
> 
> 4. EH completes the qc and triggers above code block in
> atapi_qc_complete().
> 
> The logic behind clearing AC_ERR_DEV after reading sense data is that
> ATAPI CC doesn't always indicate a device error.  It can indicate
> anything.  Once AC_ERR_DEV is cleared after the sense data is read,
> libata EH considers the error condition cleared and takes no further
> action on the device.  If the sense data indicates an actual error, the
> upper layers (sr driver or userland application) will probably deal
> with it.
> 
> It might be useful to interpret some of the sense data and handle things
> like transmission errors in libata EH, but I don't know.  ATAPI errors
> have always been handled by upper layers.
> 
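
For reference, the four steps quoted above can be modeled with a small
standalone sketch.  This is not the libata code: every sim_* / SIM_* name
below is a hypothetical stand-in, the flag value for SENSE_VALID is
illustrative only, and the non-error completion path is reduced to
"report GOOD".  It only shows why err_mask reads 0x0 at completion while
the upper layer still sees CHECK CONDITION for dev_status 0x51.

```c
#include <assert.h>

/* Simplified stand-ins for kernel definitions; not the real libata types. */
#define SIM_ATA_ERR              0x01u /* status bit 0: ERR (CHK on ATAPI) */
#define SIM_AC_ERR_DEV           0x01u /* err_mask bit: device reported error */
#define SIM_QCFLAG_SENSE_VALID   0x01u /* illustrative value only */
#define SIM_STAT_GOOD            0x00  /* SAM status GOOD */
#define SIM_STAT_CHECK_CONDITION 0x02  /* SAM status CHECK CONDITION */

struct sim_qc {
    unsigned int err_mask;
    unsigned int flags;
    int has_error_handler; /* models ops->error_handler being non-NULL */
    int scsi_result;
};

/* Steps 1-2: a status byte with ERR set marks the qc with AC_ERR_DEV. */
static void sim_check_status(struct sim_qc *qc, unsigned char status)
{
    if (status & SIM_ATA_ERR)
        qc->err_mask |= SIM_AC_ERR_DEV;
}

/* Step 3: EH reads sense data, flags it valid and clears AC_ERR_DEV. */
static void sim_eh_request_sense(struct sim_qc *qc)
{
    if (qc->err_mask & SIM_AC_ERR_DEV) {
        qc->flags |= SIM_QCFLAG_SENSE_VALID;
        qc->err_mask &= ~SIM_AC_ERR_DEV;
    }
}

/* Step 4: completion uses the same condition as the quoted block. */
static void sim_atapi_qc_complete(struct sim_qc *qc)
{
    if (qc->has_error_handler &&
        (qc->err_mask || (qc->flags & SIM_QCFLAG_SENSE_VALID))) {
        qc->scsi_result = SIM_STAT_CHECK_CONDITION;
        return;
    }
    qc->scsi_result = SIM_STAT_GOOD;
}

/* Run one command through the whole flow and return its final state. */
static struct sim_qc sim_run(unsigned char status)
{
    struct sim_qc qc = { 0, 0, 1, SIM_STAT_GOOD };

    sim_check_status(&qc, status);
    sim_eh_request_sense(&qc);
    sim_atapi_qc_complete(&qc);
    return qc;
}

/* Usage: status 0x51 ends with err_mask 0x0 but CHECK CONDITION. */
static void sim_selftest(void)
{
    struct sim_qc qc = sim_run(0x51);

    assert(qc.err_mask == 0);
    assert(qc.scsi_result == SIM_STAT_CHECK_CONDITION);
}
```

In this model the "err_mask 0x0" seen in the log corresponds to the state
after step 3, so it is consistent with the error still being reported
upward through the sense data.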

Thanks for the detailed explanation. We are looking into a strange problem
with the GoVault drive: after the drive is ejected, a flood of SCSI
command 0x1e (PREVENT/ALLOW MEDIUM REMOVAL) is issued to libata on the
upstream kernel. (This is not reproducible with the 2.6.17 kernel.) The
GoVault drive times out on the eject command before the flood of
commands starts, so that timeout might be the cause, not the ATAPI
err_mask. We will look further into the real cause.

--
albert

