[Bug 13594] SMART responses for SATA disks on SAS get interpreted as errors

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



http://bugzilla.kernel.org/show_bug.cgi?id=13594





--- Comment #2 from Anonymous Emailer <anonymous@xxxxxxxxxxxxxxxxxxxx>  2009-06-21 18:55:06 ---
Reply-To: James.Bottomley@xxxxxxxxxxxxxxxxxxxxx

On Sun, 2009-06-21 at 13:47 -0500, James Bottomley wrote:
> >   [  811.091916] sd 0:0:0:0: [sda] Sense Key : Recovered Error [current]
> > [descriptor]
> >   [  811.099807] Descriptor sense data with sense descriptors (in hex):
> >   [  811.106175]         72 01 00 1d 00 00 00 0e 09 0c 00 00 00 00 00 00
> >   [  811.113262]         00 4f 00 c2 00 50
> >   [  811.117379] sd 0:0:0:0: [sda] Add. Sense: ATA pass through information
> > available
> 
> This is a message the kernel prints out on all recovered error returns
> (except those marked REQ_QUIET).  It's purely informational and doesn't
> affect return processing of the command at all, so the kernel is
> actually treating this as a successful completion not an error.
> 
> > I've tried upgrading to the newest firmware (1.28.02.00, 05-MAY-2009), but
> > all that changed is that the hex dump was added to the error message.
> > 
> > Whenever this happens, it appears like all the disks “hiccup” and the kernel
> > loses contact with the controller for a small while. If too many of these
> > happen at once, eventually disks start falling off RAIDs, and the entire
> > machine goes down. It looks to me as if these messages should simply not be
> > treated as errors by the kernel -- smartctl explicitly asks for a response even
> > if the command doesn't fail (by setting CK_COND), so the response probably
> > shouldn't be taken as an error.
> 
> So this sounds like the bug ... however, for the LSI card, this bug will
> be in the SAT layer in the fusion firmware.  I can shut the kernel up by
> making the recovered error processing clause look for 01/00/1D as well
> as REQ_QUIET, but it won't affect this problem.

Actually quieting the message is trivially easy, try this.

James

---

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f3c4089..a0235c9 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -774,7 +774,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int
good_bytes)
      * is what gets returned to the user
      */
     if (sense_valid && sshdr.sense_key == RECOVERED_ERROR) {
-        if (!(req->cmd_flags & REQ_QUIET))
+        if (!(req->cmd_flags & REQ_QUIET) &&
+            !(sshdr.asc == 0x00 && sshdr.ascq == 0x1d))
             scsi_print_sense("", cmd);
         result = 0;
         /* BLOCK_PC may have set error */

-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux