Re: getting I/O errors in super_written()...any ideas what would cause this?

James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> · Sat, 08 Dec 2012 18:08:06 +0000

On Thu, 2012-12-06 at 12:15 -0600, Chris Friesen wrote:
> On 12/05/2012 03:20 AM, James Bottomley wrote:
> > On Tue, 2012-12-04 at 16:00 -0600, Chris Friesen wrote:
> >> As another data point, it looks like we may be doing a SEND DIAGNOSTIC
> >> command specifying the default self-test in addition to the background
> >> short self-test.  This seems a bit risky and excessive to me, but
> >> apparently the guy that wrote it is no longer with the company.
> >
> > This is a really bad idea.  A lot of disks go out to lunch until the
> > diagnostics complete (the same goes for SMART diagnostics).  This means
> > that if you do diagnostics on a running device, the drivers start to get
> > timeouts on commands which are queued waiting for diagnostics to
> > complete ... if those go over the standard SCSI timeouts, we'll start to
> > try error recovery and likely have the disaster you see above.
> 
> So it turns out that our problems are intermittently triggered when 
> running the default self test.  This agrees with the statement in 
> sg_senddiag to not do foreground self-tests on disks with mounted 
> filesystems.
> 
> We seem to be able to do background short self-tests (i.e., SEND
> DIAGNOSTIC command with self-test code of 001b and ST code of 0b)
> without causing any problems.  Is this pushing our luck or is this
> something that should work according to the spec and the Linux stack?

No one can tell you this.  The specs don't say what should happen on a
diagnostic, how long it will take or how disruptive to the I/O flow it
is.
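
For reference, the two SEND DIAGNOSTIC variants discussed above differ only
in byte 1 of the CDB.  The following is a minimal sketch, assuming the
6-byte CDB layout from SPC (operation code 1Dh, SELF-TEST CODE in bits 7..5
of byte 1, SELFTEST in bit 2); the function name is made up for
illustration, and the sketch only builds the bytes without sending anything
to a device.

```python
SEND_DIAGNOSTIC = 0x1D  # SCSI operation code for SEND DIAGNOSTIC


def build_send_diagnostic_cdb(self_test_code: int, selftest: bool = False) -> bytes:
    """Return a 6-byte SEND DIAGNOSTIC CDB with no parameter list.

    Hypothetical helper: PF, DEVOFFL and UNITOFFL are left at 0 and the
    parameter list length is 0, which is an assumption of this sketch.
    """
    if not 0 <= self_test_code <= 0b111:
        raise ValueError("SELF-TEST CODE is a 3-bit field")
    # Byte 1: SELF-TEST CODE occupies bits 7..5, SELFTEST is bit 2.
    byte1 = (self_test_code << 5) | (0x04 if selftest else 0x00)
    # Bytes 2..4 (reserved, parameter list length) and byte 5 (control) are 0.
    return bytes([SEND_DIAGNOSTIC, byte1, 0x00, 0x00, 0x00, 0x00])


# Background short self-test: SELF-TEST CODE = 001b, SELFTEST = 0b.
background_short = build_send_diagnostic_cdb(0b001)

# Default self-test: SELF-TEST CODE = 000b, SELFTEST = 1b.
default_self_test = build_send_diagnostic_cdb(0b000, selftest=True)
```

Here background_short is the variant that reportedly ran without problems
in this thread, and default_self_test is the one that intermittently
triggered the errors.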

> The scsi spec indicates that in this case for most commands the test 
> will be paused and the command executed within 2 seconds, but I don't 
> know what the normal scsi timeouts are.

Two seconds can be an eternity if you're pumping huge amounts of data to a
disk: it causes a burp in the I/O chain which propagates back up the
stack with unpredictable knock-on consequences.  The standard SCSI
command timeout is configurable under sysfs, but it defaults to 30s.  EMC
is known to recommend 60s timeouts for arrays, for instance (which is why
the timeout is configurable).
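
As a rough illustration of the sysfs knob: on Linux the per-device command
timeout is normally exposed, in seconds, at /sys/block/<disk>/device/timeout.
The sketch below reads and sets it; the helper names and the sysfs_root
parameter are assumptions added for illustration (the latter so the code can
be exercised against a scratch directory), and writing the real file
requires root.

```python
from pathlib import Path


def timeout_path(disk: str, sysfs_root: str = "/sys") -> Path:
    """Path of the SCSI command timeout attribute for a disk such as 'sda'."""
    return Path(sysfs_root) / "block" / disk / "device" / "timeout"


def read_timeout(disk: str, sysfs_root: str = "/sys") -> int:
    """Return the current command timeout in seconds."""
    return int(timeout_path(disk, sysfs_root).read_text().strip())


def write_timeout(disk: str, seconds: int, sysfs_root: str = "/sys") -> None:
    """Set the command timeout in seconds (needs root on a real system)."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    timeout_path(disk, sysfs_root).write_text(f"{seconds}\n")
```

For example, read_timeout("sda") would report the current value for sda,
and write_timeout("sda", 60) would raise it to the 60s figure mentioned
above.  The setting does not persist across reboots, so it is usually
applied from a udev rule or a boot script.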

James
