Re: [Bugme-new] [Bug 9405] New: iSCSI does not implement ordering guarantees required by e.g. journaling filesystems

James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> · Tue, 20 Nov 2007 10:43:59 -0600

On Tue, 2007-11-20 at 19:15 +0300, Vladislav Bolkhovitin wrote:
> James Bottomley wrote:
> > I'm not sure your conclusions necessarily follow your data.  What was
> > the reason for the TASK ABORTED (I'd guess QErr settings, right)?
> 
> It was my desire/curiosity during tests of SCST (http://scst.sf.net),
> when it was working with several initiators with different transports
> over the same set of devices, each of them having the TAS bit in the
> control mode page set. According to SAM, in this case TASK ABORTED
> status can be returned at any time, similarly to QUEUE FULL, i.e. IMHO
> such a command should just be retried. QUEUE FULL status is handled
> well, but TASK ABORTED leads to filesystem corruption.

So this is with a soft target implementation ... so it could be an
ordering issue inside the target that's causing the filesystem
corruption on error.

If you specifically set TAS=1 you're giving up the right to know what
caused the command termination.  With insufficient information, it's
really unsafe to simply retry, which is why the mid layer just returns
TASK ABORTED as an error.  If you set TAS=0 we'll get a check
condition/unit attention explaining what happened (usually commands
cleared by another initiator) and we'll explicitly do the right thing
based on the sense data.
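
To make the distinction concrete, here is a rough sketch of that
decision. It is not the mid layer's code: decide() and Disposition are
made-up names, and treating "commands cleared by another initiator" as
retryable is my assumption about what "the right thing" means here.
Only the status codes, the UNIT ATTENTION sense key and the 2Fh/00h
ASC/ASCQ pair are taken from the SCSI standards.

```python
from enum import Enum

# SAM status codes.
GOOD = 0x00
CHECK_CONDITION = 0x02
TASK_SET_FULL = 0x28    # what the thread calls QUEUE FULL
TASK_ABORTED = 0x40

# Sense key and ASC/ASCQ pair from SPC.
UNIT_ATTENTION = 0x06
COMMANDS_CLEARED = (0x2F, 0x00)  # COMMANDS CLEARED BY ANOTHER INITIATOR


class Disposition(Enum):
    """Hypothetical outcomes for a completed command."""
    SUCCESS = "success"
    RETRY = "retry"
    FAIL = "fail"


def decide(status, sense=None):
    """Pick a disposition from the status and optional sense data.

    `sense` is a (sense_key, asc, ascq) tuple, or None when the target
    returned no sense data. Hypothetical helper, for illustration only.
    """
    if status == GOOD:
        return Disposition.SUCCESS
    if status == TASK_SET_FULL:
        # The target is only asking us to back off; the cause is known.
        return Disposition.RETRY
    if status == TASK_ABORTED:
        # TAS=1: no sense data, so the cause of the abort is unknown
        # and a blind retry is not safe. Report an error upwards.
        return Disposition.FAIL
    if status == CHECK_CONDITION and sense is not None:
        key, asc, ascq = sense
        if key == UNIT_ATTENTION and (asc, ascq) == COMMANDS_CLEARED:
            # TAS=0: the sense data says why the command went away.
            # Retrying here is an assumption made for this sketch.
            return Disposition.RETRY
    # Anything else is treated as an error in this sketch.
    return Disposition.FAIL
```

With TAS=1 the command takes the TASK_ABORTED branch and fails; with
TAS=0 the same event arrives as CHECK CONDITION plus sense data, which
is the only path on which the sketch can justify a retry.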

One of my test suites has an initiator which randomly spits errors.
I've yet to see it cause an error that an ext3 journal can't recover
from.  So, if there's a genuine problem we need a nice test case to pass
to the filesystem people.

James
