Re: [Bugme-new] [Bug 9405] New: iSCSI does not implement ordering guarantees required by e.g. journaling filesystems

James Bottomley wrote:
I'm not sure your conclusions necessarily follow your data.  What was
the reason for the TASK ABORTED (I'd guess QErr settings, right)?

It came out of my curiosity while testing SCST (http://scst.sf.net) with several initiators using different transports over the same set of devices, each of which had the TAS bit set in the control mode page. According to SAM, in this case TASK ABORTED status can be returned at any time, similarly to QUEUE FULL, so IMHO such a command should simply be retried. However, while QUEUE FULL status is handled well, TASK ABORTED leads to filesystem corruption.
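To make the two policies under discussion concrete, here is a minimal sketch of a status dispatcher. The status byte values are the ones SAM defines; the `Action` enum, the `disposition` function and its `retry_task_aborted` switch are hypothetical names used only for illustration and are not the Linux mid layer's code.

```python
from enum import Enum

# SAM status byte values.
GOOD = 0x00
CHECK_CONDITION = 0x02
BUSY = 0x08
TASK_SET_FULL = 0x28   # the status referred to as QUEUE FULL in this thread
TASK_ABORTED = 0x40


class Action(Enum):
    """Hypothetical outcomes an initiator could choose for a completed command."""
    COMPLETE = "complete"
    RETRY = "retry"
    CHECK_SENSE = "check sense"
    FAIL = "fail"


def disposition(status, retry_task_aborted=False):
    """Map a SAM status byte to an action (illustrative policy only).

    retry_task_aborted=False models the behaviour described in this thread,
    where TASK ABORTED is failed back to the upper layers.
    retry_task_aborted=True models the proposal to treat it like QUEUE FULL.
    """
    if status == GOOD:
        return Action.COMPLETE
    if status == CHECK_CONDITION:
        return Action.CHECK_SENSE      # decision is deferred to the sense data
    if status in (BUSY, TASK_SET_FULL):
        return Action.RETRY
    if status == TASK_ABORTED:
        return Action.RETRY if retry_task_aborted else Action.FAIL
    return Action.FAIL                 # anything unrecognised is treated as an error
```

The only difference between the two positions is the single branch for `TASK_ABORTED`; everything else is handled the same way in both.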

So this is with a soft target implementation ... so it could be an
ordering issue inside the target that's causing the filesystem
corruption on error.

The target offers no ordering guarantees for SIMPLE commands and says so openly to the initiator via a QUEUE ALGORITHM MODIFIER value of 1 in the control mode page. As we know, the initiator doesn't use ORDERED tags (and according to the logs it really doesn't), so if it's an ordering issue, it's on the initiator's side.


if you specifically set TAS=1 you're giving up the right to know what
caused the command termination.  With insufficient information, it's
really unsafe to simply retry, which is why the mid layer just returns
TASK ABORTED as an error.  If you set TAS=0 we'll get a check
condition/unit attention explaining what happened (usually commands
cleared by another initiator) and we'll explicitly do the right thing
based on the sense data.
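
As an illustration of what "explaining what happened" looks like on the wire, the sketch below decodes fixed-format sense data and recognises the "commands cleared by another initiator" unit attention (sense key 6h, ASC/ASCQ 2Fh/00h). The function names are hypothetical, and the sketch only classifies the sense data; it does not claim to reproduce what the mid layer does next.

```python
UNIT_ATTENTION = 0x6                    # sense key
ASC_COMMANDS_CLEARED = (0x2F, 0x00)     # COMMANDS CLEARED BY ANOTHER INITIATOR


def decode_fixed_sense(sense):
    """Return (sense_key, asc, ascq) from fixed-format sense data.

    Fixed format uses response code 70h (current) or 71h (deferred);
    the sense key is the low nibble of byte 2, ASC is byte 12, ASCQ is byte 13.
    """
    if len(sense) < 14:
        raise ValueError("sense buffer too short for ASC/ASCQ")
    response_code = sense[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return sense[2] & 0x0F, sense[12], sense[13]


def commands_cleared_by_another_initiator(sense):
    """True if the sense data reports the 2Fh/00h unit attention."""
    key, asc, ascq = decode_fixed_sense(sense)
    return key == UNIT_ATTENTION and (asc, ascq) == ASC_COMMANDS_CLEARED
```

With TAS=1 none of this information is available to the initiator: the command completes with a bare TASK ABORTED status and no sense data to classify.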

But having TAS=1 is legal, right? So it should be handled well. If TAS=0, TASK ABORTED can't be returned; that would be illegal. So TASK ABORTED status can only be returned with TAS=1.

Driving with your handbrake on is legal too ... that doesn't mean you
should do it ... and it certainly doesn't give you a legitimate
complaint against the manufacturer of your car for excessive brake pad
wear.

We handle TASK ABORTED as well as we can (by failing it).  For better
handling set TAS=0 and we'll handle the individual cases according to
the sense codes.

So, should I take your words to mean that you think it's perfectly fine to corrupt a file system on devices with TAS=1? Absolutely legal devices, I repeat. Hence, in your opinion, no further investigation should be done?

One of my test suites has an initiator which randomly spits errors.
I've yet to see it cause an error that an ext3 journal can't recover
from.  So, if there's a genuine problem we need a nice test case to pass
to the filesystem people.

If you need a clear test case (IMHO it isn't needed here, because the problem is clear without it), I can prepare a patch for SCST to randomly return TASK ABORTED status.

You can get the latest version of SCST and the target drivers using SVN:

$ svn co https://scst.svn.sourceforge.net/svnroot/scst

There's no real need to bother with setting all this up ... a simple
initiator modification randomly to return TASK ABORTED should suffice.

Yes, you're right. Then, I suppose, Mike Christie would be the best person to do it?
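
For reference, the kind of fault injection being discussed could look roughly like the sketch below: a wrapper that, with some probability, completes a command with TASK ABORTED instead of passing it on. Everything here is an assumption made for illustration (the `make_faulty_queuecommand` name, the callable interface, the choice not to forward the aborted command); a real test would be a patch to the initiator driver or to SCST.

```python
import random

GOOD = 0x00
TASK_ABORTED = 0x40   # SAM status byte


def make_faulty_queuecommand(real_queuecommand, abort_probability, rng=None):
    """Wrap a command-submission callable so it randomly reports TASK ABORTED.

    real_queuecommand: hypothetical callable taking a command and returning
                       a SAM status byte.
    abort_probability: chance in [0.0, 1.0] that a command is aborted.
    rng:               optional random.Random, so a test run can be seeded
                       and reproduced.
    """
    rng = rng or random.Random()

    def queuecommand(cmd):
        if rng.random() < abort_probability:
            # Assumption: the aborted command is never sent to the device.
            return TASK_ABORTED
        return real_queuecommand(cmd)

    return queuecommand
```

Seeding the generator (for example `random.Random(1234)`) keeps the sequence of injected aborts repeatable, which helps when handing a failing case to the filesystem people.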

Vlad