Re: [Bugme-new] [Bug 9405] New: iSCSI does not implement ordering guarantees required by e.g. journaling filesystems

On Tue, 2007-11-20 at 20:45 +0300, Vladislav Bolkhovitin wrote:
> James Bottomley wrote:
> >>>>>I'm not sure your conclusions necessarily follow your data.  What was
> >>>>>the reason for the TASK ABORTED (I'd guess QErr settings, right)?
> >>>>
> >>>>It was my desire/curiosity during tests of SCST (http://scst.sf.net),
> >>>>when it was working with several initiators with different transports
> >>>>over the same set of devices, each of them having the TAS bit in the
> >>>>control mode page set. According to SAM, in this case TASK ABORTED
> >>>>status can be returned at any time, similarly to QUEUE FULL, i.e. IMHO
> >>>>such a command should just be retried. QUEUE FULL status is handled
> >>>>well, but TASK ABORTED leads to filesystem corruption.
> >>>
> >>>So this is with a soft target implementation ... so it could be an
> >>>ordering issue inside the target that's causing the filesystem
> >>>corruption on error.
> >>
> >>The target offers no ordering guarantees for SIMPLE commands and frankly
> >>says so to the initiator via QUEUE ALGORITHM MODIFIER value 1 in the
> >>control mode page. As we know, the initiator doesn't use ORDERED tags (and
> >>it really doesn't use them according to the logs), so if it's an
> >>ordering issue, it's on the initiator's side.
> >>
> >>
> >>>If you specifically set TAS=1 you're giving up the right to know what
> >>>caused the command termination.  With insufficient information, it's
> >>>really unsafe to simply retry, which is why the mid layer just returns
> >>>TASK ABORTED as an error.  If you set TAS=0 we'll get a check
> >>>condition/unit attention explaining what happened (usually commands
> >>>cleared by another initiator) and we'll explicitly do the right thing
> >>>based on the sense data.
> >>
> >>But having TAS=1 is legal, right? So it should be handled well. If 
> >>TAS=0, TASK ABORTED can't be returned, it would be illegal. So, TASK 
> >>ABORTED status can only be returned with TAS=1.
> > 
> > Driving with your handbrake on is legal too ... that doesn't mean you
> > should do it ... and it certainly doesn't give you a legitimate
> > complaint against the manufacturer of your car for excessive brake pad
> > wear.
> > 
> > We handle TASK ABORTED as well as we can (by failing it).  For better
> > handling set TAS=0 and we'll handle the individual cases according to
> > the sense codes.
> 
> So, should I take your words to mean that you think it's perfectly fine
> to corrupt the file system for devices with TAS=1? Absolutely legal
> devices, I repeat. Hence, in your opinion, no further investigation
> should be done?

Logic wouldn't support such a conclusion.

You have intertwined two issues:

     1. How should the mid layer handle TASK ABORTED?  I think we've
        reached the point where returning an I/O error is the best we
        can do, but if TAS=0 we could have used the sense data to do
        better.
     2. Should an I/O error on a request cause corruption in ext3 that
        can't be recovered by a journal replay?  I think the answer here
        is no, so there needs to be an easily reproducible test case to
        pass to the filesystem people.
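
To make point 1 concrete, here is a rough sketch in C of the per-status
decision being discussed. It is not the actual mid layer code: the status
byte values are the ones SAM defines, but the enum names, the disposition
names and the function classify_status() are hypothetical, and retrying on
BUSY is an assumption added for illustration.

```c
#include <stdint.h>

/* Status byte values as defined by SAM. */
enum sam_status {
    SAM_GOOD            = 0x00,
    SAM_CHECK_CONDITION = 0x02,
    SAM_BUSY            = 0x08,
    SAM_TASK_SET_FULL   = 0x28,  /* the "QUEUE FULL" case */
    SAM_TASK_ABORTED    = 0x40   /* only returned when TAS=1 */
};

/* Hypothetical dispositions, named here for illustration only. */
enum disposition {
    DISP_DONE,         /* command completed */
    DISP_RETRY,        /* safe to requeue and try again */
    DISP_CHECK_SENSE,  /* decide based on the sense data */
    DISP_FAIL          /* return an I/O error to the upper layer */
};

/* Map a returned status byte to what the initiator does next. */
static enum disposition classify_status(uint8_t status)
{
    switch (status) {
    case SAM_GOOD:
        return DISP_DONE;
    case SAM_BUSY:            /* assumption: treated like QUEUE FULL */
    case SAM_TASK_SET_FULL:
        return DISP_RETRY;
    case SAM_CHECK_CONDITION:
        /* TAS=0 path: the unit attention says why the command ended. */
        return DISP_CHECK_SENSE;
    case SAM_TASK_ABORTED:
        /* TAS=1 path: no reason is given, so fail rather than guess. */
        return DISP_FAIL;
    default:
        return DISP_FAIL;
    }
}
```

The only branch in dispute is SAM_TASK_ABORTED: treating it like
SAM_TASK_SET_FULL would mean retrying without knowing why the command was
terminated.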

James

