Re: XFS filesystem corruption

On Sat, Mar 09, 2013 at 12:51:25PM -0600, Stan Hoeppner wrote:
> On 3/9/2013 3:11 AM, Dave Chinner wrote:
> > On Fri, Mar 08, 2013 at 12:59:22PM -0600, Stan Hoeppner wrote:
> >> On 3/8/2013 6:20 AM, Ric Wheeler wrote:
> >>>> Something that none of us mentioned WRT write barriers is that while the
> >>>> filesystem structure may avoid corruption when the power is cut, files
> >>>> may still be corrupted, in conditions such as any/all of these:
> >>
> >> I made it very clear I was discussing file corruption here, not
> >> filesystem corruption.  You already covered that base.  I was
> >> specifically addressing the fact that XFS performs barriers on metadata
> >> writes but not file data writes.
> > 
> > Actually, you're not correct there, either, Stan. ;)
> 
> With "either" you're implying I was incorrect twice, and I wasn't, not
> in whole anyway, maybe in part. ;)

The "either" was in reference to you correcting someone else...

> > XFS only issues cache flushes/FUA writes for log IO. Metadata IO is
> > done exactly the same way that data IO is done - without barriers.
> > It's because metadata lost in drive caches at the time of a crash is
> > rewritten by journal replay that filesystem corruption does not
> > occur.
> 
> Technical semantics.  Geeze, give the non dev a break now and then.  ;)

It's the technical semantics that matter when it comes to behaviour
at power loss.  That's why I pick on "technical semantics" - it
makes your analysis and understanding of problems better, and that
means there's less for me to do in future ;)

>  Does everyone remember the transitive property of equality from math
> class decades ago?  It states "If A=B and B=C then A=C".  Thus if
> barrier writes to the journal protect the journal, and the journal
> protects metadata, then barrier writes to the journal protect metadata.

Yup, but the devil is in the detail - we don't protect individual
metadata writes at all and that difference is significant enough to
comment on.... :P
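
To make the distinction concrete, here is a toy model of the behaviour
described above. It is only a sketch and not XFS code: the ToyDisk class
and all of its method names are hypothetical, and a log write is
simplified to "reaches stable media immediately" to stand in for a
flush/FUA write. Metadata writes go to a volatile cache with no barrier,
and only replay of the log restores them after power loss.

```python
# Toy model only, not XFS code. Every name here is hypothetical.

class ToyDisk:
    def __init__(self):
        self.stable = {}       # metadata blocks that reached stable media
        self.cache = {}        # metadata blocks still in the volatile drive cache
        self.stable_log = []   # journal records on stable media

    def write_log(self, block, value):
        # Modeled as a flush/FUA write: the record is durable on return.
        self.stable_log.append((block, value))

    def write_metadata(self, block, value):
        # Plain write with no barrier: it may sit in the drive cache.
        self.cache[block] = value

    def power_loss(self):
        # Anything still in the volatile cache is gone.
        self.cache.clear()

    def replay(self):
        # Journal replay rewrites the logged metadata in log order.
        for block, value in self.stable_log:
            self.stable[block] = value


disk = ToyDisk()
disk.write_log("inode 7", "size=4096")       # journalled first
disk.write_metadata("inode 7", "size=4096")  # in-place write, unprotected
disk.power_loss()                            # in-place write is lost
disk.replay()                                # log replay restores it
```

In this model the in-place metadata write is never individually
protected; the filesystem stays consistent only because the log record
was durable before the crash.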

>  I had a detail incorrect, but not the big picture.  And I'd bet the OP
> is more interested in the big picture.  So surely I'd get a B or a C
> here, but certainly not an F.

Certainly a B+ - like I said, I'm being picky because you seem to
understand the details once explained... :)

> > As it is, if the application uses direct IO (likely, as it
> > sounds like video capture/editing/playout here) then log IO
> > will also ensure that the data written by the app is on disk (i.e.
> > that's the mechanism by which fsync works).
> 
> So this would be an interesting upside down case for XFS, as the file
> data may be intact, but the filesystem gets corrupted, the opposite of
> the design point.

Well, if barriers are working correctly, then there won't be any
filesystem corruption, either...
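
For the application side, a minimal sketch of asking for data to be on
disk via fsync might look like the following. It assumes ordinary
buffered writes rather than the direct IO discussed above (direct IO
needs aligned buffers and is not supported everywhere), and the
function name write_durably is hypothetical. The guarantee still
depends on the drive honouring cache flushes.

```python
import os

def write_durably(path, data):
    """Write data to path and return only after fsync has completed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the file's data and metadata to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)

    # fsync the parent directory so the new directory entry is durable too.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A capture application would call write_durably(path, frame_bytes) at
the points where it cannot afford to lose what it has written so far.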

> >>> Also, if there are active writers, this is inherently racy. A better
> >>> script would unmount the file systems :)
> >>
> >> Yes, a umount would be even better.
> > 
> > Change the bios so that the power button does not cause a power down
> > so the OS can capture the button event and trigger an orderly
> > shutdown.
> 
> Dare I say "Dave you're incorrect". ;)

Heh.  Not so much incorrect as "unaware of the entire scope". I
browsed the thread and didn't pick up on this little detail...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx