Re: corruption of active mmapped files in btrfs snapshots

Sage Weil <sage@xxxxxxxxxxx> · Fri, 22 Mar 2013 10:18:21 -0700 (PDT)

On Fri, 22 Mar 2013, Chris Mason wrote:
> Quoting Alexandre Oliva (2013-03-22 10:17:30)
> > On Mar 22, 2013, Chris Mason <clmason@xxxxxxxxxxxx> wrote:
> > 
> > > Are you using compression in btrfs or just in leveldb?
> > 
> > btrfs lzo compression.
> 
> Perfect, I'll focus on that part of things.
> 
> > 
> > > I'd like to take snapshots out of the picture for a minute.
> > 
> > That's understandable, I guess, but I don't know that anyone has ever
> > got the problem without snapshots.  I mean, even when the master copy of
> > the database got corrupted, snapshots of the subvol containing it were
> > being taken every now and again, because that's the way ceph works.
> 
> Hopefully Sage can comment, but the basic idea is that if you snapshot a
> database file the db must participate.  If it doesn't, it really is the
> same effect as crashing the box.
> 
> Something is definitely broken if we're corrupting the source files
> (either with or without snapshots), but avoiding incomplete writes in
> the snapshot files requires synchronization with the db.

In this case, we quiesce write activity, call leveldb's sync(), take the 
snapshot, and then continue.
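
A minimal sketch of that sequence, for illustration only and not the actual ceph code. The class name SnapshotCoordinator and the sync_db / take_snapshot callables are hypothetical. In a real setup sync_db would be whatever makes the database durable, and take_snapshot would create the btrfs snapshot (for example by running "btrfs subvolume snapshot"). A single lock stands in for "quiesce write activity": writers hold it while writing, and the snapshot path holds it across both the sync and the snapshot.

```python
import threading


class SnapshotCoordinator:
    """Hypothetical helper that serializes writers against snapshots."""

    def __init__(self, sync_db, take_snapshot):
        # sync_db and take_snapshot are caller-supplied callables
        # (assumptions for this sketch, not a real ceph or leveldb API).
        self._lock = threading.Lock()
        self._sync_db = sync_db
        self._take_snapshot = take_snapshot

    def write(self, do_write):
        # Every write goes through the lock so a snapshot can exclude it.
        with self._lock:
            return do_write()

    def snapshot(self):
        with self._lock:            # 1. quiesce: no writer can be mid-write
            self._sync_db()         # 2. flush the database to stable storage
            self._take_snapshot()   # 3. snapshot the now-consistent state
        # 4. lock released: write activity continues
```

The point of holding the lock across steps 2 and 3 is that no write can land between the sync and the snapshot, so the snapshot should never contain a partially written database.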

(FWIW, this isn't the first time we've heard about leveldb corruption, but
in each case we've looked into, the user had btrfs compression enabled,
so I suspect that's the right avenue of investigation!)

sage