Re: corruption of active mmapped files in btrfs snapshots

Quoting Alexandre Oliva (2013-03-21 03:14:02)
> On Mar 19, 2013, Alexandre Oliva <oliva@xxxxxxx> wrote:
> 
> > On Mar 19, 2013, Alexandre Oliva <oliva@xxxxxxx> wrote:
> >>> that is being processed inside the snapshot.
> 
> >> This doesn't explain why the master database occasionally gets similarly
> >> corrupted, does it?
> 
> > Actually, scratch this bit for now.  I don't really have proof that the
> > master database actually gets corrupted while it's in use
> 
> Scratch the “scratch this”.  The master database actually gets
> corrupted, and it's with recently-created files, created after earlier
> known-good snapshots.  So, it can't really be orphan processing, can it?

Right, it can't be orphan processing.

> 
> Some more info from the errors and instrumentation:
> 
> - no data syncing on the affected files is taking place.  it's just
>   memcpy()ing data in <4KiB-sized chunks onto mmap()ed areas,
>   munmap()ing it, growing the file with ftruncate and mapping a
>   subsequent chunk for further output
> 
> - the NULs at the end of pages do NOT occur at munmap/mmap boundaries as
>   I suspected at first, but they do coincide with the end of extents
>   that are smaller than the maximum compressed extent size.  So,
>   something's making btrfs flush pages to disk before the pages are
>   completely written (which is fine in principle), but apparently
>   failing to pick up subsequent changes to the pages (eek!)

With mmap, the kernel can start writing out dirty pages at any time.
The idea is that if the application makes more changes, the page
becomes dirty again and the kernel writes it out again.

So the question is, can you trigger this without snapshots being done
at all?  I'll try to make an mmap tester here that hammers on the
related code.  We usually test this with fsx, which catches all kinds of
horrors.
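
A minimal sketch of what such a tester could look like, following the
write pattern described above: grow the file with ftruncate, map the new
chunk, copy data in pieces smaller than 4KiB, unmap, repeat. This is not
the real workload and not fsx. The names write_via_mmap, pattern,
find_nuls and run_once are hypothetical, the window size is simply the
mmap allocation granularity, and no snapshots are taken. The pattern
never contains a NUL byte, so any NUL read back is data that went
missing.

```python
import mmap
import os
import tempfile

# Window size for each mapping; mmap offsets must be a multiple of this.
CHUNK = mmap.ALLOCATIONGRANULARITY


def pattern(start, length):
    """Return `length` bytes for file offset `start`; values are 1..251, never 0."""
    return bytes(((start + i) % 251) + 1 for i in range(length))


def write_via_mmap(path, total_bytes, piece=1000):
    """Fill `path` through successive mmap windows, in pieces smaller than 4KiB."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        offset = 0
        while offset < total_bytes:
            window = min(CHUNK, total_bytes - offset)
            # Grow the file so the next window is backed by the file.
            os.ftruncate(fd, offset + window)
            # Map only the new window; leaving the `with` block unmaps it.
            with mmap.mmap(fd, window, offset=offset) as m:
                pos = 0
                while pos < window:
                    n = min(piece, window - pos)
                    m[pos:pos + n] = pattern(offset + pos, n)
                    pos += n
            offset += window
    finally:
        os.close(fd)


def find_nuls(path):
    """Return the file offsets of all NUL bytes in `path`."""
    with open(path, "rb") as f:
        data = f.read()
    return [i for i, b in enumerate(data) if b == 0]


def run_once(total_bytes, directory=None):
    """Write one file under `directory` (assumed to be the fs under test) and check it."""
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        path = os.path.join(tmp, "mmap-test.dat")
        write_via_mmap(path, total_bytes)
        assert os.path.getsize(path) == total_bytes
        return find_nuls(path)
```

Calling run_once() in a loop with directory pointing at a btrfs mount,
with and without snapshots running alongside, would show whether the
snapshots are needed to trigger it. An empty list means no NULs were
found. Note that reading back right away is likely served from the page
cache, so to check what actually reached the disk you would have to
drop caches or remount before calling find_nuls().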

-chris