On Mar 19, 2013, Alexandre Oliva <oliva@xxxxxxx> wrote:

> On Mar 19, 2013, Alexandre Oliva <oliva@xxxxxxx> wrote:
>>> that is being processed inside the snapshot.

>> This doesn't explain why the master database occasionally gets similarly
>> corrupted, does it?

> Actually, scratch this bit for now.  I don't really have proof that the
> master database actually gets corrupted while it's in use

Scratch the “scratch this”.  The master database actually gets
corrupted, and it's with recently-created files, created after earlier
known-good snapshots.  So, it can't really be orphan processing, can it?

Some more info from the errors and instrumentation:

- no data syncing on the affected files is taking place.  it's just
  memcpy()ing data in <4KiB-sized chunks onto mmap()ed areas,
  munmap()ing it, growing the file with ftruncate and mapping a
  subsequent chunk for further output

- the NULs at the end of pages do NOT occur at munmap/mmap boundaries as
  I suspected at first, but they do coincide with the end of extents
  that are smaller than the maximum compressed extent size.  So,
  something's making btrfs flush pages to disk before the pages are
  completely written (which is fine in principle), but apparently
  failing to pick up subsequent changes to the pages (eek!)

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer
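
For anyone who wants to poke at this, the write pattern described above
(small copies into an mmap()ed window, munmap, ftruncate to grow, map the
next window, never syncing) can be approximated with the rough sketch
below.  It is not the code that actually produced the corruption: it is
written in Python rather than C for brevity, the names write_via_mmap,
find_nul_tails and WINDOW are made up for illustration, and the window
size and the 1000-byte chunk size are assumptions, since the message only
says the chunks are smaller than 4KiB.  The checker reports pages whose
tail reads back as NULs where data was expected.  On a healthy filesystem
it should report nothing; whether it triggers the problem on btrfs is
untested.

```python
import mmap
import os
import tempfile

# Assumed mapping window size; must be a multiple of the allocation granularity.
WINDOW = mmap.ALLOCATIONGRANULARITY * 4


def write_via_mmap(path, payload, chunk=1000, window=WINDOW):
    """Write payload through successive shared mappings, with no msync/fsync."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        base, pos = 0, 0  # file offset of the current window, bytes written so far
        os.ftruncate(fd, window)
        view = mmap.mmap(fd, window, offset=0)
        while pos < len(payload):
            if pos == base + window:
                # Window is full: unmap it, grow the file, map the next window.
                view.close()
                base += window
                os.ftruncate(fd, base + window)
                view = mmap.mmap(fd, window, offset=base)
            # Copy at most `chunk` bytes, never past the window or the payload.
            n = min(chunk, base + window - pos, len(payload) - pos)
            view[pos - base:pos - base + n] = payload[pos:pos + n]
            pos += n
        view.close()
        os.ftruncate(fd, len(payload))  # trim the unused tail of the last window
    finally:
        os.close(fd)


def find_nul_tails(path, expected, page=4096):
    """Return (page_index, file_offset) for pages that turn into NULs partway through."""
    with open(path, "rb") as f:
        actual = f.read()
    bad = []
    for start in range(0, len(expected), page):
        want = expected[start:start + page]
        got = actual[start:start + len(want)].ljust(len(want), b"\0")
        if got == want:
            continue
        first = next(i for i in range(len(want)) if got[i] != want[i])
        if not any(got[first:]):  # everything from the first mismatch on is NUL
            bad.append((start // page, start + first))
    return bad


if __name__ == "__main__":
    # Payload without NUL bytes, so that any NUL read back is suspicious.
    data = bytes(1 + i % 255 for i in range(1_000_000))
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "out.db")
        write_via_mmap(target, data)
        print(find_nul_tails(target, data))
```

To test a particular filesystem, point the path at a file on that mount
instead of the temporary directory.  Since the NULs reported above only
showed up at the end of short compressed extents, reading the file back
right after writing may well come from the page cache and show nothing,
so dropping caches or remounting before running the check is probably
needed.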