Re: More data corruption issues with RBD (Ceph 0.61.2)

Josh Durgin <josh.durgin@xxxxxxxxxxx> · Thu, 13 Jun 2013 01:58:08 -0700

On 06/11/2013 11:59 AM, Guido Winkelmann wrote:
> Hi,
>
> I'm having issues with data corruption on RBD volumes again.
>
> I'm using RBD volumes for virtual harddisks for qemu-kvm virtual machines.
> Inside these virtual machines I have been running a C++ program (attached)
> that fills a mounted filesystem with 1 Megabyte files of random data, while
> using the SHA1-checksum of said random data as the names for those files.
> After this, it reads all those files in again and checks their checksum.
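
For readers without the attachment, the test described above can be sketched
roughly as follows. This is not the attached C++ program, only an illustration
in Python of the same idea. The function names, the fixed file count (the
original fills the filesystem) and the worker count are assumptions made for
the sketch.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor

FILE_SIZE = 1024 * 1024  # 1 MB per file, as in the description above


def write_one(directory):
    """Write one file of random data, named after its SHA-1 hex digest."""
    data = os.urandom(FILE_SIZE)
    name = hashlib.sha1(data).hexdigest()
    with open(os.path.join(directory, name), "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the data down to the block device
    return name


def fill(directory, count, workers=8):
    """Write `count` files concurrently (hypothetical stand-in for 'fill the fs')."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda _: write_one(directory), range(count)))


def verify(directory):
    """Return the names of files whose content no longer matches their name."""
    bad = []
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name), "rb") as f:
            digest = hashlib.sha1(f.read()).hexdigest()
        if digest != name:
            bad.append(name)
    return bad
```

Note that a read-back done right after writing may be served from the guest's
page cache rather than from the RBD volume, so a run of this sketch would
probably need a remount or cache drop between fill() and verify().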

> Last time I ran this program, it reported 300 wrong digests out of ~50000.
> I could not observe those corruptions when running the same test on a volume
> backed by a qcow2 image on an NFS share.
>
> This problem is fairly hard to reproduce, but it is reproducible. So far, the
> combination most likely to trigger this bug seems to be this:
>
> - Run the test on a big (50GB) and freshly created volume (i.e. one that is
>   still mostly sparse)

So this is a plain format 1 or 2 image, with no cloning involved.

> - Write the data with a very large number of concurrent threads (1000+)

Are you using rbd caching? If so, turning it off may help reproduce
faster if it's related to the number of individual requests (since the
cache may merge adjacent or overlapping requests).
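
For reference, a minimal sketch of turning the cache off in ceph.conf on the
qemu host. Placing the option in the [client] section is an assumption about
how the attached ceph.conf is laid out. Depending on the qemu version, the
cache= setting of the drive may also decide whether librbd caching is used,
so it is worth checking both.

```ini
[client]
    rbd cache = false
```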

> - In the middle of writing, take down one OSD. It seems to matter which OSD
>   that is; so far I could only reproduce the bug by taking down the third of
>   three OSDs.

You're killing the OSD process, and not rebooting the host? Which
filesystem are the OSDs using?

> My setup is Ceph 0.61.2 on three machines, each running one OSD and one MON.
> The last one is also running an MDS. The ceph.conf file is attached.
>
> I have just updated to 0.61.3 and plan on rerunning the test on that.
> The platform is Fedora 18 in all cases with kernel 3.9.4-200.fc18.x86_64.

If it's reproducible it'd be great to get logs from all osds with
debug osd = 20, debug ms = 1, and debug filestore = 20.
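
In ceph.conf on each OSD host, those settings would look roughly like the
fragment below. Putting them in the [osd] section is an assumption about the
existing file layout, and the OSDs would need a restart (or the values injected
at runtime) to pick them up. Logs at these levels grow quickly, so make sure
there is enough free space on the log partition.

```ini
[osd]
    debug osd = 20
    debug ms = 1
    debug filestore = 20
```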

Logging on the qemu side (debug rbd = 20, debug ms = 1) when cat'ing a
corrupt file might tell us which object to look at in the osd logs.
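
Once the client log (or the filesystem's block mapping) gives a byte offset
into the image, the matching object name can be worked out by hand. The sketch
below assumes the default layout without custom striping, an object size of
2^order bytes (order 22, i.e. 4 MB objects, unless the image was created
otherwise), and the naming scheme of the block name prefix followed by the
object number in hex (12 digits for format 1, 16 for format 2). The function
name and the example prefixes are made up for illustration; the real prefix is
the block_name_prefix shown by "rbd info".

```python
def rbd_object_name(block_name_prefix, image_offset, order=22, image_format=2):
    """Return the name of the object backing a byte offset of an RBD image.

    block_name_prefix: the prefix reported by `rbd info` for the image.
    image_offset:      byte offset into the image (not into a file on it).
    order:             log2 of the object size; 22 means 4 MB objects.
    image_format:      1 or 2; only affects the width of the hex suffix.
    """
    if image_offset < 0:
        raise ValueError("image_offset must be non-negative")
    if image_format not in (1, 2):
        raise ValueError("image_format must be 1 or 2")
    object_number = image_offset >> order
    width = 12 if image_format == 1 else 16
    return "{}.{:0{}x}".format(block_name_prefix, object_number, width)


# Example with a made-up format 2 prefix: an offset of 50 MB falls into
# object number 12 (hex c) when objects are 4 MB each.
example = rbd_object_name("rbd_data.1234abcd", 50 * 1024 * 1024)
```

The resulting name is what to search for in the osd logs.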

Thanks!
Josh

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com