More data corruption issues with RBD (Ceph 0.61.2)

Hi,

I'm having issues with data corruption on RBD volumes again.

I'm using RBD volumes as virtual hard disks for qemu-kvm virtual machines.
Inside these virtual machines I have been running a C++ program (attached)
that fills a mounted filesystem with 1-megabyte files of random data, using
the SHA1 checksum of each file's contents as its file name.
Afterwards, it reads all those files back in and verifies their checksums.
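
For readers without the attachment, the write-then-verify idea can be sketched roughly as follows. This is not the attached C++ program, just a minimal single-threaded Python illustration; the function names `write_files` and `verify_files` and the use of `fsync` are my own assumptions.

```python
import hashlib
import os
from pathlib import Path

FILE_SIZE = 1024 * 1024  # 1 MiB per file, matching the test described above


def write_files(directory, count):
    """Write `count` files of random data, each named after its SHA-1 digest."""
    directory = Path(directory)
    names = []
    for _ in range(count):
        data = os.urandom(FILE_SIZE)
        name = hashlib.sha1(data).hexdigest()
        with open(directory / name, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # assumption: push the data out to the volume
        names.append(name)
    return names


def verify_files(directory):
    """Re-read every file and return the names whose digest does not match."""
    bad = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.sha1(path.read_bytes()).hexdigest()
        if digest != path.name:
            bad.append(path.name)
    return bad
```

Note that a verify pass run right after writing may be served from the guest's page cache rather than from the RBD volume, so a real test would probably want to drop caches or remount in between.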

The last time I ran this program, it reported 300 wrong digests out of ~50000.
I could not observe those corruptions when running the same test on a volume
backed by a qcow2 image on an NFS share.

This problem is fairly hard to reproduce, but it is reproducible. So far, the
combination most likely to trigger the bug is this:

- Run the test on a big (50GB) and freshly created volume (i.e. one that is 
still mostly sparse)
- Write the data with a very large number of concurrent threads (1000+)
- In the middle of writing, take down one OSD. It seems to matter which OSD
that is; so far I could only reproduce the bug by taking down the third of
three OSDs
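
The concurrent-write part of that recipe might look something like the sketch below. Again, this is only a Python illustration and not the attached C++ program; the thread pool and the names `write_one` and `write_concurrently` are assumptions about how the concurrency could be arranged. Taking down the OSD during the run is a manual step and is not shown.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor

FILE_SIZE = 1024 * 1024  # 1 MiB per file


def write_one(directory):
    """Write a single file of random data named after its SHA-1 digest."""
    data = os.urandom(FILE_SIZE)
    name = hashlib.sha1(data).hexdigest()
    with open(os.path.join(directory, name), "wb") as f:
        f.write(data)
    return name


def write_concurrently(directory, count, threads=1000):
    """Write `count` files using up to `threads` worker threads at once."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(write_one, [directory] * count))
```

Something like `write_concurrently("/mnt/test", 50000)` would roughly match the numbers above; the mount point here is hypothetical.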

My setup is Ceph 0.61.2 on three machines, each running one OSD and one MON. 
The last one is also running an MDS. The ceph.conf file is attached.

I have just updated to 0.61.3 and plan on rerunning the test on that.
The platform is Fedora 18 in all cases with kernel 3.9.4-200.fc18.x86_64.

The client is a Fedora 17 based machine with Qemu 1.2.0-14.fc17 and Ceph 
0.61.2. (The yum repository for Fedora 17 does not appear to contain 0.61.3 
yet.)

Regards,

	Guido

; global
[global]
        max open files = 131072
        log file = /var/log/ceph/$name.log
        ; log_to_syslog = true        ; uncomment this line to log to syslog
        pid file = /var/run/ceph/$name.pid

	auth client required = cephx
	auth service required = cephx
	auth cluster required = cephx

; monitors
[mon]
        mon data = /mondata/$name

[mon.alpha]
	host = storage1
	mon addr = 10.6.224.129:6789

[mon.beta]
	host = storage2
	mon addr = 10.6.224.130:6789

[mon.gamma]
	host = storage3
	mon addr = 10.6.224.131:6789

; mds
[mds]
	; where the mds keeps its secret encryption keys
	keyring = /mdsdata/keyring.$name

;[mds.alpha]
;	host = storage1

;[mds.beta]
;	host = storage2

[mds.gamma]
	host = storage3

; osd
[osd]
	osd data = /osddata/$name

	osd journal = /journaldata/$name/journal
	osd journal size = 1000 ; journal size, in megabytes

        ; If you want to run the journal on a tmpfs, disable DirectIO
        journal dio = false

        osd recovery max active = 5

	keyring = /osddata/$name/keyring

	filestore fiemap = false

[osd.0]
	host = storage1
	cluster addr = 10.6.224.193
	public addr = 10.6.224.129

[osd.1]
	host = storage2
	cluster addr = 10.6.224.194
	public addr = 10.6.224.130

[osd.2]
	host = storage3
	cluster addr = 10.6.224.195
	public addr = 10.6.224.131

[client]   ; userspace client
;      debug ms = 1
;      debug client = 10

Attachment: iotest-threaded.tar.bz2
Description: application/bzip-compressed-tar

