Hey folks,

I noticed this today and it has me stumped. I have a 10GB raw VM disk image that I've placed inside an ext4-formatted RBD, and when I do this it gets corrupted in weird ways. I was prepared to show fsck results to demonstrate the problem, but then found an easier way: just compare the sha1sum of the file. Here's what I see.

Disk image sitting on a regular (non-RBD) ext4 filesystem:

# sha1sum disk.img
cfd37c33b9de926644f7b13e604374348662bc60  disk.img

Same disk image sitting in RBD #1:

# cp -p disk.img /mnt/rbd1
# sha1sum /mnt/rbd1/disk.img
cfd37c33b9de926644f7b13e604374348662bc60  disk.img

Great, they match. But then comes the problematic RBD:

# cp -p disk.img /mnt/rbd2
# sha1sum /mnt/rbd2/disk.img
a28d0735c0f0863a3f84151122da75a56bf5022b  disk.img

They don't match. I can also confirm that fsck'ing the filesystem contained in disk.img reveals numerous errors in the latter case, while the filesystem is clean in the first two.

I'm running 0.48.2argonaut on this particular cluster. The RBDs were mapped with the kernel client. The kernel is 3.2.0-29-generic, running on Ubuntu 12.04.1.

The only weird thing I've observed is that while the copy was going to RBD #2, I saw this in ceph -w:

2013-02-11 22:18:14.134683 osd.2 [WRN] client.7830 10.40.30.0:0/1548040543 misdirected client.7830.1:48034857 4.127 to osd.2 not [4,2] in e2459/2459
2013-02-11 22:18:14.135159 osd.2 [WRN] client.7830 10.40.30.0:0/1548040543 misdirected client.7830.1:48034858 4.127 to osd.2 not [4,2] in e2459/2459
2013-02-11 22:18:14.136699 osd.2 [WRN] client.7830 10.40.30.0:0/1548040543 misdirected client.7830.1:48034859 4.127 to osd.2 not [4,2] in e2459/2459
2013-02-11 22:18:14.139479 osd.2 [WRN] client.7830 10.40.30.0:0/1548040543 misdirected client.7830.1:48034860 4.127 to osd.2 not [4,2] in e2459/2459
2013-02-11 22:18:14.139588 osd.2 [WRN] client.7830 10.40.30.0:0/1548040543 misdirected client.7830.1:48034861 4.127 to osd.2 not [4,2] in e2459/2459
2013-02-11 22:18:14.139667 osd.2 [WRN] client.7830 10.40.30.0:0/1548040543 misdirected client.7830.1:48034862 4.127 to osd.2 not [4,2] in e2459/2459
2013-02-11 22:18:14.139748 osd.2 [WRN] client.7830 10.40.30.0:0/1548040543 misdirected client.7830.1:48034863 4.127 to osd.2 not [4,2] in e2459/2459
2013-02-11 22:18:14.139827 osd.2 [WRN] client.7830 10.40.30.0:0/1548040543 misdirected client.7830.1:48034864 4.127 to osd.2 not [4,2] in e2459/2459

I hadn't seen that one before.

Full disclosure: I had a Ceph node failure last week (a week ago today) where all three OSD processes on one of my nodes were killed by the OOM killer. I haven't had a chance to go back and look for errors, gather logs, or ask the list for advice on what went wrong. Restarting the OSDs brought everything back in line -- the cluster handled the failed OSDs just fine, with one exception: one of my RBDs went read-only/write-protected. Even after the cluster was back to HEALTH_OK, it remained read-only, and I had to unmount, unmap, map, and mount that RBD to get it back. It just so happens that it's the same RBD giving me problems now, so the two could be related. =)
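In case it's useful for digging into those misdirected warnings, here's roughly what I plan to check next -- whether pg 4.127 really maps to [4,2] right now, and what osdmap epoch the kernel client has cached. (The debugfs path here is from memory and only works if debugfs is mounted and the libceph debug entries exist on this kernel, so treat it as a sketch rather than a verified recipe.)

# which OSDs does the cluster currently map pg 4.127 to?
# ceph pg map 4.127

# what osdmap epoch does the kernel client think it has?
# (assumes debugfs is mounted at /sys/kernel/debug)
# head /sys/kernel/debug/ceph/*/osdmap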
It's a small cluster:

# ceph -s
   health HEALTH_OK
   monmap e1: 3 mons at {a=10.40.30.0:6789/0,b=10.40.30.1:6789/0,c=10.40.30.2:6789/0}, election epoch 4, quorum 0,1,2 a,b,c
   osdmap e2459: 9 osds: 9 up, 9 in
   pgmap v9525714: 2880 pgs: 2880 active+clean; 2841 GB data, 5649 GB used, 11109 GB / 16758 GB avail
   mdsmap e1: 0/0/1 up

# ceph osd tree
dumped osdmap tree epoch 2459
# id    weight  type name       up/down reweight
-1      18      pool default
-3      18          rack unknownrack
-2      6               host ceph0
0       2                   osd.0   up      1
1       2                   osd.1   up      1
2       2                   osd.2   up      1
-4      6               host ceph1
3       2                   osd.3   up      1
4       2                   osd.4   up      1
5       2                   osd.5   up      1
-5      6               host ceph2
6       2                   osd.6   up      1
7       2                   osd.7   up      1
8       2                   osd.8   up      1

But yeah, I'm just stumped about why files going into that particular RBD get corrupted. I tried a smaller file (~140MB) and it was fine, but I haven't done enough testing yet to find the size threshold for corruption, or to tell whether it only happens for specific file types. I did a similar test with qcow2 images (10G virtual, 4.4GB actual), and the fsck results were the same -- immediate corruption inside that RBD. I didn't capture sha1sums for those files, but I expect they would differ. =)

Thanks,
- Travis
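P.S. For narrowing down the size threshold, I'm thinking of running something like the loop below against the suspect mount. The paths and sizes are just placeholders, and I drop caches before re-reading so the checksum actually comes off the RBD rather than the page cache -- a rough sketch, not a polished test harness:

# rough sketch -- copy increasingly large random files into the suspect
# RBD and compare checksums against the local originals
for mb in 128 256 512 1024 2048 4096 8192; do
    dd if=/dev/urandom of=/tmp/test_${mb}M bs=1M count=${mb} 2>/dev/null
    cp /tmp/test_${mb}M /mnt/rbd2/
    sync
    echo 3 > /proc/sys/vm/drop_caches   # so the read-back isn't served from cache
    a=$(sha1sum < /tmp/test_${mb}M)
    b=$(sha1sum < /mnt/rbd2/test_${mb}M)
    [ "$a" = "$b" ] && echo "${mb}M: OK" || echo "${mb}M: MISMATCH"
    rm -f /tmp/test_${mb}M /mnt/rbd2/test_${mb}M
done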