Hello Petr,

----- On 4 June 24, at 12:13, Petr Bena petr@bena.rocks wrote:

> Hello,
>
> I wanted to try out (lab ceph setup) what exactly is going to happen
> when parts of data on OSD disk gets corrupted. I created a simple test
> where I was going through the block device data until I found something
> that resembled user data (using dd and hexdump) (/dev/sdd is a block
> device that is used by OSD)
>
> INFRA [root@ceph-vm-lab5 ~]# dd if=/dev/sdd bs=32 count=1 skip=33920 | hexdump -C
> 00000000  6e 20 69 64 3d 30 20 65  78 65 3d 22 2f 75 73 72  |n id=0 exe="/usr|
> 00000010  2f 73 62 69 6e 2f 73 73  68 64 22 20 68 6f 73 74  |/sbin/sshd" host|
>
> Then I deliberately overwrote 32 bytes using random data:
>
> INFRA [root@ceph-vm-lab5 ~]# dd if=/dev/urandom of=/dev/sdd bs=32 count=1 seek=33920
>
> INFRA [root@ceph-vm-lab5 ~]# dd if=/dev/sdd bs=32 count=1 skip=33920 | hexdump -C
> 00000000  25 75 af 3e 87 b0 3b 04  78 ba 79 e3 64 fc 76 d2  |%u.>..;.x.y.d.v.|
> 00000010  9e 94 00 c2 45 a5 e1 d2  a8 86 f1 25 fc 18 07 5a  |....E......%...Z|
>
> At this point I would expect some sort of data corruption. I restarted
> the OSD daemon on this host to make sure it flushes any potentially
> buffered data. It restarted OK without noticing anything, which was
> expected.
>
> Then I ran
>
> ceph osd scrub 5
>
> ceph osd deep-scrub 5
>
> And waiting for all scheduled scrub operations for all PGs to finish.
>
> No inconsistency was found. No errors reported, scrubs just finished OK,
> data are still visibly corrupt via hexdump.
>
> Did I just hit some block of data that WAS used by OSD, but was marked
> deleted and therefore no longer used or am I missing something?

Possibly, provided that all PGs were indeed deep-scrubbed. I remember
marking bad sectors in the past and still getting a clean result from
ceph-bluestore-tool fsck.

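As a side note, it can help to know which byte offset on /dev/sdd the
quoted dd commands actually touched. The following is only a minimal
sketch: the helper name dd_byte_offset is made up for illustration, and
the 4 KiB block size used for the second figure is an assumption, not
something read from the OSD's configuration.

```shell
# Hypothetical helper: byte offset addressed by dd's skip=/seek=
# when both are counted in blocks of size bs=.
dd_byte_offset() {
    bs=$1       # value passed to dd as bs=
    blocks=$2   # value passed to dd as skip= or seek=
    echo $(( bs * blocks ))
}

# Values taken from the quoted test: bs=32, skip/seek=33920.
offset=$(dd_byte_offset 32 33920)
echo "byte offset: $offset"

# Assumption: 4096-byte blocks, used here only to express the offset
# in a more familiar unit.
echo "4 KiB block index: $(( offset / 4096 )), remainder: $(( offset % 4096 ))"
```

With the values from the test this gives a byte offset of 1085440, i.e.
4 KiB block 265 with remainder 0. The arithmetic alone does not tell
whether BlueStore currently references that block.
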
To be sure, you could overwrite the very same sector again, stop the OSD
and then run:

$ ceph-bluestore-tool fsck --deep yes --path /var/lib/ceph/osd/ceph-X/

or (in a containerized environment)

$ cephadm shell --name osd.X ceph-bluestore-tool fsck --deep yes --path /var/lib/ceph/osd/ceph-X/

where osd.X is the OSD backed by drive /dev/sdd.

Regards,
Frédéric.

> I would expect CEPH to detect disk corruption and automatically replace the
> invalid data with a valid copy?
>
> I use only replica pools in this lab setup, for RBD and CephFS.
>
> Thanks
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx