Re: Testing CEPH scrubbing / self-healing capabilities

Hello Petr,

----- On Jun 4, 2024, at 12:13, Petr Bena petr@bena.rocks wrote:

> Hello,
> 
> I wanted to test (in a lab Ceph setup) what exactly happens when part
> of the data on an OSD disk gets corrupted. I created a simple test in
> which I walked through the block device's data until I found something
> that resembled user data, using dd and hexdump (/dev/sdd is the block
> device used by the OSD):
> 
> INFRA [root@ceph-vm-lab5 ~]# dd if=/dev/sdd bs=32 count=1 skip=33920 | hexdump -C
> 00000000  6e 20 69 64 3d 30 20 65  78 65 3d 22 2f 75 73 72  |n id=0 exe="/usr|
> 00000010  2f 73 62 69 6e 2f 73 73  68 64 22 20 68 6f 73 74  |/sbin/sshd" host|
> 
> Then I deliberately overwrote 32 bytes using random data:
> 
> INFRA [root@ceph-vm-lab5 ~]# dd if=/dev/urandom of=/dev/sdd bs=32 count=1 seek=33920
> 
> INFRA [root@ceph-vm-lab5 ~]# dd if=/dev/sdd bs=32 count=1 skip=33920 | hexdump -C
> 00000000  25 75 af 3e 87 b0 3b 04  78 ba 79 e3 64 fc 76 d2  |%u.>..;.x.y.d.v.|
> 00000010  9e 94 00 c2 45 a5 e1 d2  a8 86 f1 25 fc 18 07 5a  |....E......%...Z|
> 
> At this point I would expect some sort of data corruption. I restarted
> the OSD daemon on this host to make sure it flushed any potentially
> buffered data. It restarted fine without noticing anything, which was
> expected.
> 
> Then I ran
> 
> ceph osd scrub 5
> 
> ceph osd deep-scrub 5
> 
> And waited for all scheduled scrub operations on all PGs to finish.
> 
> No inconsistency was found and no errors were reported; the scrubs
> just finished OK, yet the data is still visibly corrupt via hexdump.
> 
> Did I just hit a block of data that WAS once used by the OSD but had
> been marked deleted and was therefore no longer referenced, or am I
> missing something?

Possibly, if you deep-scrubbed all PGs. I remember marking bad sectors in the past and still getting a success from ceph-bluestore-tool fsck.
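
One way to confirm that the deep scrubs actually covered every PG on that OSD is to compare each PG's deep-scrub timestamp against the time you kicked the scrubs off. Assuming a reasonably recent release (the exact column names vary between versions), something like:

$ ceph pg ls-by-osd 5    # check the DEEP_SCRUB_STAMP column

should list the PGs on osd.5 with a fresh timestamp for each of them.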

To be sure, you could overwrite the very same sector, stop the OSD (see the note further down on stopping the daemon) and then run:

$ ceph-bluestore-tool fsck --deep yes --path /var/lib/ceph/osd/ceph-X/

or (in a containerized environment):

$ cephadm shell --name osd.X ceph-bluestore-tool fsck --deep yes --path /var/lib/ceph/osd/ceph-X/

osd.X being the OSD associated with drive /dev/sdd.
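
Note that the OSD really does need to be stopped before the fsck, since BlueStore holds the block device open. Assuming a systemd-based deployment (adjust to however your OSDs are actually managed), stopping it would look something like:

$ systemctl stop ceph-osd@X      # package-based install

or, on a cephadm-managed cluster:

$ ceph orch daemon stop osd.X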

Regards,
Frédéric.


> Shouldn't Ceph detect the disk corruption and automatically replace
> the invalid data with a valid copy?
> 
> I use only replica pools in this lab setup, for RBD and CephFS.
> 
> Thanks
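
Regarding that expectation: even when a deep scrub does catch a mismatch on a replicated pool, the PG is only flagged inconsistent; the actual repair is a separate, operator-triggered step (unless osd_scrub_auto_repair is enabled, which it is not by default). With a hypothetical inconsistent PG 2.a, the flow would look something like:

$ ceph health detail                                    # names the inconsistent PGs
$ rados list-inconsistent-obj 2.a --format=json-pretty  # 2.a is a made-up example id
$ ceph pg repair 2.a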