Re: Does Bluestore backed OSD detect bit rot immediately when reading or only when scrubbed?

Hello,

The answer is yes: BlueStore checksums data on write and verifies it on every read, as a quick search would have confirmed, for example:
https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/administration_guide/osd-bluestore
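
To picture the mechanism, here is a minimal Python sketch of the checksum-on-read idea. It is only an illustration under my own assumptions (the TinyStore class, the CSUM_BLOCK constant and the zlib.crc32 stand-in are all mine), not BlueStore's actual code; BlueStore uses crc32c per checksum block and keeps the checksums in its own metadata.

# Minimal sketch of checksum-on-read detection, in the spirit of what
# BlueStore does per the docs linked above. Illustrative only: BlueStore
# uses crc32c per checksum block and stores the sums in its metadata;
# here zlib.crc32 stands in and a plain dict holds the sums.
import zlib

CSUM_BLOCK = 4096  # checksum the data in fixed-size chunks

class TinyStore:
    def __init__(self):
        self.data = {}    # object name -> bytes
        self.csums = {}   # object name -> list of per-chunk checksums

    def write(self, name, payload):
        self.data[name] = payload
        self.csums[name] = [
            zlib.crc32(payload[off:off + CSUM_BLOCK])
            for off in range(0, len(payload), CSUM_BLOCK)
        ]

    def read(self, name):
        payload = self.data[name]
        for i, off in enumerate(range(0, len(payload), CSUM_BLOCK)):
            if zlib.crc32(payload[off:off + CSUM_BLOCK]) != self.csums[name][i]:
                # A real OSD would surface this as a read error (EIO);
                # here we just raise.
                raise IOError("checksum mismatch in %s, chunk %d" % (name, i))
        return payload

store = TinyStore()
store.write("obj", b"A" * 10000)
store.data["obj"] = b"A" * 5000 + b"B" + b"A" * 4999  # simulate a flipped byte
try:
    store.read("obj")
except IOError as e:
    print("bit rot detected on read:", e)

Running it reports the mismatch on the first read that touches the flipped byte, which is the on-read detection being asked about.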

It's also been discussed here for years. ^.^

Christian

On Mon, 1 Apr 2019 13:10:04 +0700 Igor Podlesny wrote:

> It's widely known that some filesystems (well, ok -- 2 of them: ZFS and
> Btrfs) detect bit rot on any read request, although, of course, an
> admin can also initiate "whole platter" scrubbing.
> 
> Before Bluestore, Ceph could provide only "on demand" detection. I'm
> not counting hypothetical setups with Btrfs- or ZFS-backed OSDs:
> although Btrfs was supported, it couldn't be trusted due to being too
> quirky, and ZFS would mean much higher overhead and resource
> consumption; moreover, it conceptually doesn't fit well into Ceph's
> paradigm.
> So when a scrub found a mismatch, that would trigger the infamous
> HEALTH_ERR state and require manual tinkering to resolve (although, in
> the typical case where a placement group has 3 copies, it would seem
> more logical to fix it automatically -- at least most users would make
> the same choice in 99 % of occurrences).
> 
> With Bluestore, I'd expect bit rot detection to happen on any read
> request, as is the case with Btrfs and ZFS. But expectations can be
> wrong no matter how logical they might seem, which is why I'd like to
> clear this up. Can anyone say for sure how it works in Ceph with
> Bluestore?
> 
> If it's NOT the same as with those two CoW filesystems and bit rot is
> detected only by scrubbing, how prone to data corruption / loss would
> the following be?
> 
> * 2-copy replicated pools (size 2, min_size 1)
> * erasure coded pools (say k=2, m=1)
> 
> Let's consider a replicated pool with 2/1 where both data instances are
> up to date, and then one is found to be corrupted. Would its csum
> mismatch be enough for it to be "cured" semi-automatically with ceph pg repair?
> 
> And what would happen, and how, if an erasure coded pool's data was
> found to be damaged as well?
> 
> -- 
> End of message. Next message?


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Rakuten Communications
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


