On Tue, Nov 25, 2014 at 07:10:26AM -0800, Sage Weil wrote:
> On Tue, 25 Nov 2014, Tomasz Kuzemko wrote:
> > Hello,
> > as far as I can tell, Ceph does not make any guarantee that reads from
> > an object return what was actually written to it. In other words, it
> > does not check data integrity (except during deep-scrub, which runs only
> > once every few days). Considering that BTRFS is not production-ready and
> > not many people run Ceph on top of ZFS, the only option for some sort of
> > integrity guarantee is to enable the "filestore sloppy crc" option.
> > Unfortunately the docs are not very clear on this matter and "filestore
> > sloppy crc" is not even documented, which is odd considering it has been
> > merged since Emperor.
> >
> > Getting back to my actual question - what is the state of "filestore
> > sloppy crc"? Does someone actually use it in production? Are there any
> > considerations one should make before enabling it? Is it safe to enable
> > it on an existing cluster?
>
> We enable it in our automated QA, but do not know of anyone using it in
> production and have not recommended it for that. It is not intended to be
> particularly fast and we didn't thoroughly analyze the xattr size
> implications on the file systems people may run on. Also note that it
> simply fails (crashes) the OSD when it detects an error and has no
> integration with scrub, which makes it not particularly friendly.

We have run some initial tests of sloppy crc on our dev cluster and the
performance hit was in fact negligible (on SSD). We also noticed the
crashing behavior on a bad CRC, but I would still prefer the OSD to crash
rather than serve corrupted data to the client. So far we only had to
modify the upstart script to stop respawning the OSD after a few crashes,
so we can detect the CRC error and let clients fail over to another OSD.

About the xattr size limitations: as I understand it, no such limitations
apply when using omap? Besides, with the default settings of a 64k CRC
block and a 4M object size, only 64 additional metadata entries per object
would be required for the CRCs.

> Note that I am working on a related patch set that will keep a persistent
> checksum of the entire object that will interact directly with deep
> scrubs. It will not be as fine-grained but is intended for production
> use and will cover the bulk of data that sits unmodified at rest for
> extended periods.

When is this feature planned for release? Will it be included in a point
release for Giant, or should we expect it in Hammer?

> sage

--
Tomasz Kuzemko
tomasz.kuzemko@xxxxxxx
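
For reference, what we enabled on the dev cluster amounts to roughly the
following ceph.conf snippet. The block size line only spells out the 64k
default mentioned above; please double-check the exact option names against
your release before relying on this:

  [osd]
  # enable per-block CRC tracking in the filestore
  filestore sloppy crc = true
  # 64 KB CRC blocks (the default): a 4 MB object needs 4 MB / 64 KB = 64
  # CRC entries in its metadata
  filestore sloppy crc block size = 65536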
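
The upstart change we made boils down to adding a respawn limit to the OSD
job. The path and the numbers below are only illustrative of what we did;
pick values that suit your own monitoring:

  # /etc/init/ceph-osd.conf (excerpt)
  respawn
  # stop respawning after 3 crashes within 300 seconds, so a persistent
  # CRC failure takes the OSD down and lets clients fail over
  respawn limit 3 300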