sloppy crc uses fs xattrs directly, omap won't help.
-Sam

On Tue, Nov 25, 2014 at 7:39 AM, Tomasz Kuzemko <tomasz.kuzemko@xxxxxxx> wrote:
> On Tue, Nov 25, 2014 at 07:10:26AM -0800, Sage Weil wrote:
>> On Tue, 25 Nov 2014, Tomasz Kuzemko wrote:
>> > Hello,
>> > As far as I can tell, Ceph does not make any guarantee that reads from an
>> > object return what was actually written to it. In other words, it does not
>> > check data integrity (apart from a deep-scrub every few days).
>> > Given that BTRFS is not production-ready and not many people run
>> > Ceph on top of ZFS, the only option for some guarantee of
>> > integrity is to enable the "filestore sloppy crc" option. Unfortunately the
>> > docs are not clear on this matter, and "filestore sloppy crc" is not even
>> > documented, which is odd considering it has been merged since Emperor.
>> >
>> > Getting back to my actual question: what is the state of "filestore sloppy
>> > crc"? Does anyone actually use it in production? Are there any
>> > considerations to make before enabling it? Is it safe to enable it
>> > on an existing cluster?
>>
>> We enable it in our automated QA, but do not know of anyone using it in
>> production and have not recommended it for that. It is not intended to be
>> particularly fast, and we have not thoroughly analyzed the xattr size
>> implications on the file systems people may run on. Also note that it
>> simply fails (crashes) the OSD when it detects an error and has no
>> integration with scrub, which makes it not particularly friendly.
>
> We have run some initial tests of sloppy crc on our dev cluster and the
> performance hit was in fact negligible (on SSD). We also noticed the
> crashing behavior on a bad CRC, but I would still prefer the OSD to crash
> rather than serve corrupted data to the client. So far we only had to
> modify the upstart script to stop respawning the OSD after a few crashes,
> so we can detect the CRC error and let clients fail over to another OSD.
>
> About the xattr size limitations: as I understand it, no such limitations
> apply when using omap? Besides, with the default settings of a 64k CRC
> block and a 4M object size, only 64 additional metadata entries per object
> would be required for CRCs.
>
>> Note that I am working on a related patch set that will keep a persistent
>> checksum of the entire object and will interact directly with deep
>> scrubs. It will not be as fine-grained, but it is intended for production
>> use and will cover the bulk of data that sits unmodified at rest for
>> extended periods.
>
> When is this feature planned for release? Will it be included in a point
> release of Giant, or should we expect it in Hammer?
>
>> sage
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> --
> Tomasz Kuzemko
> tomasz.kuzemko@xxxxxxx
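The blocks-per-object arithmetic in the thread (a 64k CRC block over a 4M object yields 64 checksum entries) can be sketched as follows. This is a minimal, illustrative model of block-level CRC tracking in the spirit of "filestore sloppy crc", not the actual Ceph implementation; all names here are hypothetical.

```python
import zlib

# Defaults mentioned in the thread: 64 KiB CRC block, 4 MiB object.
CRC_BLOCK_SIZE = 64 * 1024
OBJECT_SIZE = 4 * 1024 * 1024

def block_crcs(data, block_size=CRC_BLOCK_SIZE):
    """Compute one CRC-32 per block_size chunk of data."""
    return [zlib.crc32(data[off:off + block_size])
            for off in range(0, len(data), block_size)]

def verify(data, crcs, block_size=CRC_BLOCK_SIZE):
    """Return True if every block's CRC matches the stored list."""
    return block_crcs(data, block_size) == crcs

obj = bytes(OBJECT_SIZE)            # a 4 MiB object
crcs = block_crcs(obj)
assert len(crcs) == 64              # 4 MiB / 64 KiB = 64 metadata entries

# Flipping a single byte is caught by the affected block's CRC.
corrupted = obj[:100] + b"\xff" + obj[101:]
assert verify(obj, crcs)
assert not verify(corrupted, crcs)
```

This also shows why the metadata overhead is modest: the per-object cost is one small entry per block, regardless of where those entries live (xattrs in the real feature, per Sam's note above).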