On Tue, Nov 25, 2014 at 07:10:26AM -0800, Sage Weil wrote:
> On Tue, 25 Nov 2014, Tomasz Kuzemko wrote:
> > Hello,
> > as far as I can tell, Ceph does not make any guarantee that reads from
> > an object return what was actually written to it. In other words, it
> > does not check data integrity (except during deep-scrub, which runs only
> > once every few days). Considering that BTRFS is not production-ready and
> > not many people run Ceph on top of ZFS, the only option for some sort of
> > integrity guarantee is to enable the "filestore sloppy crc" option.
> > Unfortunately the docs are not very clear on this matter and "filestore
> > sloppy crc" is not even documented, which is odd considering it has been
> > merged since Emperor.
> >
> > Getting back to my actual question - what is the state of "filestore
> > sloppy crc"? Does someone actually use it in production? Are there any
> > considerations one should make before enabling it? Is it safe to enable
> > it on an existing cluster?
>
> We enable it in our automated QA, but do not know of anyone using it in
> production and have not recommended it for that. It is not intended to be
> particularly fast and we didn't thoroughly analyze the xattr size
> implications on the file systems people may run on. Also note that it
> simply fails (crashes) the OSD when it detects an error and has no
> integration with scrub, which makes it not particularly friendly.

We have run some initial tests of sloppy crc on our dev cluster and the
performance hit was in fact negligible (on SSD). We also noticed the
crashing behavior on a bad CRC, but I would still prefer the OSD to crash
rather than serve corrupted data to the client. So far we only had to
modify the upstart script to stop respawning the OSD after a few crashes,
so we can detect the CRC error and let clients fail over to another OSD.

About the xattr size limitations: as I understand it, no such limitations
apply when using omap? Besides, with the default settings of a 64k CRC
block and a 4M object size, only 64 additional metadata entries per object
would be required for the CRCs.

> Note that I am working on a related patch set that will keep a persistent
> checksum of the entire object that will interact directly with deep
> scrubs. It will not be as fine-grained but is intended for production
> use and will cover the bulk of data that sits unmodified at rest for
> extended periods.

When is this feature planned for release? Will it be included in a point
release for Giant, or should we expect it in Hammer?

> sage

--
Tomasz Kuzemko
tomasz.kuzemko@xxxxxxx
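
For reference, what we enabled on the dev cluster amounts to roughly the
following ceph.conf snippet. The block size line only spells out the 64k
default mentioned above; please double-check the exact option names against
your release before relying on this:

  [osd]
  # enable per-block CRC tracking in the filestore
  filestore sloppy crc = true
  # 64 KB CRC blocks (the default): a 4 MB object needs 4 MB / 64 KB = 64
  # CRC entries in its metadata
  filestore sloppy crc block size = 65536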
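
The upstart change we made boils down to adding a respawn limit to the OSD
job. The path and the numbers below are only illustrative of what we did;
pick values that suit your own monitoring:

  # /etc/init/ceph-osd.conf (excerpt)
  respawn
  # stop respawning after 3 crashes within 300 seconds, so a persistent
  # CRC failure takes the OSD down and lets clients fail over
  respawn limit 3 300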