> -----Original Message-----
> From: Igor Fedotov [mailto:ifedotov@xxxxxxxxxxxx]
> Sent: Thursday, March 31, 2016 10:18 AM
> To: Allen Samuels <Allen.Samuels@xxxxxxxxxxx>; Sage Weil
> <sage@xxxxxxxxxxxx>
> Cc: ceph-devel <ceph-devel@xxxxxxxxxxxxxxx>
> Subject: Re: Adding compression/checksum support for bluestore.
>
> On 31.03.2016 19:32, Allen Samuels wrote:
> >> But do we really need to store checksums as metadata? What's about
> >> pre(post)fixing 4K-4(?) blob with the checksum and store this pair to
> >> the disk. IMO we always need checksum values along with blob data
> >> thus let's store and read them together. This immediately eliminates
> >> the question about the granularity and corresponding overhead... Have
> >> I missed something?
> > If you store them inline with the data then nothing lines up on boundaries
> > that the HW designers expect and you end up doing things like extra-copying
> > of every data buffer. This will kill performance.
>
> Perhaps you are right.
>
> But not sure I fully understand what HW designers you mean here. Are you
> considering the case when Ceph is embedded into some hardware and
> incoming RW requests always operate aligned data and supposed to have
> the same alignment for data saved to disk?

Dig into the direct I/O stuff. You'll see all sorts of places where the
data is required to be either 512-byte or page-aligned. This stems from
the HW implementations of the HBA, SCSI, SATA HW.

> IMHO proper data alignment in the incoming requests is a particular
> case. Generally we don't have such a trait. Moreover compression
> completely destroys it if any. Thus in many cases we can easily append
> an additional data portion containing a checksum.
>
> > If you store them in a separate place (not in metadata, not contiguous to
> > data) then you'll have a full extra I/O that might even move the head
> > (yikes!). Plus you'll have to deal with the RMW of these tiny things.
>
> Agree - that's not an option.
>
> > Putting them in the metadata is really the only viable option.
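
To make the alignment argument in this thread concrete, here is a minimal sketch of the arithmetic. It is not BlueStore code. It assumes a 4096-byte device block and a 4-byte CRC32 per block, and every function name in it is hypothetical. It compares the "4K-4 payload plus inline checksum" layout with keeping the checksums in a separate metadata list.

```python
import zlib

BLOCK = 4096  # assumed device block size for this sketch
CSUM = 4      # assumed checksum size (one CRC32 per block)


def is_aligned(offset, align=BLOCK):
    """True if a byte offset falls on an `align` boundary."""
    return offset % align == 0


def inline_data_offset(logical_off, payload=BLOCK - CSUM):
    """Physical offset of a logical byte when every on-disk block holds
    `payload` data bytes followed by an inline checksum."""
    block_index, within = divmod(logical_off, payload)
    return block_index * BLOCK + within


def inline_physical_range(logical_off, length, payload=BLOCK - CSUM):
    """Block-aligned physical range that must be read to cover a logical
    range in the inline-checksum layout."""
    first = logical_off // payload
    last = (logical_off + length - 1) // payload
    return first * BLOCK, (last + 1) * BLOCK


def checksum_blocks(data, block=BLOCK):
    """Metadata layout: data stays contiguous, one CRC32 per block is
    kept in a separate list."""
    return [zlib.crc32(data[i:i + block]) for i in range(0, len(data), block)]


def verify(data, csums, block=BLOCK):
    """Recompute the per-block CRC32 values and compare with the stored ones."""
    return checksum_blocks(data, block) == csums


if __name__ == "__main__":
    # Inline layout: an aligned logical 4 KiB read lands at an unaligned
    # physical offset and straddles two device blocks.
    print(inline_data_offset(4096), is_aligned(inline_data_offset(4096)))
    print(inline_physical_range(4096, 4096))

    # Metadata layout: data offsets are unchanged, checksums live elsewhere.
    data = bytes(2 * BLOCK)
    csums = checksum_blocks(data)
    print(len(csums), verify(data, csums))
```

In the inline layout the payload of a single aligned 4 KiB logical read is spread over two device blocks with a checksum in the middle, so it has to be reassembled into a separate buffer before it can be handed back. That reassembly is the extra copy mentioned above. In the metadata layout the data buffer can be passed to direct I/O as is.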