Re: Adding compression/checksum support for bluestore.

Igor Fedotov <ifedotov@xxxxxxxxxxxx> · Thu, 31 Mar 2016 20:18:24 +0300

On 31.03.2016 19:32, Allen Samuels wrote:
But do we really need to store checksums as metadata? What about 
prefixing (or postfixing) a 4K-4(?) blob with the checksum and storing 
the pair on disk? IMO we always need the checksum values along with the 
blob data, so let's store and read them together. This immediately 
eliminates the question of granularity and the corresponding 
overhead... Have I missed something? 

If you store them inline with the data then nothing lines up on the boundaries that the HW designers expect, and you end up doing things like an extra copy of every data buffer. This will kill performance.
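
To make the alignment objection concrete, here is a minimal arithmetic sketch. The 4096-byte block size, the 4-byte checksum, and the helper names are assumptions chosen for illustration; none of this is taken from the BlueStore code. It shows that appending a checksum to full 4K payloads pushes every later payload off a block boundary, and that shrinking the payload to 4K-4 keeps the disk stride aligned but makes a 4K-aligned logical read straddle two stored chunks.

```python
BLOCK = 4096  # assumed device/page block size in bytes
CSUM = 4      # assumed checksum size in bytes (e.g. a 32-bit CRC)


def inline_payload_offsets(n_chunks, payload=BLOCK, csum=CSUM):
    """Disk offset of each payload when every payload is followed by its checksum."""
    return [i * (payload + csum) for i in range(n_chunks)]


def misaligned(offsets, block=BLOCK):
    """Return the offsets that do not start on a block boundary."""
    return [off for off in offsets if off % block != 0]


def chunks_touched(logical_off, length, payload):
    """Indices of the stored chunks that a logical read has to touch."""
    first = logical_off // payload
    last = (logical_off + length - 1) // payload
    return list(range(first, last + 1))


offsets = inline_payload_offsets(4)
print(offsets)              # [0, 4100, 8200, 12300]
print(misaligned(offsets))  # [4100, 8200, 12300]

# 4K-4 payload plus 4-byte checksum: the disk stride is 4096 again, but an
# aligned 4K logical read now spans two chunks instead of one.
print(chunks_touched(4096, 4096, payload=BLOCK - CSUM))  # [1, 2]
print(chunks_touched(4096, 4096, payload=BLOCK))         # [1]
```

In either layout the data handed back to the caller has to be reassembled from buffers that no longer map one-to-one onto blocks, which is where the extra copy comes from.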

Perhaps you are right.

But I'm not sure I fully understand which HW designers you mean here. Are 
you considering the case where Ceph is embedded in some hardware, and 
incoming RW requests always operate on aligned data and are expected to 
keep the same alignment for the data saved to disk?

IMHO proper data alignment in incoming requests is a special case; in 
general we have no such guarantee. Moreover, compression completely 
destroys whatever alignment there was. Thus in many cases we can easily 
append an additional data portion containing a checksum.
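
A rough sketch of that idea, assuming zlib for compression and CRC32 as the checksum (both are stand-ins picked for illustration, and `pack_blob` / `unpack_blob` are hypothetical names, not BlueStore functions). Once a block is compressed its length is arbitrary anyway, so a 4-byte trailer costs nothing in alignment terms:

```python
import struct
import zlib


def pack_blob(data: bytes) -> bytes:
    """Compress data and append a 4-byte little-endian CRC32 of the compressed bytes."""
    compressed = zlib.compress(data)
    return compressed + struct.pack("<I", zlib.crc32(compressed))


def unpack_blob(stored: bytes) -> bytes:
    """Verify the trailing CRC32, then decompress; raise ValueError on mismatch."""
    if len(stored) < 4:
        raise ValueError("blob too short to hold a checksum")
    compressed, trailer = stored[:-4], stored[-4:]
    (expected,) = struct.unpack("<I", trailer)
    if zlib.crc32(compressed) != expected:
        raise ValueError("checksum mismatch")
    return zlib.decompress(compressed)


block = b"ceph" * 1024            # 4096 bytes of highly compressible input
stored = pack_blob(block)
print(len(block), len(stored))    # the stored length is not block-aligned
assert unpack_blob(stored) == block
```

The checksum here covers the compressed bytes, so corruption is detected before decompression is attempted; covering the uncompressed data instead would be an equally valid choice.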

If you store them in a separate place (not in metadata, not contiguous with the data) then you'll have a full extra I/O that might even move the head (yikes!). Plus you'll have to deal with the RMW of these tiny things.

Agreed - that's not an option.

Putting them in the metadata is really the only viable option.
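
For comparison, a minimal sketch of the metadata approach, assuming CRC32 at a fixed 4096-byte granularity and a plain dict standing in for the metadata store. The record layout and the names `write_blob` and `verify_chunk` are hypothetical, not BlueStore's actual structures. The data is written unchanged, so it stays aligned, and the checksums travel with the metadata that has to be read anyway:

```python
import zlib

CHUNK = 4096  # assumed checksum granularity in bytes


def chunk_checksums(data: bytes, chunk: int = CHUNK) -> list:
    """CRC32 of each fixed-size chunk of data."""
    return [zlib.crc32(data[i:i + chunk]) for i in range(0, len(data), chunk)]


def write_blob(metadata: dict, blob_id: str, data: bytes) -> bytes:
    """Record per-chunk checksums in the metadata; the data itself is stored as-is."""
    metadata[blob_id] = {"length": len(data), "csums": chunk_checksums(data)}
    return data


def verify_chunk(metadata: dict, blob_id: str, on_disk: bytes,
                 index: int, chunk: int = CHUNK) -> bool:
    """Check one chunk of the on-disk data against the checksum kept in metadata."""
    start = index * chunk
    actual = zlib.crc32(on_disk[start:start + chunk])
    return actual == metadata[blob_id]["csums"][index]


metadata = {}
data = bytes(range(256)) * 48                 # 12288 bytes, i.e. three chunks
on_disk = write_blob(metadata, "blob-1", data)
print(all(verify_chunk(metadata, "blob-1", on_disk, i) for i in range(3)))  # True

damaged = bytearray(on_disk)
damaged[5000] ^= 0x01                         # flip one bit inside chunk 1
print(verify_chunk(metadata, "blob-1", bytes(damaged), 1))  # False
```

The granularity question does not go away in this layout: a smaller `CHUNK` means more checksums per blob in the metadata, a larger one means more data to read in order to verify a small request.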
