On 31.03.2016 19:32, Allen Samuels wrote:
But do we really need to store checksums as metadata? What about
prefixing (or postfixing) a 4K-4(?) blob with the checksum and storing
this pair to disk? IMO we always need the checksum values along with
the blob data, so let's store and read them together. This immediately
eliminates the question about granularity and the corresponding
overhead... Have I missed something?
If you store them inline with the data then nothing lines up on boundaries that the HW designers expect and you end up doing things like extra-copying of every data buffer. This will kill performance.
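To make the alignment concern concrete, here is a minimal sketch, assuming a hypothetical on-disk format where every 4096-byte device block holds 4092 bytes of payload followed by a 4-byte CRC32. The names (`pack`, `blocks_touched`, `read`) and the sizes are illustrative assumptions, not anything from the Ceph code base. It shows that a 4K-aligned logical read no longer maps to one device block and that the payload has to be stitched into a fresh buffer.

```python
import zlib

BLOCK = 4096             # device block size (assumed)
CSUM = 4                 # bytes of CRC32 stored inline (assumed)
PAYLOAD = BLOCK - CSUM   # 4092 bytes of user data per device block


def pack(data: bytes) -> bytes:
    """Lay data out as [payload | crc32] device blocks (hypothetical format)."""
    out = bytearray()
    for off in range(0, len(data), PAYLOAD):
        chunk = data[off:off + PAYLOAD].ljust(PAYLOAD, b"\0")
        out += chunk + zlib.crc32(chunk).to_bytes(CSUM, "little")
    return bytes(out)


def blocks_touched(logical_off: int, length: int) -> range:
    """Device blocks that must be read to serve a logical read."""
    first = logical_off // PAYLOAD
    last = (logical_off + length - 1) // PAYLOAD
    return range(first, last + 1)


def read(disk: bytes, logical_off: int, length: int) -> bytes:
    """Verify every touched block, then copy the payload pieces together."""
    out = bytearray()
    for b in blocks_touched(logical_off, length):
        raw = disk[b * BLOCK:(b + 1) * BLOCK]
        chunk = raw[:PAYLOAD]
        stored = int.from_bytes(raw[PAYLOAD:], "little")
        if zlib.crc32(chunk) != stored:
            raise IOError(f"checksum mismatch in block {b}")
        start = max(logical_off - b * PAYLOAD, 0)
        end = min(logical_off + length - b * PAYLOAD, PAYLOAD)
        out += chunk[start:end]  # the extra copy: payload is not contiguous on disk
    return bytes(out)
```

With this layout, `blocks_touched(4096, 4096)` covers two device blocks, so even a perfectly aligned 4K request turns into two block reads plus a copy.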
Perhaps you are right.
But I'm not sure I fully understand which HW designers you mean here.
Are you considering the case where Ceph is embedded into some hardware,
and incoming RW requests always operate on aligned data that is supposed
to keep the same alignment when saved to disk?
IMHO proper data alignment in the incoming requests is a special case.
In general we have no such guarantee. Moreover, compression completely
destroys whatever alignment there was. Thus in many cases we can easily
append an additional data portion containing a checksum.
If you store them in a separate place (not in metadata, not contiguous to data) then you'll have a full extra I/O that might even move the head (yikes!). Plus you'll have to deal with the RMW of these tiny things.
Agree - that's not an option.
Putting them in the metadata is really the only viable option.
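For comparison, a minimal sketch of the metadata approach, assuming one CRC32 per fixed-size chunk kept in a per-blob metadata record. `BlobMeta`, `CSUM_CHUNK` and the helper functions are hypothetical names for illustration, not the actual BlueStore structures. The data stays byte-for-byte as written, so it keeps its alignment, while the granularity question from earlier in the thread shows up as the choice of `CSUM_CHUNK`.

```python
import zlib
from dataclasses import dataclass, field

CSUM_CHUNK = 4096  # checksum granularity in bytes (assumed, tunable)


@dataclass
class BlobMeta:
    """Hypothetical per-blob metadata: length plus one CRC32 per chunk."""
    length: int
    csums: list = field(default_factory=list)


def write_blob(data: bytes):
    """Return the unmodified data and the metadata that travels with it."""
    meta = BlobMeta(length=len(data))
    for off in range(0, len(data), CSUM_CHUNK):
        meta.csums.append(zlib.crc32(data[off:off + CSUM_CHUNK]))
    return data, meta


def verify(disk_data: bytes, meta: BlobMeta, chunk_index: int) -> bool:
    """Check one chunk against the checksum held in metadata."""
    off = chunk_index * CSUM_CHUNK
    return zlib.crc32(disk_data[off:off + CSUM_CHUNK]) == meta.csums[chunk_index]


def metadata_overhead(meta: BlobMeta) -> float:
    """Checksum bytes per data byte, at 4 bytes per CRC32."""
    return 4 * len(meta.csums) / meta.length
```

A larger `CSUM_CHUNK` shrinks the metadata but forces a read of the whole chunk to verify any part of it, which is the granularity/overhead trade-off in this sketch.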