Re: CRC32 of messages

On Mon, Jun 29, 2015 at 7:55 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> On Mon, Jun 29, 2015 at 8:31 AM, Dałek, Piotr
> <Piotr.Dalek@xxxxxxxxxxxxxx> wrote:
>>> -----Original Message-----
>>> From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-
>>> owner@xxxxxxxxxxxxxxx] On Behalf Of Erik G. Burrows
>>> Sent: Friday, June 26, 2015 6:49 PM
>>>
>>> All,
>>> Can someone explain to me the rationale for performing in-software CRC32
>>> hashes of all messages through the Pipe and AsyncMessage classes?
>>>
>>> On my servers, operf shows that 20% of the total CPU time in my benchmark
>>> tests is being spent in the librados ceph_crc32c_sctp function. I can see that
>>> the library is trying to use CPU accelerations if available, but what I'd like to
>>> understand is: why checksum the messages at all?
>>
>> As Somnath already wrote, you can disable CRC checking for messages. But they're also used for journals, among other things, so you'll always see some CPU usage spent on CRC32 calculations.
>>
>>> If the messages are local, there should not be any corruption at all, and if
>>> they are coming in over IP, then the kernel and NIC should do Layer-2/3 CRCs
>>> and reject any corrupted packets. So why re-CRC the messages at the Ceph
>>> layer?
>>
>> I can imagine data corruption coming from Ceph itself and not caught by IP layers, for example due to a bug in Ceph code or a mainboard/RAM failure. And it's a nice debug feature you can use when dealing with low-level code.
>>
>
> That's not to mention that the TCP checksum is remarkably weak. We've
> just had an incident where a broken router was quite efficiently
> corrupting something like 1/66 packets in a way which was invisible to
> the TCP checksum. Some example corruptions are in our report -- note
> that it's still a work in progress:
> https://cds.cern.ch/record/2026187/files/Adler32_Data_Corruption.pdf
>
> Thankfully CRC32-C /probably/ prevented this broken router from
> corrupting our Ceph volumes.

Yes, we have our own CRC32 checksum because loooong ago (before I
started!) Sage saw a lot of network corruption that wasn't being
caught by the TCP checksums, so he added our own to the Ceph message
stream. I can't tell you with any authority whatsoever how common that
problem is, but I don't think we're turning them off by default in
upstream. :)
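
To make the "TCP checksum is weak" point concrete, here is a minimal
sketch in Python. It is not Ceph code: internet_checksum() and crc32c()
are illustrative helpers written for this example, and the payload is
made up. The first is the 16-bit ones' complement sum that TCP uses;
the second is a plain bitwise CRC-32C (Castagnoli) with the standard
initial value and final XOR, which may not match the seed convention
Ceph uses internally. Swapping two 16-bit words leaves the TCP-style
sum unchanged, while the CRC changes.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement sum, as used by TCP/UDP/IP (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (reflected polynomial 0x82F63B78); slow but simple."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right; XOR in the polynomial if the low bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 & -(crc & 1))
    return crc ^ 0xFFFFFFFF


# Hypothetical payload; the "corruption" swaps the first two 16-bit words.
original = b"ceph object data"
corrupted = original[2:4] + original[0:2] + original[4:]

print("payload changed:      ", original != corrupted)
print("TCP-style sum matches:",
      internet_checksum(original) == internet_checksum(corrupted))
print("CRC-32C matches:      ", crc32c(original) == crc32c(corrupted))
```

Reordered words are only one class of error that an additive checksum
cannot see; the example says nothing about how often such corruption
occurs on real networks.
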
-Greg