Re: The TCP and UDP checksum algorithm may soon need updating

I got requests for citations/sources.  So here's what's known.

First, background reading from 20 years ago, which shows the distribution of errors that we saw in TCP connections then.  Note that while bit errors happen, far more errors are a byte or larger.  This paper was also the origin of the idea of putting md5 sums on files...

@inproceedings{10.1145/347059.347561,
 author = {Stone, Jonathan and Partridge, Craig},
 title = {When the CRC and TCP Checksum Disagree},
 year = {2000},
 isbn = {1581132239},
 publisher = {Association for Computing Machinery},
 address = {New York, NY, USA},
 url = {https://doi.org/10.1145/347059.347561},
 doi = {10.1145/347059.347561},
 booktitle = {Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication},
 pages = {309--319},
 numpages = {11},
 location = {Stockholm, Sweden},
 series = {SIGCOMM '00}
}

OK, on to what people are seeing today.  This shows that 1 in every 121 FTP file transfers delivers a file whose md5 sum turns out not to match the original's (note there are multiple possible reasons, but the TCP checksum is a strong candidate).

@inproceedings{Liu:2018:CSD:3208040.3208053,

 author = {Liu, Zhengchun and Kettimuthu, Rajkumar and Foster, Ian and Rao, Nageswara S. V.},

 title = {Cross-geography Scientific Data Transferring Trends and Behavior},

 booktitle = {Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing},

 series = {HPDC '18},

 year = {2018},

 isbn = {978-1-4503-5785-2},

 location = {Tempe, Arizona},

 pages = {267--278},

 numpages = {12},

 url = {http://doi.acm.org/10.1145/3208040.3208053},

 doi = {10.1145/3208040.3208053},

 acmid = {3208053},

 publisher = {ACM},

 address = {New York, NY, USA},

 keywords = {GridFTP, file transfer, usage management, wide area network},

} 
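
For anyone who wants to run the same kind of end-to-end check on their own transfers, here is a minimal sketch of comparing md5 sums of the source file and the delivered copy.  The function names and the chunk size are my own assumptions for illustration, not anything taken from the paper above.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(source_path, received_path):
    """Hypothetical helper: True when both files have the same MD5 digest."""
    return md5_of_file(source_path) == md5_of_file(received_path)
```

A mismatch only says the delivered bytes differ from the source; it does not say where in the path the damage happened.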


On a related point, about 60% of big file transfers in a major energy network are failing (checksums are one of the suspects):


@inproceedings{shannigrahi2017request,

  title={Request aggregation, caching, and forwarding strategies for improving large climate data distribution with NDN: a case study},

  author={Shannigrahi, Susmit and Fan, Chengyu and Papadopoulos, Christos},

  booktitle={Proceedings of the 4th ACM Conference on Information-Centric Networking},

  pages={54--65},

  year={2017},

  organization={ACM}

}


This reference also sees high error rates:

Kettimuthu, Rajkumar, et al. "Transferring a Petabyte in a Day." Future Generation Computer Systems 88 (2018): 191-198.

Anecdotally, folks are reporting some middlebox vendors are not updating the TCP checksum but rather letting the outbound interface simply recompute the entire checksum -- which means that if the TCP segment gets damaged during middlebox handling, the middlebox will slap a valid checksum on bad data.
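
To make the middlebox concern concrete, here is a rough sketch.  It assumes a much-simplified segment (no pseudo-header, checksum carried alongside the data rather than in a header field), a hypothetical `forward` function standing in for the middlebox, and a simulated one-bit error inside the box.  The checksum follows RFC 1071 and the incremental update follows RFC 1624 (equation 3); everything else is made up for illustration.

```python
def _fold(total):
    """Fold carries back in until the sum fits in 16 bits (one's complement)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data):
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    return ~_fold(total) & 0xFFFF


def incremental_update(old_checksum, old_word, new_word):
    """RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m') for one changed 16-bit word."""
    total = (~old_checksum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    return ~_fold(total) & 0xFFFF


def forward(segment, checksum, new_port, recompute):
    """Hypothetical middlebox: rewrite the first 16-bit word, suffer a one-bit
    error while handling the segment, and emit (segment, checksum)."""
    old_port = int.from_bytes(segment[0:2], "big")
    out = bytearray(segment)
    out[0:2] = new_port.to_bytes(2, "big")
    out[10] ^= 0x04  # simulated damage inside the box (an assumption)
    if recompute:
        # Checksum is computed over whatever is in the buffer, damage included.
        new_checksum = internet_checksum(bytes(out))
    else:
        # Only the intended change is folded into the sender's checksum.
        new_checksum = incremental_update(checksum, old_port, new_port)
    return bytes(out), new_checksum


def receiver_accepts(segment, checksum):
    return internet_checksum(segment) == checksum


original = b"\x1f\x90\x00\x50" + b"payload bytes seen by the middlebox"
sent_checksum = internet_checksum(original)

seg_a, ck_a = forward(original, sent_checksum, 0xC350, recompute=False)
seg_b, ck_b = forward(original, sent_checksum, 0xC350, recompute=True)

print(receiver_accepts(seg_a, ck_a))  # False: incremental update exposes the damage
print(receiver_accepts(seg_b, ck_b))  # True: full recompute hides the damage
```

With the incremental update, the sender's checksum still covers the payload end to end, so the receiver rejects the damaged segment.  With the full recompute, the checksum only covers the hop after the middlebox.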

As I noted -- the literature is slim, which is why a team I'm on is going to seek more comprehensive error collection which actually captures the errors in the data, so we can see what kinds of errors are causing trouble.

Craig
--
*****
Craig Partridge's email account for professional society activities and mailing lists.
