FYI: I fixed it by using CHECKSUM_PARTIAL and filling in csum_start and
csum_offset in the skb.

I'm just pinging back to the list for future googlers; sorry to bother you.

-Kyle

On Tue, Feb 23, 2010 at 11:04 AM, Kyle Hubert <khubert@xxxxxxxxx> wrote:
> Hi, I have a question about a network device I wrote on an RDMA
> network. The RDMA network has its own CRCs at its link level, and
> ensures that my incoming RX packets are correct. So, on the RX side, I
> use CHECKSUM_UNNECESSARY when I build the skbs. I also believed that I
> could use NETIF_F_NO_CSUM on the TX side, and this would eliminate the
> CPU costs associated with csumming packets. To be clear, the HW does
> nothing to actually parse or handle csums in the upper protocols. This
> all works great on the RDMA network.
>
> OK, that was the background for the driver I wrote. Here is where I
> get into trouble. Recently, they started routing packets off the
> internal network into the exterior network. All ICMP and TCP traffic
> works just fine when it goes through the gateway, but UDP traffic comes
> across the wire with bad checksums, and the forwarding node
> recalculates bad csums on top of the bad csum. Confusing... Let's look
> at some traceroute packets:
>
> 10.128.1.64: RDMA network originator
> 172.30.73.74: gateway
> 172.30.74.68: exterior node, recipient
>
> 23:28:30.158171 IP (tos 0x0, ttl 3, id 53918, offset 0, flags [none],
> proto UDP (17), length 68) 10.128.1.64.64008 > 172.30.74.68.33442:
> [bad udp cksum c5fb!] UDP, length 40
>     0x0000:  0001 0100 0141 0001 0100 0140 0800 4500
>     0x0010:  0044 d29e 0000 0311 e2e8 0a80 0140 ac1e
>     0x0020:  4a44 fa08 82a2 0030 0264 4041 4243 4445
>     0x0030:  4647 4849 4a4b 4c4d 4e4f 5051 5253 5455
>     0x0040:  5657 5859 5a5b 5c5d 5e5f 6061 6263 6465
>     0x0050:  6667
>
> 23:28:30.158211 IP (tos 0x0, ttl 2, id 53918, offset 0, flags [none],
> proto UDP (17), length 68) 172.30.73.74.64008 > 172.30.74.68.33442:
> [bad udp cksum c5fb!] UDP, length 40
>     0x0000:  0000 0c07 ac49 facd 0101 db22 0800 4500
>     0x0010:  0044 d29e 0000 0211 fa3f ac1e 494a ac1e
>     0x0020:  4a44 fa08 82a2 0030 18bb 4041 4243 4445
>     0x0030:  4647 4849 4a4b 4c4d 4e4f 5051 5253 5455
>     0x0040:  5657 5859 5a5b 5c5d 5e5f 6061 6263 6465
>     0x0050:  6667
>
> 12:45:08.465047 IP (tos 0x0, ttl 1, id 53918, offset 0, flags [none],
> proto UDP (17), length 68) 172.30.73.74.64008 > 172.30.74.68.33442:
> [bad udp cksum c5fb!] UDP, length 40
>     0x0000:  001d 0926 2331 00d0 003a fbfc 0800 4500
>     0x0010:  0044 d29e 0000 0111 fb3f ac1e 494a ac1e
>     0x0020:  4a44 fa08 82a2 0030 18bb 4041 4243 4445
>     0x0030:  4647 4849 4a4b 4c4d 4e4f 5051 5253 5455
>     0x0040:  5657 5859 5a5b 5c5d 5e5f 6061 6263 6465
>     0x0050:  6667
>
> As we can see, the original packet was sent with a csum of 0x0264; I
> presume the UDP stack fills in some value when NO_CSUM is set. As far
> as I can tell, that's a csum of the IP payload after the UDP pseudo
> header+payload calculation is performed and written to the UDP header
> (0xfe29 would be correct, and 0x0264 is the csum of the UDP real
> header + payload where the real header has 0xfe29 filled in, or maybe
> I'm looking for patterns in the noise). Since the csum isn't zero when
> the gateway forwards the packet, it isn't recomputed correctly there,
> and eventually the recipient node throws away the packet because of
> the bad csum.
>
> How do I forward packets from an interior network without checksums
> through a masquerading gateway so that it's a regular UDP/IP packet
> when it lands? Also, why do all the checksums work for TCP/IP? Am I
> uncovering a UDP bug?
>
> Thank you very much for your time,
> -Kyle Hubert
>
--
To unsubscribe from this list: send the line "unsubscribe linux-net" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
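
For anyone who wants to check the numbers in the captures above, here is a
small userspace sketch (not kernel code) of the arithmetic involved. It
assumes plain IPv4/UDP with no IP options, and it assumes the gateway patches
the checksum incrementally in the style of RFC 1624 when it rewrites the
source address, which is consistent with the dumps but not something the
thread confirms. The function names are made up for illustration and are not
kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Add the 16-bit big-endian words of buf to a running one's-complement sum. */
static uint32_t sum_words(const uint8_t *buf, size_t len, uint32_t sum)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)                      /* odd trailing byte is zero-padded */
        sum += (uint32_t)buf[len - 1] << 8;
    return sum;
}

/* Fold the carries back into the low 16 bits. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Plain Internet checksum of a buffer (no pseudo header). */
uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    return (uint16_t)~fold(sum_words(buf, len, 0));
}

/*
 * UDP checksum over the IPv4 pseudo header, UDP header, and payload.
 * udp points at the UDP header and udp_len (>= 8) covers header + payload.
 * The checksum field (bytes 6..7) is skipped, i.e. treated as zero.
 */
uint16_t udp_checksum_ipv4(const uint8_t src[4], const uint8_t dst[4],
                           const uint8_t *udp, size_t udp_len)
{
    uint32_t sum = 0;
    uint16_t csum;

    sum = sum_words(src, 4, sum);               /* pseudo header: source */
    sum = sum_words(dst, 4, sum);               /* pseudo header: dest   */
    sum += 17;                                  /* protocol = UDP        */
    sum += (uint32_t)udp_len;                   /* UDP length            */
    sum = sum_words(udp, 6, sum);               /* ports + length field  */
    sum = sum_words(udp + 8, udp_len - 8, sum); /* payload               */

    csum = (uint16_t)~fold(sum);
    return csum ? csum : 0xffff;  /* 0 on the wire means "no checksum" */
}

/*
 * Incremental update when one IPv4 address covered by the checksum changes
 * (RFC 1624 style: HC' = ~(~HC + ~m + m')).
 */
uint16_t csum_replace_addr(uint16_t old_csum, const uint8_t old_addr[4],
                           const uint8_t new_addr[4])
{
    uint32_t sum = (uint16_t)~old_csum;
    int i;

    for (i = 0; i < 4; i += 2) {
        sum += (uint16_t)~((old_addr[i] << 8) | old_addr[i + 1]);
        sum += ((uint32_t)new_addr[i] << 8) | new_addr[i + 1];
    }
    return (uint16_t)~fold(sum);
}
```

Fed the first packet (10.128.1.64 to 172.30.74.68, 48 bytes of UDP),
udp_checksum_ipv4() gives 0xfe29. Writing 0xfe29 into the checksum field and
running inet_checksum() over the UDP header + payload alone gives 0x0264,
the value seen on the wire. Passing 0x0264 through csum_replace_addr() for
the 10.128.1.64 to 172.30.73.74 rewrite gives 0x18bb, which matches the
forwarded packets: the gateway faithfully patched an already-bad checksum.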