On Thursday 26 February 2004 12:52, MP M wrote: > Hello , > > I am working on TCP checksum offload , TSO and zero > copy with linux 2.6.1 kernel . > > IMHO I find that the code for TCP checksum offload and > TSO are already supported by the linux 2.6 kernel . I > arrived at this conclusion on seeing the presence of > flag CHECKSUM_HW and the #defines for NETIF_F_IP_CSUM > , NETIF_F_NO_CSUM and > NETIF_F_HW_CSUM . By default , it seems that > CHECKSUM_HW is enabled by default so that the TSO > supported driver will do the processing on the > ethernet card. > > Please correct me if I am wrong . > > In the driver for e1000 and tg3 , support for TSO is > already seen . > > But when I was testing the performance using ttcp > utility , I found some weird results. > I just want to share to obtain some feedback from some > experienced guys in this area who has already worked > in TSO ,TCP checksum offload . > > On the server machine I had my linux 2.6 kernel > running and it had e1000 Intel pro ethercard > support.Initially with ethtool utility I ensured that > the Tx and Rx checksum setting on e1000 is set to on . > I started ttcp utiltity in receive mode on the server > machine listening on my specified port . > > Next I pumped in data from my client machine using > ttcp utility in transmit mode to the server . > > I measured the time duration for data transfer to > happen . say x milliseconds. > > Next I set the tx and rx checksum on e1000 card using > ethtool , and repeated the above test with ttcp > utility .Since the content size is same and with tx/rx > checksum off on e1000 , I expected the time duaration > of data transfer from server to client to be x+some > delta . But surprisingly I am noticing the data > transfer at lesser time than x .(ie faster than before > with tx/rx checksum off on e1000 ) . > > I would appreciate if anyone could shed some light on > this odd behaviour . Not that odd - the local CPU is better at computing checksums than the interface. Note - there are some optimizations in the network stack that reduce the effort of actually computing checksums. In some cases, all that is done is to subtract/add those values that are changed in making replies/ACKs. This eliminates having to make a complete checksum evaluation. In others, the checksum is done during some other operation that requires a pass through the packet - making the checksum cost negligable. I'm not the person for a detailed, low level explaination, but I think this is what you are seeing. The hardware cannot take advantage of global optimizations - and what you are seeing is that difference. - : send the line "unsubscribe linux-net" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html