On Tue, 29 Apr 2008, Mark Seger wrote:

> I was running some network tests taking samples every 10 seconds on an
> nfs server connected to high-speed storage and a 10G NIC.  The data was
> read with collectl, which reads the data from /proc/net/dev.  When I
> look at the average packet sizes I come up with over 6K and jumbo frames
> are not enabled.  Naturally my first suspicion was collectl's math so I
> went back to the original numbers and here's what I collected - the
> 'Net' at the beginning of each line is something I preface each line
> with so I can tell which file it came from as I collect data from lots
> of places.  Anyhow, here are a couple of samples (don't know how they'll
> show up after terminal wrapping):
>
> Inter-|   Receive                                                |  Transmit
>  face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
> Net eth2:1033991246323 770657768 0 885 0 0 0 0 209226011736 398954938 0 0 0 0 0 0
> Net eth2:1034142397679 773022462 0 885 0 0 0 0 215709166026 399913993 0 0 0 0 0 0
> Net eth2:1034294532107 775402505 0 886 0 0 0 0 222234389176 400880022 0 0 0 0 0 0
>
> If I do the math for the first 2 samples it comes out as
> 215709166026-209226011736=6483154290 bytes and
> 399913993-398954938=959055 packets.  The result of
> 6483154290/959055=6760 bytes/packet.  And this is not just for a couple
> of samples.
> Here's an example of collectl's output:
>
> #         Num  Name    InPck  InErr  OutPck  OutErr  Mult  ICmp  OCmp     IKB     OKB
> 13:00:00    3  eth2:  236469      0   95905       0     0     0     0   14760  633120
> 13:00:10    3  eth2:  238004      0   96602       0     0     0     0   14856  637228
> 13:00:20    3  eth2:  238796      0   92460       0     0     0     0   14906  639296
> 13:00:29    3  eth2:  230974      0   93103       0     0     0     0   14419  618655
> 13:00:40    3  eth2:  234701      0   93414       0     0     0     0   14651  628492
> 13:00:50    3  eth2:  236121      0   94527       0     0     0     0   14739  632191
> 13:00:59    3  eth2:  237001      0   92259       0     0     0     0   14795  634433
>
> Note that these numbers are reported as KBs/sec and so have been divided
> by 10 from the raw numbers shown in /proc/net/dev.
>
> If I take a look at an earlier part of the same test when I'm doing
> writes (the system sees them as InPck/IKB), I see:
>
> #         Num  Name    InPck  InErr  OutPck  OutErr  Mult  ICmp  OCmp     IKB     OKB
> 12:30:00    3  eth2:  268813      0  142581       0     0     0     0  383735    9100
> 12:30:10    3  eth2:  258240      0  136903       0     0     0     0  368605    8734
> 12:30:20    3  eth2:  283324      0  150269       0     0     0     0  404486    9587
> 12:30:30    3  eth2:  278165      0  147478       0     0     0     0  396937    9405
> 12:30:40    3  eth2:  280848      0  148938       0     0     0     0  400896    9506
>
> and if I do the math on IKB (383735*1024/268813)=1462 which makes a
> whole lot more sense.  Any ideas as to what would cause the driver to
> incorrectly count the packets?  I know the byte counts are correct
> because this is an nfs server and the disk i/o rates are consistent
> with the network rates.

Just guessing, but perhaps on the transmit side TSO (TCP segmentation
offload) is causing packets to be aggregated before they reach the NIC
driver, so that the driver counts each large segment as one packet.
You could test this by disabling TSO on eth2:

	ethtool -K eth2 tso off

Of course, this might have some performance ramifications.

						-Bill
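For anyone wanting to repeat the arithmetic on the quoted samples, here
is a minimal Python sketch.  It assumes the standard /proc/net/dev field
order shown in the header above (8 receive counters followed by 8
transmit counters) and tolerates the 'Net ' prefix used in the post.
The function names parse_sample and avg_packet_sizes are made up for
this illustration; they are not part of collectl.

```python
def parse_sample(line):
    """Parse one /proc/net/dev data line.

    Returns (rx_bytes, rx_packets, tx_bytes, tx_packets).
    """
    # Everything before the interface colon (including any 'Net ' prefix)
    # is discarded; only the counters after it are needed.
    _, _, counters = line.partition(":")
    fields = [int(f) for f in counters.split()]
    # Assumed layout: fields 0-7 are receive counters, 8-15 are transmit.
    return fields[0], fields[1], fields[8], fields[9]


def avg_packet_sizes(prev_line, curr_line):
    """Return (rx_avg, tx_avg) in bytes per packet between two samples."""
    rx_b0, rx_p0, tx_b0, tx_p0 = parse_sample(prev_line)
    rx_b1, rx_p1, tx_b1, tx_p1 = parse_sample(curr_line)
    rx_pkts = rx_p1 - rx_p0
    tx_pkts = tx_p1 - tx_p0
    # Guard against an idle interval with no packets in one direction.
    rx_avg = (rx_b1 - rx_b0) / rx_pkts if rx_pkts else 0.0
    tx_avg = (tx_b1 - tx_b0) / tx_pkts if tx_pkts else 0.0
    return rx_avg, tx_avg


# The first two eth2 samples quoted in the original post.
SAMPLE_1 = ("Net eth2:1033991246323 770657768 0 885 0 0 0 0 "
            "209226011736 398954938 0 0 0 0 0 0")
SAMPLE_2 = ("Net eth2:1034142397679 773022462 0 885 0 0 0 0 "
            "215709166026 399913993 0 0 0 0 0 0")

if __name__ == "__main__":
    rx_avg, tx_avg = avg_packet_sizes(SAMPLE_1, SAMPLE_2)
    print(f"rx: {rx_avg:.0f} bytes/packet, tx: {tx_avg:.0f} bytes/packet")
```

On those two samples the transmit average comes out at roughly 6760
bytes/packet, matching the figure in the post.  Running the same
calculation before and after "ethtool -K eth2 tso off" would be one way
to see whether the transmit packet count changes with TSO disabled.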