On Thu, Aug 16, 2018 at 08:05:50PM +0800, maowenan wrote:
> On 2018/8/16 19:39, Michal Kubecek wrote:
> >
> > I suspect you may be doing something wrong with your tests. I checked
> > the segmentsmack testcase and the CPU utilization on receiving side
> > (with sending 10 times as many packets as default) went down from ~100%
> > to ~3% even when comparing what is in stable 4.4 now against older 4.4
> > kernel.
>
> There seems no obvious problem when you send packets with default
> parameter in Segmentsmack POC, Which is also very related with your
> server's hardware configuration. Please try with below parameter to
> form OFO packets

I did, and even with these (questionable, see below) changes, I did not
get more than 10% (of one core) used by ksoftirqd on the receiving side.

>     for (i = 0; i < 1024; i++)  // 128->1024
...
>     usleep(10*1000); // Adjust this and packet count to match the target!, sleep 100ms->10ms

The comment in the testcase source suggests doing _one_ of these two
changes so that you generate 10 times as many packets as the original
testcase. You did both, so you end up sending 102400 packets per
second. With 55 byte long packets, this kind of attack requires at least
5.5 MB/s (44 Mb/s) of throughput. This is no longer a "low packet rate
DoS", I'm afraid.

Anyway, even at this rate, I only get ~10% of one core (Intel E5-2697).

What I can see, though, is that with current stable 4.4 code, a modified
testcase which sends something like

  2:3, 3:4, ..., 3001:3002, 3003:3004, 3004:3005, ... 6001:6002, ...

quickly eats 6 MB of memory for the receive queue of one socket, while
earlier 4.4 kernels only take 200-300 KB. I didn't test latest 4.4 with
Takashi's follow-up yet, but I'm pretty sure it will help while
preserving nice performance when using the original segmentsmack
testcase (with increased packet rate).

Michal Kubecek
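
For reference, the rate arithmetic above can be sketched as below. This is only a rough estimate, not code from the testcase: the helper names packets_per_second() and bytes_per_second() are made up for illustration, and it assumes each loop pass sends one burst of packets followed by one usleep(), ignoring the time spent sending.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: packets sent per second when each loop pass
 * sends `burst` packets and then sleeps for `sleep_us` microseconds.
 * Time spent in the send calls themselves is ignored (assumption).
 */
static uint64_t packets_per_second(uint64_t burst, uint64_t sleep_us)
{
    assert(sleep_us != 0);  /* avoid division by zero */
    return burst * 1000000ULL / sleep_us;
}

/*
 * Hypothetical helper: throughput in bytes per second for a given
 * packet rate and on-wire packet length in bytes.
 */
static uint64_t bytes_per_second(uint64_t pps, uint64_t pkt_len)
{
    return pps * pkt_len;
}
```

With both changes applied, packets_per_second(1024, 10 * 1000) gives 102400, and bytes_per_second(102400, 55) gives 5632000, i.e. roughly 5.6 MB/s or about 45 Mb/s, consistent with the "at least 5.5 MB/s" figure. The unmodified values (128 packets, 100 ms sleep) give 1280 packets per second under the same assumptions.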