On Thu, May 13, 2021 at 11:46 AM Jamal Hadi Salim <jhs@xxxxxxxxxxxx> wrote:
>
> On 2021-05-12 6:43 p.m., Jamal Hadi Salim wrote:
>
> >
> > Will run some tests tomorrow to see the effect of batching vs nobatch
> > and capture cost of syscalls and cpu.
> >
>
> So here are some numbers:
> Processor: Intel(R) Xeon(R) Gold 6230R CPU @ 2.10GHz
> This machine is very similar to where a real deployment
> would happen.
>
> Hyperthreading turned off so we can dedicate the core to the
> dumping process and Performance mode on, so no frequency scaling
> meddling.
> Tests were run about 3 times each. Results eye-balled to make
> sure deviation was reasonable.
> 100% of the one core was used just for dumping during each run.

I checked with Cilium users here at Bytedance, and they actually
observed 100% CPU usage too.

>
> bpftool does linear retrieval whereas our tool does batch dumping.
> bpftool does print the dumped results, for our tool we just count
> the number of entries retrieved (cost would have been higher if
> we actually printed). In any case in the real setup there is
> a processing cost which is much higher.
>
> Summary is: the dumping is problematic costwise as the number of
> entries increases. While batching does improve things it doesn't
> solve our problem (Like I said we have up to 16M entries and most
> of the time we are dumping useless things)

Thank you for sharing these numbers! Hopefully they will convince
people here to accept the bpf timer. I will include your use case
and performance numbers in my next update.
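
For anyone following along, the syscall side of the linear vs batch comparison can be approximated with a rough back-of-the-envelope model. This is only a sketch under stated assumptions, not a measurement: it assumes a linear dump issues one get-next-key call and one lookup call per entry, that a batch dump returns up to a fixed number of entries per call, that "16M" means 16,000,000, and that the batch size is 4096 (a hypothetical value, not taken from the tests above). The function names are made up for illustration.

```python
import math


def linear_dump_syscalls(entries: int) -> int:
    """Rough syscall count for a key-by-key map dump.

    Assumption: one get-next-key call plus one lookup call per entry,
    plus a final get-next-key call that reports the end of the map.
    """
    return 2 * entries + 1


def batch_dump_syscalls(entries: int, batch_size: int) -> int:
    """Rough syscall count for a batched map dump.

    Assumption: every call returns a full batch except possibly the last.
    Real batch lookups can return fewer entries per call, so this is a
    lower bound rather than an exact figure.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return max(1, math.ceil(entries / batch_size))


if __name__ == "__main__":
    ENTRIES = 16_000_000   # assumed reading of "16M entries"
    BATCH_SIZE = 4096      # hypothetical batch size

    linear = linear_dump_syscalls(ENTRIES)
    batched = batch_dump_syscalls(ENTRIES, BATCH_SIZE)
    print(f"linear dump : {linear} syscalls")
    print(f"batch dump  : {batched} syscalls")
    print(f"ratio       : {linear / batched:.0f}x")
```

Even in this optimistic model, batching only cuts the number of syscalls. Every entry still has to be copied out and examined in user space, which matches the observation above that batching helps but does not solve the problem when most dumped entries are useless.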