System Info:
CPU: Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz
Network Adapter/NIC: Intel X710
Driver: i40e
Kernel version: 5.8.15
OS: Fedora 33

It’s worth noting that we tried expanding DDIO to full (0x7ff) and to a
little more than half (0x7f0) with no material effect.

- Neal

On Thu, Apr 8, 2021 at 3:32 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> Neal Shukla <nshukla@xxxxxxxxxxxxx> writes:
>
> > We’ve been introducing bpf_tail_call()s into our XDP programs and have
> > run into packet loss and latency increases when performing load tests.
> > After profiling our code, we’ve concluded that this is the problem area:
> > `int layer3_protocol = bpf_ntohs(ethernet_header->h_proto);`
> >
> > This is the first time we read from the packet in the first XDP
> > program; we have not yet made a tail call at this point. However, we do
> > write into the metadata section prior to this line.
> >
> > How We Profiled Our Code:
> > To profile our code, we used https://github.com/iovisor/bpftrace,
> > running this command while sending traffic to our machine:
> > `sudo bpftrace -e 'profile:hz:99 { @[kstack] = count(); }' > /tmp/stack_samples.out`
> >
> > This gave us kernel stack traces, with the most frequently sampled
> > stacks at the bottom of the output file. The most commonly hit spot,
> > aside from CPU idle, looks like:
> > ```
> > @[
> > bpf_prog_986b0b3beb6f0873_some_program+290
> > i40e_napi_poll+1897
> > net_rx_action+309
> > __softirqentry_text_start+202
> > run_ksoftirqd+38
> > smpboot_thread_fn+197
> > kthread+283
> > ret_from_fork+34
> > ]: 8748
> > ```
> >
> > We then took the program tag from the stack symbol and ran this command
> > to dump the jited code:
> > `sudo bpftool prog dump jited tag 986b0b3beb6f0873`
> >
> > By converting the decimal offset (290) from the stack trace to hex
> > (0x122), we found the instruction it refers to in the jited code:
> > ```
> > 11d: movzbq 0xc(%r15),%rsi
> > 122: movzbq 0xd(%r15),%rdi
> > 127: shl    $0x8,%rdi
> > 12b: or     %rsi,%rdi
> > 12e: ror    $0x8,%di
> > 132: movzwl %di,%edi
> > ```
> > We’ve mapped this sequence back to the line mentioned earlier:
> > `int layer3_protocol = bpf_ntohs(ethernet_header->h_proto);`
> >
> > 1) Are we correctly profiling our XDP programs?
> >
> > 2) Is there a reason why our first read into the packet would cause
> > this issue, and what would be the best way to solve it? We’ve theorized
> > it may have to do with cache or TLB misses, as we’ve added a lot more
> > instructions to our programs.
>
> Yeah, this sounds like a caching issue. What system are you running this
> on? Intel's DDIO feature that DMAs packets directly to L3 cache tends to
> help with these sorts of things, but maybe your system doesn't have
> that, or it's not being used for some reason?
>
> Adding a few other people who have a better grasp of these details than
> me, in the hope that they can be more helpful :)
>
> -Toke
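
For readers following along, the access pattern under discussion — a
metadata write, then the program's first packet read, then a tail call —
looks roughly like the sketch below. This is a minimal illustration, not
the original program; the names (`struct meta`, `jmp_table`,
`first_prog`) and the metadata contents are hypothetical.

```c
/* Minimal sketch of the pattern described above; all names here
 * (struct meta, jmp_table, first_prog) are hypothetical. */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

struct meta {
	__u32 mark;
};

struct {
	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
	__uint(max_entries, 4);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");

SEC("xdp")
int first_prog(struct xdp_md *ctx)
{
	/* Grow the metadata area in front of the packet; this also
	 * invalidates any previously loaded packet pointers. */
	if (bpf_xdp_adjust_meta(ctx, -(int)sizeof(struct meta)))
		return XDP_ABORTED;

	void *data      = (void *)(long)ctx->data;
	void *data_end  = (void *)(long)ctx->data_end;
	void *data_meta = (void *)(long)ctx->data_meta;

	/* Write into the metadata section before touching the packet,
	 * as in the report above. */
	struct meta *m = data_meta;
	if ((void *)(m + 1) > data)
		return XDP_ABORTED;
	m->mark = 1;

	/* First read of packet memory: the bpf_ntohs(h_proto) load
	 * that showed up hot in the profile. */
	struct ethhdr *eth = data;
	if ((void *)(eth + 1) > data_end)
		return XDP_DROP;
	int layer3_protocol = bpf_ntohs(eth->h_proto);

	if (layer3_protocol == ETH_P_IP)
		bpf_tail_call(ctx, &jmp_table, 0); /* falls through if slot 0 is empty */

	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```

The `eth->h_proto` load is the first instruction to touch the packet
buffer itself, so it is the natural place for a DMA-related cache miss
to surface, which fits the caching theory in the quoted reply.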
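
The DDIO values mentioned at the top of the thread (0x7ff, 0x7f0) look
like LLC way masks. On Skylake-SP parts such as the Xeon Gold 6150,
this mask is commonly inspected through the IIO_LLC_WAYS MSR at 0xC8B —
an assumption here, since that MSR is not documented in the Intel SDM.
A minimal sketch of reading it via the kernel's msr driver, assuming
`/dev/cpu/0/msr` exists after `modprobe msr` and the program runs as
root:

```c
/* Hypothetical sketch: dump the DDIO way mask on a Skylake-SP CPU.
 * Assumes MSR 0xC8B (IIO_LLC_WAYS, not in the Intel SDM) and the
 * msr driver (`modprobe msr`); must run as root. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define IIO_LLC_WAYS 0xC8B /* assumed MSR number for this CPU family */

int main(void)
{
	int fd = open("/dev/cpu/0/msr", O_RDONLY);
	if (fd < 0) {
		perror("open /dev/cpu/0/msr");
		return 1;
	}

	/* /dev/cpu/N/msr reads 8 bytes at offset == MSR number. */
	uint64_t mask;
	if (pread(fd, &mask, sizeof(mask), IIO_LLC_WAYS) != sizeof(mask)) {
		perror("pread");
		close(fd);
		return 1;
	}

	/* Each set bit opens one LLC way to DDIO; 0x7ff would be all
	 * eleven ways on this part. */
	printf("IIO_LLC_WAYS = %#llx\n", (unsigned long long)mask);
	close(fd);
	return 0;
}
```

Writing the mask back (e.g. with `wrmsr`) is how one "expands" DDIO as
described above; since neither 0x7f0 nor 0x7ff made a material
difference here, the DDIO way allocation itself is probably not the
bottleneck.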