Hangbin Liu <liuhangbin@xxxxxxxxx> writes: > On Wed, May 06, 2020 at 12:00:08PM +0200, Toke Høiland-Jørgensen wrote: >> > No, I haven't test the performance. Do you have any suggestions about how >> > to test it? I'd like to try forwarding pkts to 10+ ports. But I don't know >> > how to test the throughput. I don't think netperf or iperf supports >> > this. >> >> What I usually do when benchmarking XDP_REDIRECT is to just use pktgen >> (samples/pktgen in the kernel source tree) on another machine, >> specifically, like this: >> >> ./pktgen_sample03_burst_single_flow.sh -i enp1s0f1 -d 10.70.2.2 -m ec:0d:9a:db:11:35 -t 4 -s 64 >> >> (adjust iface, IP and MAC address to your system, of course). That'll >> flood the target machine with small UDP packets. On that machine, I then >> run the 'xdp_redirect_map' program from samples/bpf. The bpf program >> used by that sample will update an internal counter for every packet, >> and the userspace prints it out, which gives you the performance (in >> PPS). So just modifying that sample to using your new multicast helper >> (and comparing it to regular REDIRECT to a single device) would be a >> first approximation of a performance test. > > Thanks for this method. I will update the sample and do some more tests. Great! >> You could do something like: >> >> bool first = true; >> for (;;) { >> >> [...] >> >> if (!first) { >> nxdpf = xdpf_clone(xdpf); >> if (unlikely(!nxdpf)) >> return -ENOMEM; >> bq_enqueue(dev, nxdpf, dev_rx); >> } else { >> bq_enqueue(dev, xdpf, dev_rx); >> first = false; >> } >> } >> >> /* didn't find anywhere to forward to, free buf */ >> if (first) >> xdp_return_frame_rx_napi(xdpf); > > I think the first xdpf will be consumed by the driver and the later > xdpf_clone() will failed, won't it? No, bq_enqueue just sticks the frame on a list, it's not consumed until after the NAPI cycle ends (and the driver calls xdp_do_flush()). > How about just do a xdp_return_frame_rx_napi(xdpf) after all nxdpf enqueue? Yeah, that would be the semantically obvious thing to do, but it is wasteful in that you end up performing one more clone than you strictly have to :) >> > @@ -3534,6 +3539,8 @@ int xdp_do_redirect(struct net_device *dev, struct >> > xdp_buff *xdp, >> > struct bpf_prog *xdp_prog) >> > { >> > struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); >> > + bool exclude_ingress = !!(ri->flags & BPF_F_EXCLUDE_INGRESS); >> > + struct bpf_map *ex_map = READ_ONCE(ri->ex_map); >> >> I don't think you need the READ_ONCE here since there's already one >> below? > > BTW, I forgot to ask, why we don't need the READ_ONCE for ex_map? > I though the map and ex_map are two different pointers. It isn't, but not for the reason I thought, so I can understand why my comment might have been somewhat confusing (I have been confused by this myself until just now...). The READ_ONCE() is not needed because the ex_map field is only ever read from or written to by the CPU owning the per-cpu pointer. Whereas the 'map' field is manipulated by remote CPUs in bpf_clear_redirect_map(). So you need neither READ_ONCE() nor WRITE_ONCE() on ex_map, just like there are none on tgt_index and tgt_value. -Toke