On Mon, Apr 26, 2021 at 06:47:17PM +0800, Hangbin Liu wrote:
> On Mon, Apr 26, 2021 at 11:53:50AM +0200, Jesper Dangaard Brouer wrote:
> > On Fri, 23 Apr 2021 10:00:17 +0800
> > Hangbin Liu <liuhangbin@xxxxxxxxx> wrote:
> >
> > > This patch adds two flags BPF_F_BROADCAST and BPF_F_EXCLUDE_INGRESS to
> > > extend xdp_redirect_map for broadcast support.
> > >
> > > With BPF_F_BROADCAST the packet will be broadcasted to all the interfaces
> > > in the map. with BPF_F_EXCLUDE_INGRESS the ingress interface will be
> > > excluded when do broadcasting.
> > >
> > > When getting the devices in dev hash map via dev_map_hash_get_next_key(),
> > > there is a possibility that we fall back to the first key when a device
> > > was removed. This will duplicate packets on some interfaces. So just walk
> > > the whole buckets to avoid this issue. For dev array map, we also walk the
> > > whole map to find valid interfaces.
> > >
> > > Function bpf_clear_redirect_map() was removed in
> > > commit ee75aef23afe ("bpf, xdp: Restructure redirect actions").
> > > Add it back as we need to use ri->map again.
> > >
> > > Here is the performance result by using 10Gb i40e NIC, do XDP_DROP on
> > > veth peer, run xdp_redirect_{map, map_multi} in sample/bpf and send pkts
> > > via pktgen cmd:
> > > ./pktgen_sample03_burst_single_flow.sh -i eno1 -d $dst_ip -m $dst_mac -t 10 -s 64
> >
> > While running:
> >  $ sudo ./xdp_redirect_map_multi -F i40e2 i40e2
> >  Get interfaces 7 7
> >  libbpf: elf: skipping unrecognized data section(23) .eh_frame
> >  libbpf: elf: skipping relo section(24) .rel.eh_frame for section(23) .eh_frame
> >  Forwarding 10140845 pkt/s
> >  Forwarding 11767042 pkt/s
> >  Forwarding 11783437 pkt/s
> >  Forwarding 11767331 pkt/s
> >
> > When starting: sudo ./xdp_monitor --stats
>
> That seems the same issue I reported previously in our meeting.
> https://bugzilla.redhat.com/show_bug.cgi?id=1906820#c4
>
> I only saw it 3 times and can't reproduce it easily.
>
> Do you have any idea where is the root cause?

OK, I just re-ran the test and can reproduce it now. Perhaps the code
changes have made it easier to trigger.

I will look into this issue.

Thanks
Hangbin