Re: [PATCH net] bonding: Add bond_xdp_check for bond_xdp_xmit in bond_main.c

On Mon, Sep 16, 2024 at 5:48 PM Nikolay Aleksandrov <razor@xxxxxxxxxxxxx> wrote:
>
> On 16/09/2024 08:50, Jiwon Kim wrote:
> > Add bond_xdp_check to ensure the bond interface is in a valid state.
> >
> > syzbot reported WARNING in bond_xdp_get_xmit_slave.
> > In bond_xdp_get_xmit_slave, the comment says
> > /* Should never happen. Mode guarded by bond_xdp_check() */.
> > However, it does not check the status when entering bond_xdp_xmit.
> >
> > Reported-by: syzbot+c187823a52ed505b2257@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Closes: https://syzkaller.appspot.com/bug?extid=c187823a52ed505b2257
> > Fixes: 9e2ee5c7e7c3 ("net, bonding: Add XDP support to the bonding driver")
> > Signed-off-by: Jiwon Kim <jiwonaid0@xxxxxxxxx>
> > ---
> >  drivers/net/bonding/bond_main.c | 33 ++++++++++++++++++---------------
> >  1 file changed, 18 insertions(+), 15 deletions(-)
> >
>
> How did you figure the problem is there? Did you take any time to actually
> understand it? This patch doesn't fix anything, the warning can be easily
> triggered with it. The actual fix is to remove that WARN_ON() altogether
> and downgrade the netdev_err() to a ratelimited version. The reason is that
> we can always get to a state where at least 1 bond device has xdp program
> installed which increases bpf_master_redirect_enabled_key and another bond
> device which uses xdpgeneric, then install an ebpf program that simply
> returns XDP_TX on xdpgeneric bond's slave and voila - you get the warning.
>
> setup is[1]:
>  $ ip l add veth0 type veth peer veth1
>  $ ip l add veth3 type veth peer veth4
>  $ ip l add bond0 type bond mode 6 # <- transmit-alb mode, unsupported by xdp
>  $ ip l add bond1 type bond # <- rr mode by default, supported by xdp
>  $ ip l set veth0 master bond1
>  $ ip l set bond1 up
>  $ ip l set dev bond1 xdpdrv object tx_xdp.o section xdp_tx # <- we need xdpdrv program to increase the static key, more below
>  $ ip l set veth3 master bond0
>  $ ip l set bond0 up
>  $ ip l set veth4 up
>  $ ip l set veth3 xdpgeneric object tx_xdp.o section xdp_tx # <- now we'll hit the codepath we need after veth3 Rx's a packet
>
>
> If you take the time to look at the call stack and the actual code, you'll
> see it goes something like (for the xdpgeneric bond slave, veth3):
> ...
> bpf_prog_run_generic_xdp() for veth3
>  -> bpf_prog_run_xdp()
>    -> __bpf_prog_run() # return XDP_TX
>      -> xdp_master_redirect() # called because we have XDP_TX && netif_is_bond_slave(xdp->rxq->dev)
>        -> master->netdev_ops->ndo_xdp_get_xmit_slave(master, xdp); # and here we go, WARN_ON()
>
> I've had a patch for awhile now about this and have taken the time to look into it.
> I guess it's time to dust it off and send it out for review. :)
>
> Thanks,
>  Nik

Hi Nikolay,

Thank you for taking the time to provide a detailed setup and call
stack analysis.
Will you be sending the new patch yourself? If you don't mind, I could
revise this patch to:

- Remove the WARN_ON()
- Replace netdev_err() with a rate-limited version, gated by net_ratelimit()
- Update the comment, since the default case is reachable
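
To make the list above concrete, here is a rough userspace sketch of the
behavior I have in mind for the default branch. It is a model only, not
the actual patch: every model_* name is a made-up stand-in (the kernel
counterparts would be bond_xdp_get_xmit_slave(), net_ratelimit() and
netdev_err()), the rate limiter is a plain message budget rather than a
time window, and the xmit-policy restriction on the 802.3ad/XOR modes is
ignored.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the bonding modes; the values follow the
 * "mode N" numbers used with ip-link (6 is the ALB mode from the setup). */
enum model_bond_mode {
	MODEL_MODE_ROUNDROBIN   = 0,
	MODEL_MODE_ACTIVEBACKUP = 1,
	MODEL_MODE_XOR          = 2,
	MODEL_MODE_BROADCAST    = 3,
	MODEL_MODE_8023AD       = 4,
	MODEL_MODE_TLB          = 5,
	MODEL_MODE_ALB          = 6,
};

/* Assumption: one slave per bond, identified by an ifindex. */
struct model_bond {
	enum model_bond_mode mode;
	int slave_ifindex;
};

static int model_log_budget = 10; /* crude stand-in for net_ratelimit() */
static int model_logged;          /* messages actually printed */

/* Allow a message only while the budget lasts. */
static bool model_ratelimit(void)
{
	if (model_log_budget <= 0)
		return false;
	model_log_budget--;
	return true;
}

/* Return the slave ifindex to transmit on, or -1 when the bond mode
 * cannot be used for XDP transmit. */
static int model_xdp_get_xmit_slave(const struct model_bond *bond)
{
	switch (bond->mode) {
	case MODEL_MODE_ROUNDROBIN:
	case MODEL_MODE_ACTIVEBACKUP:
	case MODEL_MODE_XOR:
	case MODEL_MODE_8023AD:
		return bond->slave_ifindex;
	default:
		/* Reachable via an xdpgeneric program on a slave, so no
		 * WARN_ON() here; only a rate-limited error message. */
		if (model_ratelimit()) {
			fprintf(stderr, "Unknown bonding mode %d for xdp xmit\n",
				(int)bond->mode);
			model_logged++;
		}
		return -1;
	}
}
```

With a mode 6 bond, repeated calls keep returning -1 while the message
is printed only until the budget runs out, which is the effect I would
expect from gating netdev_err() on net_ratelimit().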

Thanks again for your insights and patience.

Sincerely,

Jiwon Kim




