On Mon, 18 Oct 2021 19:40:40 +0200 Toke Høiland-Jørgensen wrote:
> Jakub Kicinski <kuba@xxxxxxxxxx> writes:
>
> > On Mon, 18 Oct 2021 17:04:19 +0300 Vlad Buslov wrote:
> >> We got a use-after-free with very similar trace [0] during nightly
> >> regression. The issue happens when ip link up/down state is flipped
> >> several times in loop and doesn't reproduce for me manually. The fact
> >> that it didn't reproduce for me after running test ten times suggests
> >> that it is either very hard to reproduce or that it is a result of some
> >> interaction between several tests in our suite.
> >>
> >> [0]:
> >>
> >> [ 3187.779569] mlx5_core 0000:08:00.0 enp8s0f0: Link up
> >> [ 3187.890694] ==================================================================
> >> [ 3187.892518] BUG: KASAN: use-after-free in __list_add_valid+0xc3/0xf0
> >> [ 3187.894132] Read of size 8 at addr ffff8881150b3fb8 by task ip/119618
> >
> > Hm, not sure how similar it is. This one looks like channel was freed
> > without deleting NAPI. Do you have list debug enabled?
>
> Well, the other report[0] also kinda looks like the NAPI thread keeps
> running after it should have been disabled, so maybe they are in fact
> related?
>
> [0] https://lore.kernel.org/r/000000000000c1524005cdeacc5f@xxxxxxxxxx

Could be, if napi->state gets corrupted it may lose NAPI_STATE_LISTED.

719c57197010 ("net: make napi_disable() symmetric with enable")
3765996e4f0b ("napi: fix race inside napi_enable")

is the only thing that comes to mind, but they look fine to me.
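
To spell out the NAPI_STATE_LISTED theory, here is a rough userspace
sketch, not kernel code. It assumes the delete path returns early when
the LISTED bit is not set, which is how I read __netif_napi_del(); if
that reading is wrong the sketch does not apply. Every name in it
(fake_napi, fake_napi_add, fake_napi_del, FAKE_STATE_*,
stale_after_del) is a made-up stand-in for illustration only.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the NAPI state bits. */
enum {
	FAKE_STATE_SCHED  = 1u << 0,
	FAKE_STATE_LISTED = 1u << 1,
};

/* Hypothetical stand-in for struct napi_struct. */
struct fake_napi {
	unsigned int state;
	bool on_dev_list;	/* models membership in the device's NAPI list */
};

static void fake_napi_add(struct fake_napi *n)
{
	n->state |= FAKE_STATE_LISTED;
	n->on_dev_list = true;
}

/* Assumed behaviour: do nothing unless LISTED is set. */
static bool fake_napi_del(struct fake_napi *n)
{
	if (!(n->state & FAKE_STATE_LISTED))
		return false;	/* entry is silently left on the list */
	n->state &= ~FAKE_STATE_LISTED;
	n->on_dev_list = false;
	return true;
}

/*
 * Returns 1 if the entry is still linked after the delete call, i.e.
 * the list would point at freed memory once the owner is released.
 */
int stale_after_del(bool corrupt_state)
{
	struct fake_napi n = { 0 };

	fake_napi_add(&n);
	if (corrupt_state)
		n.state = FAKE_STATE_SCHED;	/* clobbering write drops LISTED */
	fake_napi_del(&n);
	return n.on_dev_list ? 1 : 0;
}
```

With an intact state word stale_after_del(false) returns 0. With the
clobbered one, stale_after_del(true) returns 1, and the next list
insertion on that device would walk over the stale entry, which would
fit a KASAN hit in __list_add_valid.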