Hi everyone,

I'm investigating some kmemleak reports related to the napi_get_frags_check() function. Similar issues have been reported before in [1] and [2], and the upper part of the stack trace, starting at gro_cells_init(), is identical in my case.

I am pretty sure this is a kmemleak false positive, which is not surprising, and I am approaching it from a different perspective: trying to understand how a false positive is even possible in this particular case. So far, I have been unsuccessful. As Eric Dumazet pointed out in his reply to [1], napi_get_frags_check() is very self-contained: it allocates an skb and then immediately frees it.

I would appreciate it if anyone could offer insights or new ideas that might explain this behavior. Again, this is not about fixing the networking code (because I believe there's nothing to fix there) but rather about finding a solid explanation for how the kmemleak report is possible. That might lead to either direct (code) or indirect (usage) improvements to kmemleak.

My understanding is that kmemleak removes an object from its internal list of tracked objects as soon as the object is freed. It also has a built-in object age threshold of 5 seconds before it reports a leak, specifically to avoid false positives when pointers to allocated objects are in flight and/or temporarily stored in CPU registers. Since in this case the deallocation is done immediately after the allocation and is unconditional, I can't even imagine how the object can escape the age guard check.

For the record, this is the kmemleak report that I'm seeing:

unreferenced object 0xffff4fc0425ede40 (size 240):
  comm "(ostnamed)", pid 25664, jiffies 4296402173
  hex dump (first 32 bytes):
    e0 99 5f 27 c1 4f ff ff 40 c3 5e 42 c0 4f ff ff  .._'.O..@.^B.O..
    00 c0 24 15 c0 4f ff ff 00 00 00 00 00 00 00 00  ..$..O..........
  backtrace (crc 1f19ed80):
    [<ffffbc229bc23c04>] kmemleak_alloc+0xb4/0xc4
    [<ffffbc229a16cfcc>] slab_post_alloc_hook+0xac/0x120
    [<ffffbc229a172608>] kmem_cache_alloc_bulk+0x158/0x1a0
    [<ffffbc229b645e18>] napi_skb_cache_get+0xe8/0x160
    [<ffffbc229b64af64>] __napi_build_skb+0x24/0x60
    [<ffffbc229b650240>] napi_alloc_skb+0x17c/0x2dc
    [<ffffbc229b76c65c>] napi_get_frags+0x5c/0xb0
    [<ffffbc229b65b3e8>] napi_get_frags_check+0x38/0xb0
    [<ffffbc229b697794>] netif_napi_add_weight+0x4f0/0x84c
    [<ffffbc229b7d2704>] gro_cells_init+0x1a4/0x2d0
    [<ffffbc2250d8553c>] ip_tunnel_init+0x19c/0x660 [ip_tunnel]
    [<ffffbc2250e020c0>] ipip_tunnel_init+0xe0/0x110 [ipip]
    [<ffffbc229b6c5480>] register_netdevice+0x440/0xea4
    [<ffffbc2250d846b0>] __ip_tunnel_create+0x280/0x444 [ip_tunnel]
    [<ffffbc2250d88978>] ip_tunnel_init_net+0x264/0x42c [ip_tunnel]
    [<ffffbc2250e02150>] ipip_init_net+0x30/0x40 [ipip]

The obvious test, which I already did, is to create/delete ip tunnel interfaces in a loop. I let this test run for more than 24 hours, and kmemleak did *not* detect anything. I also attached a kprobe inside napi_skb_cache_get(), right after the call to kmem_cache_alloc_bulk(), and successfully verified that the allocation path is indeed exercised by the test, i.e. the skb is *not* always returned from the per-cpu napi cache pool. In other words, I was unable to find a way to reproduce these kmemleak reports.
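For anyone who wants to double-check that path on their own setup, a minimal kprobe module along the lines of the sketch below should be enough. The names are made up for illustration, and the offset is just a placeholder; the real offset of the instruction following the kmem_cache_alloc_bulk() call has to be resolved from the disassembly of napi_skb_cache_get() on the target kernel.

/*
 * Sketch: log a message whenever the slow (bulk allocation) path inside
 * napi_skb_cache_get() is taken, i.e. when the skb does not come from the
 * per-cpu napi cache pool.
 */
#include <linux/module.h>
#include <linux/kprobes.h>

static int napi_cache_probe_pre(struct kprobe *p, struct pt_regs *regs)
{
	pr_info_ratelimited("napi_skb_cache_get: bulk allocation path hit\n");
	return 0;
}

static struct kprobe napi_cache_probe = {
	.symbol_name	= "napi_skb_cache_get",
	.offset		= 0x0,	/* placeholder: offset right after kmem_cache_alloc_bulk() */
	.pre_handler	= napi_cache_probe_pre,
};

static int __init napi_cache_probe_init(void)
{
	return register_kprobe(&napi_cache_probe);
}

static void __exit napi_cache_probe_exit(void)
{
	unregister_kprobe(&napi_cache_probe);
}

module_init(napi_cache_probe_init);
module_exit(napi_cache_probe_exit);
MODULE_LICENSE("GPL");

Loading the module and watching the kernel log while the create/delete loop runs is enough to confirm that the allocation path is actually exercised.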
It is worth noting that in the case of a "manually" created tunnel using `ip tunnel add ... mode ipip ...`, the lower part of the stack is different from the kmemleak report (see below). But I don't think this can affect the skb allocation or pointer handling behavior, and the upper part of the stack, starting at register_netdevice(), is identical anyway.

comm: [ip], pid: 101422
ip_tunnel_init+0
register_netdevice+1088
__ip_tunnel_create+640
ip_tunnel_ctl+956
ipip_tunnel_ctl+380
ip_tunnel_siocdevprivate+212
dev_ifsioc+1096
dev_ioctl+348
sock_ioctl+1760
__arm64_sys_ioctl+288
invoke_syscall.constprop.0+216
do_el0_svc+344
el0_svc+84
el0t_64_sync_handler+308
el0t_64_sync+380

Another thing I considered is whether kmemleak is likely to be confused by per-cpu allocations, since gcells->cells is per-cpu allocated in gro_cells_init(). I created a simple test kernel module that did a similar per-cpu allocation, and I did *not* notice any problem with kmemleak being able to track dynamically allocated blocks that are referenced through per-cpu pointers (a sketch of the test module is appended at the end of this mail).

One final note is that the reports in [1] seem to have been observed on x86_64 (judging by the presence of entry_SYSCALL_64_after_hwframe in the stack trace), while mine were observed on aarch64. So, whatever the root cause behind these kmemleak reports is, it appears to be architecture-independent.

Thanks in advance,
Radu Rendec

[1] https://lore.kernel.org/all/YwkH9zTmLRvDHHbP@krava/
[2] https://lore.kernel.org/all/1667213123-18922-1-git-send-email-wangyufen@xxxxxxxxxx/
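For completeness, the per-cpu test module mentioned above was conceptually equivalent to the sketch below (names and the allocation size are illustrative): a kmalloc'ed block per CPU whose only reference lives behind an alloc_percpu() pointer. If kmemleak mishandled references stored in per-cpu areas, these blocks would show up as leaks after a scan; in my testing they did not.

/*
 * Sketch of the per-cpu tracking test: one kmalloc'ed block per CPU,
 * referenced only through an alloc_percpu() pointer, loosely mimicking
 * gcells->cells in gro_cells_init().
 */
#include <linux/module.h>
#include <linux/percpu.h>
#include <linux/cpumask.h>
#include <linux/slab.h>

struct pcpu_test_slot {
	void *block;
};

static struct pcpu_test_slot __percpu *slots;

static int __init pcpu_test_init(void)
{
	int cpu;

	slots = alloc_percpu(struct pcpu_test_slot);
	if (!slots)
		return -ENOMEM;

	for_each_possible_cpu(cpu) {
		/* the per-cpu slot holds the only reference to this block */
		per_cpu_ptr(slots, cpu)->block = kmalloc(240, GFP_KERNEL);
	}
	return 0;
}

static void __exit pcpu_test_exit(void)
{
	int cpu;

	for_each_possible_cpu(cpu)
		kfree(per_cpu_ptr(slots, cpu)->block);
	free_percpu(slots);
}

module_init(pcpu_test_init);
module_exit(pcpu_test_exit);
MODULE_LICENSE("GPL");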