Hi Nik,

On 7/1/19 7:03 PM, Nikolay Aleksandrov wrote:
> Hi Martin,
>
> On 01/07/2019 19:53, Martin Weinelt wrote:
>> Hi Nik,
>>
>> more info below.
>>
>> On 6/29/19 3:11 PM, nikolay@xxxxxxxxxxxxxxxxxxx wrote:
>>> On 29 June 2019 14:54:44 EEST, Martin Weinelt <martin@xxxxxxxxxxxxxxx> wrote:
>>>> Hello,
>>>>
>>>> we've recently been experiencing memory leaks on our Linux-based
>>>> routers, at least as far back as v4.19.16.
>>>>
>>>> After rebuilding with KASAN it found a use-after-free in
>>>> br_multicast_rcv which I could reproduce on v5.2.0-rc6.
>>>>
>>>> Please find the KASAN report below, I'm not sure what else to provide
>>>> so feel free to ask.
>>>>
>>>> Best,
>>>>   Martin
>>>>
>>>
>>> Hi Martin,
>>> I'll look into this, are there any specific steps to reproduce it?
>>>
>>> Thanks,
>>>   Nik
>>>
>> Each server is a KVM Guest and has 18 bridges with the same master/slave
>> relationships:
>>
>>   bridge -> batman-adv -> {l2 tunnel, virtio device}
>>
>> Linus Lüssing from the batman-adv asked me to apply this patch to help
>> debugging.
>>
>> v5.2-rc6-170-g728254541ebc with this patch yielded the following KASAN
>> report, not sure if the additional information at the end is a result of
>> the added patch though.
>>
>> Best,
>>   Martin
>>
>
> I see a couple of issues that can cause out-of-bounds accesses in br_multicast.c
> more specifically there're pskb_may_pull calls and accesses to stale skb pointers.
> I've had these on my "to fix" list for some time now, will prepare, test the fixes and
> send them for review. In a few minutes I'll send a test patch for you.
> That being said, I thought you said you've been experiencing memory leaks, but below
> reports are for out-of-bounds accesses, could you please clarify if you were
> speaking about these or is there another issue as well ?
> If you're experiencing memory leaks, are you sure they're related to the bridge ?
> You could try kmemleak for those.
>
> Thank you,
>   Nik
>

We had been experiencing memory leaks on v4.19.37; that's why we started to
turn on KASAN and kmemleak in the first place. That is when we found this
use-after-free.

The memory leak exists and is a separate issue.

Apparently kmemleak does not work; I suspect the early log size is too small:

root@gw02:~# echo scan > /sys/kernel/debug/kmemleak
-bash: echo: write error: Device or resource busy

CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=400
# CONFIG_DEBUG_KMEMLEAK_TEST is not set
# CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y

I'll increase the early log size with the next build to try to get more
information on the memory leak. I'll open a separate thread for that then.

Thanks,
  Martin
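
For reference, a minimal sketch of how the early-log-size suspicion could be checked before rebuilding. It assumes that kmemleak reports its state in the kernel log with a "kmemleak" prefix and that the running kernel's config is available as a plain-text file. The helper names early_log_size and kmemleak_messages are made up for illustration and are not part of any tool.

```shell
# Hypothetical helper: print the configured kmemleak early log size
# from the kernel config file given as $1 (prints nothing if unset).
early_log_size() {
    sed -n 's/^CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=\([0-9][0-9]*\)$/\1/p' "$1"
}

# Hypothetical helper: filter kernel log text on stdin down to the lines
# that mention kmemleak (assumption: a self-disable notice would be among them).
kmemleak_messages() {
    grep -i 'kmemleak' || true
}

# Example use on the affected host (the config path is a distribution
# convention and may differ):
#   dmesg | kmemleak_messages
#   early_log_size "/boot/config-$(uname -r)"
```

If the kernel log shows kmemleak being disabled during early boot, that would fit the "Device or resource busy" error above and support raising CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE.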