On Mon, Jun 18, 2018 at 9:08 PM Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
>
> On Tue, Jun 19, 2018 at 5:59 AM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
> > Hi Jason, yes please do send me the test suite with the kernel config.
>
> $ git clone https://git.zx2c4.com/WireGuard
> $ cd WireGuard/src
> $ [[ $(gcc -v 2>&1) =~ gcc\ version\ 8\.1\.0 ]] || echo crash needs 8.1
> $ export DEBUG_KERNEL=yes
> $ export KERNEL_VERSION=4.18-rc1
> $ make test-qemu -j$(nproc)
>
> This will build a kernel and a minimal userland and load it in qemu,
> which must be installed.
>
> This code is what causes the crash:
> The self test that's executed:
> https://git.zx2c4.com/WireGuard/tree/src/selftest/ratelimiter.h
> Which exercises this code:
> https://git.zx2c4.com/WireGuard/tree/src/ratelimiter.c
>
> The problem occurs after gc_entries(NULL) frees things (line 124 in
> ratelimiter.h above), and then line 133 reallocates those objects.
> Sometime after that happens, elsewhere in the kernel invokes this
> kasan issue in the kasan cache cleanup.
>

I will try to repro with your test suite sometime later this week.
However, from a high-level code inspection, I see that the code
creates an 'entry_cache' kmem_cache, which is destroyed by
ratelimiter_uninit on the last reference drop. Currently refcnt in
your code can underflow. It does not seem like the selftest will
cause the underflow, but you should still fix it.

From a high level, your code seems fine. Does the issue occur on the
first try of the selftest? Basically, I wanted to ask whether
kmem_cache_destroy of your entry_cache was ever executed, and whether
you have tried running this selftest multiple times while the system
was up.

As Dmitry already asked, are you using SLAB or SLUB?

> I realize it's disappointing that the test case here is in WireGuard,
> which isn't (yet!) upstream. That's why in my original message I
> wrote:
>
> > Rather, it looks like this
> > commit introduces a performance optimization, rather than a
> > correctness fix, so it seems that whatever test case is failing is
> > likely an incorrect failure. Does that seem like an accurate
> > possibility to you?
>
> I was hoping to only point you toward my own code after establishing
> the possibility that the bug is not my own. If you still think there's
> a chance this is due to my own correctness issue, and your commit has
> simply unearthed it, let me know and I'll happily keep debugging on my
> own before pinging you further.
>

Sorry, I cannot really give a definitive answer.

Shakeel
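
For reference, here is a minimal userspace sketch of the kind of
underflow guard mentioned above for refcnt. It is not the WireGuard
code: demo_init, demo_uninit, cache_alive and destroy_calls are
hypothetical names, the kmem_cache is stood in for by a plain flag, and
the locking a real kernel version would need is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of this is taken from ratelimiter.c. */
static unsigned int refcnt;        /* number of active users */
static bool cache_alive;           /* stands in for the kmem_cache */
static unsigned int destroy_calls; /* how often the "cache" was torn down */

/* First user creates the cache; later users only take a reference. */
void demo_init(void)
{
	if (refcnt++ == 0)
		cache_alive = true;
}

/*
 * Drop one reference. An unbalanced call (refcnt already 0) returns
 * early instead of decrementing, so the counter cannot wrap around
 * and the teardown path cannot run a second time.
 */
void demo_uninit(void)
{
	if (refcnt == 0)
		return;
	if (--refcnt == 0) {
		cache_alive = false;
		destroy_calls++;
	}
}
```

With this guard, an extra demo_uninit() after the last reference has
been dropped is a no-op: refcnt stays at 0 and the teardown is not
repeated.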