On Wed, 11 Nov 2015 16:28:20 +0100
Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote:

> On Tue, 10 Nov 2015 21:32:46 +0300
> Vladimir Davydov <vdavydov@xxxxxxxxxxxxx> wrote:
>
> > On Tue, Nov 10, 2015 at 04:55:34PM +0100, Jesper Dangaard Brouer wrote:
> > > On Tue, 10 Nov 2015 11:46:33 +0300
> > > Vladimir Davydov <vdavydov@xxxxxxxxxxxxx> wrote:
> > >
> > > > On Mon, Nov 09, 2015 at 09:25:22PM +0100, Jesper Dangaard Brouer wrote:
> > > > > On Mon, 9 Nov 2015 22:13:35 +0300
> > > > > Vladimir Davydov <vdavydov@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > > On Mon, Nov 09, 2015 at 07:17:31PM +0100, Jesper Dangaard Brouer wrote:
> > > > > > ...
> > > > > > > @@ -2556,7 +2563,7 @@ redo:
> > > > > > >  	if (unlikely(gfpflags & __GFP_ZERO) && object)
> > > > > > >  		memset(object, 0, s->object_size);
> > > > > > >
> > > > > > > -	slab_post_alloc_hook(s, gfpflags, object);
> > > > > > > +	slab_post_alloc_hook(s, gfpflags, 1, object);
> > > > > >
> > > > > > I think it must be &object
> > > > >
> > > > > The object is already a void ** type.
> > > >
> > > > Let's forget about types for a second. object contains an address to the
> > > > newly allocated object, while slab_post_alloc_hook expects an array of
> > > > addresses to objects. Simple test. Suppose an allocation failed. Then
> > > > object equals 0. Passing 0 to slab_post_alloc_hook as @p and 1 as @size
> > > > will result in NULL ptr dereference.
> > >
> > > Argh, that is not good :-(
> > > I tested memory exhaustion and NULL ptr deref does happen in this case.
> > >
> > > BUG: unable to handle kernel NULL pointer dereference at (null)
> > > IP: [<ffffffff8113dea2>] kmem_cache_alloc+0x92/0x1d0
> > >
> > > (gdb) list *(kmem_cache_alloc)+0x92
> > > 0xffffffff8113dea2 is in kmem_cache_alloc (mm/slub.c:1302).
> > > 1297	{
> > > 1298		size_t i;
> > > 1299
> > > 1300		flags &= gfp_allowed_mask;
> > > 1301		for (i = 0; i < size; i++) {
> > > 1302			void *object = p[i];
> > > 1303
> > > 1304			kmemcheck_slab_alloc(s, flags, object, slab_ksize(s));
> > > 1305			kmemleak_alloc_recursive(object, s->object_size, 1,
> > > 1306						 s->flags, flags);
> > > (gdb) quit
> > >
> > > I changed:
> > >
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index 2eab115e18c5..c5a62fd02321 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -2484,7 +2484,7 @@ static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
> > >  static __always_inline void *slab_alloc_node(struct kmem_cache *s,
> > >  		gfp_t gfpflags, int node, unsigned long addr)
> > >  {
> > > -	void **object;
> > > +	void *object;
> > >  	struct kmem_cache_cpu *c;
> > >  	struct page *page;
> > >  	unsigned long tid;
> > > @@ -2563,7 +2563,7 @@ redo:
> > >  	if (unlikely(gfpflags & __GFP_ZERO) && object)
> > >  		memset(object, 0, s->object_size);
> > >
> > > -	slab_post_alloc_hook(s, gfpflags, 1, object);
> > > +	slab_post_alloc_hook(s, gfpflags, 1, &object);
> > >
> > >  	return object;
> > >  }
> > >
> > > But then the kernel cannot correctly boot?!?! (It dies in
> > > x86_perf_event_update+0x15.) What did I miss???
> >
> > Weird... I applied all your patches including the one above to
> > v4.3-rc6-mmotm-2015-10-21-14-41 and everything boots and works just fine
> > both inside a VM and on my x86 host. Are you sure the problem is caused
> > by your patches? Perhaps you updated the source tree in the meantime.
>
> I didn't rebase, but I likely _should_ rebase my patchset. It could be
> something different from my patch, I will investigate further.
>
> When you tested it, did you make sure the compiler didn't "remove" the
> code inside the for loop?
>
> To put some code inside the for loop, I have enabled both
> CONFIG_KMEMCHECK and CONFIG_DEBUG_KMEMLEAK, plus CONFIG_SLUB_DEBUG_ON=y
> (but it seems SLUB_DEBUG gets somewhat removed when these gets enabled,
> didn't check the details).

Okay, there is nothing wrong with this change (it is actually more
correct). The problem was related to CONFIG_KMEMCHECK, which was
preventing the system from booting. I have not looked into why yet and
don't have the full console output, but I can see it complains about PCI
and ACPI init and then dies in x86_perf_event_update+0x15, so it could
be system/HW specific.

I'm now running with CONFIG_DEBUG_KMEMLEAK and running tests that
exhaust memory. It works, e.g. in the case where the alloc fails and
@object becomes NULL.

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
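
For readers following the thread, the difference between passing object
and &object can be illustrated in userspace. This is only a rough
sketch, not kernel code: post_alloc_hook() and single_alloc_hook() are
hypothetical stand-ins for slab_post_alloc_hook() and its single-object
caller, and the non-NULL count is returned purely to make the behaviour
observable.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for slab_post_alloc_hook(): it walks
 * an array of object pointers, the way the loop quoted from
 * mm/slub.c:1302 reads p[i]. Returns the number of non-NULL entries.
 */
static size_t post_alloc_hook(size_t size, void **p)
{
	size_t i, live = 0;

	assert(p != NULL);	/* p must be an array, even when size == 1 */
	for (i = 0; i < size; i++) {
		void *object = p[i];	/* reads the array slot, not the object */

		if (object)
			live++;
	}
	return live;
}

/*
 * Hypothetical single-object caller. "object" may be NULL after a
 * failed allocation, but &object is always a valid one-element array,
 * so the hook can safely read p[0].
 */
static size_t single_alloc_hook(void *object)
{
	return post_alloc_hook(1, &object);
}
```

Passing object itself instead of &object would hand the hook a NULL
array pointer whenever the allocation failed, which is the case the
assert() in this sketch guards against.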