Re: [patch 025/155] slub: relocate freelist pointer to middle of object

Hello,

On Wed, 01 Apr 2020, Andrew Morton wrote:

> From: Kees Cook <keescook@xxxxxxxxxxxx>
> Subject: slub: relocate freelist pointer to middle of object
> 
> In a recent discussion[1] with Vitaly Nikolenko and Silvio Cesare, it
> became clear that moving the freelist pointer away from the edge of
> allocations would likely improve the overall defensive posture of the
> inline freelist pointer.  My benchmarks show no meaningful change to
> performance (they seem to show it being faster), so this looks like a
> reasonable change to make.
> 
> Instead of having the freelist pointer at the very beginning of an
> allocation (offset 0) or at the very end of an allocation (effectively
> offset -sizeof(void *) from the next allocation), move it away from the
> edges of the allocation and into the middle.  This provides some
> protection against small-sized neighboring overflows (or underflows), for
> which the freelist pointer is commonly the target.  (Large or well
> controlled overwrites are much more likely to attack live object contents,
> instead of attempting freelist corruption.)
> 
> The vaunted kernel build benchmark, across 5 runs. Before:
> 
> 	Mean: 250.05
> 	Std Dev: 1.85
> 
> and after, which appears mysteriously faster:
> 
> 	Mean: 247.13
> 	Std Dev: 0.76
> 
> Attempts at running "sysbench --test=memory" show the change to be well in
> the noise (sysbench seems to be pretty unstable here -- it's not really
> measuring allocation).
> 
> Hackbench is more allocation-heavy, and while the std dev is above the
> difference, it looks like it may manifest as an improvement as well:
> 
> 20 runs of "hackbench -g 20 -l 1000", before:
> 
> 	Mean: 36.322
> 	Std Dev: 0.577
> 
> and after:
> 
> 	Mean: 36.056
> 	Std Dev: 0.598
> 
> [1] https://twitter.com/vnik5287/status/1235113523098685440
> 
> Link: http://lkml.kernel.org/r/202003051624.AAAC9AECC@keescook
> Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
> Acked-by: Christoph Lameter <cl@xxxxxxxxx>
> Cc: Vitaly Nikolenko <vnik@xxxxxxxxxxx>
> Cc: Silvio Cesare <silvio.cesare@xxxxxxxxx>
> Cc: Christoph Lameter <cl@xxxxxxxxx>
> Cc: Pekka Enberg <penberg@xxxxxxxxxx>
> Cc: David Rientjes <rientjes@xxxxxxxxxx>
> Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
> 
>  mm/slub.c |    7 +++++++
>  1 file changed, 7 insertions(+)
> 

With kernel v5.7-rc1 and an otherwise default config, I am unable to
boot when using the SLUB allocator with red zoning (slub_debug=Z).
Bisection points to this patch, and reverting it makes the kernel boot
again.

Splat:
	[...]
	[    0.328713] rcu: Hierarchical RCU implementation.
	[    0.329169] rcu:     RCU event tracing is enabled.
	[    0.329611] rcu:     RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=8.
	[    0.330251] rcu: RCU calculated value of scheduler-enlistment delay is 100 jiffies.
	[    0.330984] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
	[    0.332130] NR_IRQS: 4352, nr_irqs: 488, preallocated irqs: 16
	[    0.332713] general protection fault, probably for non-canonical address 0xccccccccccccccd4: 0000 [#1] SMP PTI
	[    0.333680] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-rc1+ #3
	[    0.334280] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
	[    0.335079] RIP: 0010:deactivate_slab.isra.0+0x5b/0x460
	[    0.335582] Code: 48 8b b4 c7 e0 00 00 00 49 8b 44 24 20 31 ff 48 85 c0 40 0f 95 c7 83 c7 0f 89 7c 24 18 48 85 d2 0f 84 a0 00 00 00 41 8b 4e 20 <48> 8b 3c 0b 48 85 ff 0f 84 8c 00 00 00 49 8b 54 24 28 48 89 04 0b
	[    0.337385] RSP: 0000:ffffffffb7e03c80 EFLAGS: 00010086
	[    0.337907] RAX: 0000000000000000 RBX: cccccccccccccccc RCX: 0000000000000008
	[    0.338688] RDX: cccccccccccccccc RSI: ffff91241c800f40 RDI: 000000000000000f
	[    0.339473] RBP: ffffffffb7e03d20 R08: ffff91241fc2d230 R09: 0000000000000000
	[    0.340256] R10: ffff91241c89c010 R11: 0000000000000000 R12: ffffcf2f20722700
	[    0.341041] R13: cccccccccccccccc R14: ffff91241c802180 R15: ffffcf2f20722700
	[    0.341833] FS:  0000000000000000(0000) GS:ffff91241fc00000(0000) knlGS:0000000000000000
	[    0.342727] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	[    0.343359] CR2: ffff911e74e01000 CR3: 000000027460a001 CR4: 00000000000606b0
	[    0.344146] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
	[    0.344929] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
	[    0.345727] Call Trace:
	[    0.345999]  ? setup_object_debug.isra.0+0x1d/0x40
	[    0.346525]  ? new_slab+0x195/0x340
	[    0.346909]  ? init_object+0x2f/0x80
	[    0.347305]  ___slab_alloc+0x526/0x570
	[    0.347717]  ? kasprintf+0x4e/0x70
	[    0.348092]  ? init_object+0x2f/0x80
	[    0.348488]  ? string+0x42/0x50
	[    0.348834]  ? kasprintf+0x4e/0x70
	[    0.349217]  __kmalloc_track_caller+0x1d2/0x200
	[    0.349720]  kvasprintf+0x64/0xc0
	[    0.350085]  kasprintf+0x4e/0x70
	[    0.350442]  ? kmem_cache_alloc_trace+0x188/0x1b0
	[    0.350962]  __irq_domain_alloc_fwnode+0x8f/0xd0
	[    0.351474]  arch_early_irq_init+0x16/0x90
	[    0.351923]  start_kernel+0x2aa/0x4c2
	[    0.352325]  secondary_startup_64+0xb6/0xc0
	[    0.352784] Modules linked in:
	[    0.353124] random: get_random_bytes called from print_oops_end_marker+0x21/0x40 with crng_init=0
	[    0.353126] ---[ end trace 186486c23e10986d ]---
	[    0.354613] RIP: 0010:deactivate_slab.isra.0+0x5b/0x460
	[    0.355186] Code: 48 8b b4 c7 e0 00 00 00 49 8b 44 24 20 31 ff 48 85 c0 40 0f 95 c7 83 c7 0f 89 7c 24 18 48 85 d2 0f 84 a0 00 00 00 41 8b 4e 20 <48> 8b 3c 0b 48 85 ff 0f 84 8c 00 00 00 49 8b 54 24 28 48 89 04 0b
	[    0.357255] RSP: 0000:ffffffffb7e03c80 EFLAGS: 00010086
	[    0.357829] RAX: 0000000000000000 RBX: cccccccccccccccc RCX: 0000000000000008
	[    0.358613] RDX: cccccccccccccccc RSI: ffff91241c800f40 RDI: 000000000000000f
	[    0.359398] RBP: ffffffffb7e03d20 R08: ffff91241fc2d230 R09: 0000000000000000
	[    0.360181] R10: ffff91241c89c010 R11: 0000000000000000 R12: ffffcf2f20722700
	[    0.360965] R13: cccccccccccccccc R14: ffff91241c802180 R15: ffffcf2f20722700
	[    0.361755] FS:  0000000000000000(0000) GS:ffff91241fc00000(0000) knlGS:0000000000000000
	[    0.362645] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	[    0.363275] CR2: ffff911e74e01000 CR3: 000000027460a001 CR4: 00000000000606b0
	[    0.364060] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
	[    0.364844] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
	[    0.365636] Kernel panic - not syncing: Attempted to kill the idle task!
	[    0.366393] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
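
For reference, a minimal sketch of how the faulting address in the splat
can be derived from the register dump. The bytes at the marked position
in the Code: line (48 8b 3c 0b) decode to a load from RBX + RCX, and
0xcc is the byte SLUB uses for the red zone of an active object
(SLUB_RED_ACTIVE in include/linux/poison.h). That RCX holds s->offset is
my reading of the preceding instruction, not something the log states.
The helper names below are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Red zone byte for an in-use object (SLUB_RED_ACTIVE in poison.h). */
#define RED_ACTIVE_BYTE 0xccu

/* Build a 64-bit word in which every byte equals `byte`. */
static uint64_t fill_word(uint8_t byte)
{
	return UINT64_C(0x0101010101010101) * byte;
}

/*
 * Effective address of a base-plus-index load such as
 * mov (%rbx,%rcx,1),%rdi. Hypothetical helper for this sketch.
 */
static uint64_t effective_address(uint64_t base, uint64_t index)
{
	return base + index;
}
```

With RBX filled with the red zone byte and RCX = 8, as in the dump,
effective_address(fill_word(RED_ACTIVE_BYTE), 8) gives
0xccccccccccccccd4, the non-canonical address in the fault message.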

Can you reproduce this? Let me know if you need more information.

Thanks,
-- Marco

> --- a/mm/slub.c~slub-relocate-freelist-pointer-to-middle-of-object
> +++ a/mm/slub.c
> @@ -3581,6 +3581,13 @@ static int calculate_sizes(struct kmem_c
>  		 */
>  		s->offset = size;
>  		size += sizeof(void *);
> +	} else if (size > sizeof(void *)) {
> +		/*
> +		 * Store freelist pointer near middle of object to keep
> +		 * it away from the edges of the object to avoid small
> +		 * sized over/underflows from neighboring allocations.
> +		 */
> +		s->offset = ALIGN(size / 2, sizeof(void *));
>  	}
>  
>  #ifdef CONFIG_SLUB_DEBUG
> _
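
To make the arithmetic in the hunk above concrete, here is a small
userspace sketch of the new branch. ALIGN_UP is a stand-in for the
kernel's ALIGN() macro, and freeptr_offset() is a hypothetical helper,
not a function in mm/slub.c. Returning 0 when the branch is not taken is
an assumption that models the pointer staying at the start of the
object.

```c
#include <stddef.h>

/*
 * Round x up to the next multiple of a (a must be a power of two).
 * Userspace stand-in for the kernel's ALIGN() macro.
 */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/*
 * Hypothetical helper mirroring the added "else if" branch: for an
 * object larger than a pointer, place the freelist pointer at the
 * pointer-aligned midpoint; otherwise leave it at offset 0 (assumed
 * default for this sketch).
 */
static size_t freeptr_offset(size_t size)
{
	if (size > sizeof(void *))
		return ALIGN_UP(size / 2, sizeof(void *));
	return 0;
}
```

On a 64-bit build, freeptr_offset(64) is 32 and freeptr_offset(16) is 8.
The input is whatever calculate_sizes() has accumulated in size by the
time the branch runs.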



