On Wed, 21 Mar 2018, Christopher Lameter wrote: > On Wed, 21 Mar 2018, Mikulas Patocka wrote: > > > For example, if someone creates a slab cache with the flag SLAB_CACHE_DMA, > > and he allocates an object from this cache and this allocation races with > > the user writing to /sys/kernel/slab/cache/order - then the allocator can > > for a small period of time see "s->allocflags == 0" and allocate a non-DMA > > page. That is a bug. > > True we need to fix that: > > Subject: Avoid potentially visible allocflags without all flags set > > During slab size recalculation s->allocflags may be temporarily set > to 0 and thus the flags may not be set which may result in the wrong > flags being passed. Slab size calculation happens in two cases: > > 1. When a slab is created (which is safe since we cannot have > concurrent allocations) > > 2. When the slab order is changed via /sysfs. > > Signed-off-by: Christoph Lameter <cl@xxxxxxxxx> > > > Index: linux/mm/slub.c > =================================================================== > --- linux.orig/mm/slub.c > +++ linux/mm/slub.c > @@ -3457,6 +3457,7 @@ static void set_cpu_partial(struct kmem_ > static int calculate_sizes(struct kmem_cache *s, int forced_order) > { > slab_flags_t flags = s->flags; > + gfp_t allocflags; > size_t size = s->object_size; > int order; > > @@ -3551,16 +3552,17 @@ static int calculate_sizes(struct kmem_c > if (order < 0) > return 0; > > - s->allocflags = 0; > + allocflags = 0; > if (order) > - s->allocflags |= __GFP_COMP; > + allocflags |= __GFP_COMP; > > if (s->flags & SLAB_CACHE_DMA) > - s->allocflags |= GFP_DMA; > + allocflags |= GFP_DMA; > > if (s->flags & SLAB_RECLAIM_ACCOUNT) > - s->allocflags |= __GFP_RECLAIMABLE; > + allocflags |= __GFP_RECLAIMABLE; > > + s->allocflags = allocflags; I'd also use "WRITE_ONCE(s->allocflags, allocflags)" here and when writing s->oo and s->min to avoid some possible compiler misoptimizations. WRITE_ONCE should be used when writing a value that can be read simultaneously (but a lot of kernel code misses it). Another problem is that it updates s->oo and later it updates s->max: s->oo = oo_make(order, size, s->reserved); s->min = oo_make(get_order(size), size, s->reserved); if (oo_objects(s->oo) > oo_objects(s->max)) s->max = s->oo; --- so, the concurrently running code could see s->oo > s->max, which could trigger some memory corruption. s->max is only used in memory allocations - kmalloc(BITS_TO_LONGS(oo_objects(s->max)) * sizeof(unsigned long)), so perhaps we could fix the bug by removing s->max at all and always allocating enough memory for the maximum possible number of objects? - kmalloc(BITS_TO_LONGS(oo_objects(s->max)) * sizeof(unsigned long), GFP_KERNEL); + kmalloc(BITS_TO_LONGS(MAX_OBJS_PER_PAGE) * sizeof(unsigned long), GFP_KERNEL); Mikulas > /* > * Determine the number of objects per slab > */ >