On Tue, 2017-05-23 at 22:08 +0200, Greg Kroah-Hartman wrote:
> 4.4-stable review patch.  If anyone has any objections, please let me know.
>
> ------------------
>
> From: Dennis Yang <dennisyang@xxxxxxxx>
>
> commit 583da48e388f472e8818d9bb60ef6a1d40ee9f9d upstream.
>
> When growing raid5 device on machine with small memory, there is chance that
> mdadm will be killed and the following bug report can be observed. The same
> bug could also be reproduced in linux-4.10.6.
[...]
> The problem is that resize_stripes() releases new stripe_heads before assigning new
> slab cache to conf->slab_cache. If the shrinker function raid5_cache_scan() gets called
> after resize_stripes() starting releasing new stripes but right before new slab cache
> being assigned, it is possible that these new stripe_heads will be freed with the old
> slab_cache which was already been destoryed and that triggers this bug.
[...]
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -2232,6 +2232,10 @@ static int resize_stripes(struct r5conf
>  		err = -ENOMEM;
>
>  	mutex_unlock(&conf->cache_size_mutex);
> +
> +	conf->slab_cache = sc;
> +	conf->active_name = 1-conf->active_name;
> +
>  	/* Step 4, return new stripes to service */
>  	while(!list_empty(&newstripes)) {
>  		nsh = list_entry(newstripes.next, struct stripe_head, lru);
[...]

The assignments are still being done after conf->cache_size_mutex is
unlocked, so there still seems to be a race with raid5_cache_scan().
Shouldn't they be moved above the mutex_unlock()?

Ben.

-- 
Ben Hutchings
Software Developer, Codethink Ltd.
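
For illustration only, here is a minimal userspace sketch of the ordering the question above asks about: updating the cache pointer and active_name before the mutex is released. It is not kernel code and not the patch itself. The struct and function names (conf_model, publish_new_cache, current_cache_id) are hypothetical stand-ins, pthread mutexes replace the kernel mutex, and it assumes the reader side takes the same lock before touching the cache.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a slab cache; not the kernel's kmem_cache. */
struct cache {
	int id;
};

/* Hypothetical, heavily reduced model of the fields discussed above. */
struct conf_model {
	pthread_mutex_t cache_size_mutex;
	struct cache *slab_cache;
	int active_name;
};

/*
 * Publish the new cache and flip active_name while the lock is still
 * held, so a reader that takes the same lock sees either the old pair
 * or the new pair, never a mix of the two.
 */
static void publish_new_cache(struct conf_model *conf, struct cache *sc)
{
	pthread_mutex_lock(&conf->cache_size_mutex);
	conf->slab_cache = sc;
	conf->active_name = 1 - conf->active_name;
	pthread_mutex_unlock(&conf->cache_size_mutex);
}

/* Models a shrinker-style reader: lock first, then look at the cache. */
static int current_cache_id(struct conf_model *conf)
{
	int id;

	pthread_mutex_lock(&conf->cache_size_mutex);
	id = conf->slab_cache->id;
	pthread_mutex_unlock(&conf->cache_size_mutex);
	return id;
}
```

Whether this ordering is sufficient in resize_stripes() depends on how raid5_cache_scan() actually takes the lock, which the quoted hunk does not show.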