On Mon, 2014-01-20 at 14:37 +1100, NeilBrown wrote:
>
> Thanks - that extra info is quite useful.  Knowing that nothing else unusual
> is happening can be quite valuable (and I don't like to assume).
>
> I haven't found anything that would clearly cause your crash, but I have
> found something that looks wrong and conceivably could.
>
> Could you please try this patch on top of what you are currently using?  By
> the look of it you get a crash at least every day, often more often.  So if
> this produces a day with no crashes, that would be promising.
>
> The important aspect of the patch is that it moves the "atomic_inc" of
> "sh->count" back under the protection of ->device_lock in the case when some
> other thread might be using the same 'sh'.

I have been unable to trip this up, so this was it!

Tested-by: Ian Kumlien <ian.kumlien@xxxxxxxxx>

I hope this hits stable ASAP ;)

> Thanks,
> NeilBrown
>
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 3088d3af5a89..03f82ab87d9e 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -675,8 +675,10 @@ get_active_stripe(struct r5conf *conf, sector_t sector,
>  					  || !conf->inactive_blocked),
>  					 *(conf->hash_locks + hash));
>  				conf->inactive_blocked = 0;
> -			} else
> +			} else {
>  				init_stripe(sh, sector, previous);
> +				atomic_inc(&sh->count);
> +			}
>  		} else {
>  			spin_lock(&conf->device_lock);
>  			if (atomic_read(&sh->count)) {
> @@ -695,13 +697,11 @@ get_active_stripe(struct r5conf *conf, sector_t sector,
>  					sh->group = NULL;
>  				}
>  			}
> +			atomic_inc(&sh->count);
>  			spin_unlock(&conf->device_lock);
>  		}
>  	} while (sh == NULL);
>  
> -	if (sh)
> -		atomic_inc(&sh->count);
> -
>  	spin_unlock_irq(conf->hash_locks + hash);
>  	return sh;
>  }
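
For anyone following the thread, the locking rule the patch restores (take the reference on 'sh' before dropping ->device_lock, not after) can be illustrated in userspace. What follows is only a rough sketch and not kernel code: it assumes a pthread mutex standing in for the spinlock and C11 atomics standing in for atomic_t, and the names struct stripe, on_inactive, stripe_get, stripe_put and run_demo are all made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct stripe_head; not the kernel type. */
struct stripe {
	atomic_int count;	/* active references, like sh->count */
	bool on_inactive;	/* "on the inactive list"; guarded by device_lock */
};

/* Userspace stand-in for conf->device_lock (assumption: a plain mutex). */
static pthread_mutex_t device_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int violations;

/* Take a reference. The increment happens before the lock is dropped,
 * so nobody inspecting the stripe under the lock can see count == 0
 * once we have decided to use it. */
static void stripe_get(struct stripe *sh)
{
	pthread_mutex_lock(&device_lock);
	if (atomic_load(&sh->count) == 0)
		sh->on_inactive = false;	/* take it off the inactive list */
	atomic_fetch_add(&sh->count, 1);	/* still under device_lock */
	pthread_mutex_unlock(&device_lock);
}

/* Drop a reference; the last put marks the stripe inactive again. */
static void stripe_put(struct stripe *sh)
{
	pthread_mutex_lock(&device_lock);
	if (atomic_fetch_sub(&sh->count, 1) == 1)
		sh->on_inactive = true;
	pthread_mutex_unlock(&device_lock);
}

struct worker_arg {
	struct stripe *sh;
	int iters;
};

static void *worker(void *p)
{
	struct worker_arg *a = p;

	for (int i = 0; i < a->iters; i++) {
		stripe_get(a->sh);
		/* We hold a reference, so the stripe must not be inactive. */
		pthread_mutex_lock(&device_lock);
		if (a->sh->on_inactive)
			atomic_fetch_add(&violations, 1);
		pthread_mutex_unlock(&device_lock);
		stripe_put(a->sh);
	}
	return NULL;
}

/* Runs nthreads workers against one stripe and returns the number of
 * invariant violations plus any inconsistency in the final state. */
static int run_demo(int nthreads, int iters)
{
	pthread_t tids[16];
	struct stripe sh;
	struct worker_arg arg = { &sh, iters };
	int bad;

	assert(nthreads > 0 && nthreads <= 16);
	atomic_init(&sh.count, 0);
	sh.on_inactive = true;
	atomic_store(&violations, 0);

	for (int i = 0; i < nthreads; i++) {
		int rc = pthread_create(&tids[i], NULL, worker, &arg);
		assert(rc == 0);
	}
	for (int i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);

	bad = atomic_load(&violations);
	if (atomic_load(&sh.count) != 0 || !sh.on_inactive)
		bad++;
	return bad;
}
```

Calling run_demo(4, 20000) from a main() and building with -pthread should report no violations, because every change to count and on_inactive happens under the same lock. Moving the atomic_fetch_add() in stripe_get() to after the unlock would open a window in which another thread sees count == 0 on a stripe that is about to be used, which is the kind of gap the patch description refers to.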