> -----Original Message-----
> From: Williams, Dan J [mailto:dan.j.williams@xxxxxxxxx]
> Sent: Tuesday, September 06, 2011 9:10 PM
> To: Kwolek, Adam
> Cc: neilb@xxxxxxx; linux-raid@xxxxxxxxxxxxxxx; Ciechanowski, Ed;
> Neubauer, Wojciech
> Subject: Re: [PATCH] Fix: Sometimes mdmon throws core dump during
> reshape
>
> On Mon, Sep 5, 2011 at 3:39 AM, Adam Kwolek <adam.kwolek@xxxxxxxxx>
> wrote:
> > Problem was found during reshaping 2 volumes /raid0 and raid5/ in
> > container.
> > Sometimes mdmon throws core dump due to NULL pointer exception.
> >
> > Problem occurs in scenario:
> > - managemon: is about spare activation (degraded raid4 volume == raid0
> >   under takeover)
> > - managemon: detect level change and signals monitor (manage_member()
> >   calls replace_array())
> > - monitor: detects transition raid4/5->raid0 and sets a->container to
> >   NULL to indicate array deactivation
>
> Maybe I have lost track of the reshape implementation but I don't see
> where the monitor sets ->container to NULL during a reshape?  Do you
> mean deactivate mdmon for the array after the reshape completes?
>
> > - managemon : continues his work and tries to activate spare
> >   (a->check_degraded is set).
> >   NULL pointer is passed to metadata handler activate_spare()
> >   Core dump is generated.
> >
> > To resolve this situation managemon (after monitor kick) checks again
> > a->container pointer to learn if current array is not to be
> > deactivated.

Yes, when takeover is used. On the one hand, mdmon tries to resolve the
degradation "problem" of the raid0 volume that is under takeover, and in
the meantime the backward takeover occurs.

BR
Adam

> [..]
> > diff --git a/managemon.c b/managemon.c
> > index d020f82..3540dac 100644
> > --- a/managemon.c
> > +++ b/managemon.c
> > @@ -475,6 +475,12 @@ static void manage_member(struct mdstat_ent *mdstat,
> >                }
> >        }
> >
> > +       /* we are after monitor kick,
> > +        * so container field can be cleared - check it again
> > +        */
> > +       if (a->container == NULL)
> > +               return;
> > +
>
> Isn't this still racy?  Because we don't wait for the monitor to run
> before proceeding.
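
To make the ordering question concrete, here is a minimal single-threaded
sketch of the two possible interleavings. It is only an illustration and
not mdmon code: the structs are simplified stand-ins, and monitor_pass()
and handler_sees_null() are hypothetical names. It assumes the monitor
clears a->container on the raid4/5->raid0 transition as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the real structures (assumption). */
struct container { int unused; };

struct active_array {
	struct container *container;	/* NULL means: array is being deactivated */
	int check_degraded;		/* managemon wants to activate a spare */
};

/* Models the monitor noticing raid4/5->raid0 and deactivating the array. */
static void monitor_pass(struct active_array *a)
{
	assert(a != NULL);
	a->container = NULL;
}

/*
 * Models managemon after the monitor kick.
 * Returns 1 if the metadata handler would be handed a NULL container,
 * 0 if the call is skipped or the pointer is still valid.
 */
static int handler_sees_null(int monitor_runs_before_check)
{
	struct container c = { 0 };
	struct active_array a = { &c, 1 };

	if (monitor_runs_before_check)
		monitor_pass(&a);

	/* the guard added by the patch */
	if (a.container == NULL)
		return 0;

	if (!a.check_degraded)
		return 0;

	/* the monitor may also run here, after the guard has passed */
	if (!monitor_runs_before_check)
		monitor_pass(&a);

	/* this is the value activate_spare() would receive */
	return a.container == NULL;
}
```

With the monitor running before the guard, the call is skipped safely.
With the monitor running after the guard, the NULL pointer still reaches
the handler, so in this model the check narrows the window but does not
close it unless managemon waits for the monitor to finish its pass.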