Re: Triggering multiple expiration timers / alarms

Ian Kent <raven@xxxxxxxxxx> · Tue, 15 Jan 2013 09:10:46 +0800

On Fri, 2013-01-04 at 15:24 -0200, Leonardo Chiquitto wrote:
> On Tue, Nov 6, 2012 at 11:32 AM, Leonardo Chiquitto
> <leonardo.lists@xxxxxxxxx> wrote:
> > On Wed, Oct 10, 2012 at 5:29 AM, Ian Kent <raven@xxxxxxxxxx> wrote:
> >> On Thu, 2012-10-04 at 22:42 -0300, Leonardo Chiquitto wrote:
> >>> Hi,
> >>>
> >>> From time to time I stumble with an instance of the automounter logging
> >>> more messages than it normally does. These messages are the result of
> >>> the expiration thread running and should appear every timeout/4 seconds
> >>> once for each automounted path:
> >>>
> >>> automount[8430]: st_expire: state 1 path /s
> >>> automount[8430]: expire_proc: exp_proc = 140472501561088 path /s
> >>> automount[8430]: expire_cleanup: got thid 140472501561088 path /s stat 0
> >>> automount[8430]: expire_cleanup: sigchld: exp 140472501561088
> >>> finished, switching from 2 to 1
> >>> automount[8430]: st_ready: st_ready(): state = 2 path /s
> >>>
> >>> In the cases I've found, the frequency of execution of the expiration thread
> >>> is much higher than usual. I've debugged this today and discovered that
> >>> every time a SIGHUP is sent to the daemon, a new timeout alarm is added.
> >>> This alarm will then run forever every timeout/4 seconds. If you send many
> >>> SIGHUPs you'll notice that soon enough the automounter will be running
> >>> the expiration thread all the time.
> >>>
> >>> The code that adds this alarm is in do_readmap_cleanup():
> >>>
> >>>  323 static void do_readmap_cleanup(void *arg)
> >>>  324 {
> >>>  325         struct readmap_args *ra;
> >>>  326         struct autofs_point *ap;
> >>>  327
> >>>  328         ra = (struct readmap_args *) arg;
> >>>  329
> >>>  330         ap = ra->ap;
> >>>  331
> >>>  332         st_mutex_lock();
> >>>  333
> >>>  334         ap->readmap_thread = 0;
> >>>  335         st_set_done(ap);
> >>>  336         if (!ap->submount)
> >>>  337                 alarm_add(ap, ap->exp_runfreq);    <= here
> >>>  338         st_ready(ap);
> >>>  339
> >>>  340         st_mutex_unlock();
> >>>  341
> >>>  342         free(ra);
> >>>  343
> >>>  344         return;
> >>>  345 }
> >>>
> >>> Besides the SIGHUP case, this problem can also be triggered by adding a
> >>> new key to a file map and then trying to mount it. The lookup failure will
> >>> cause a map re-read and then a new alarm will be added.
> >>>
> >>> This part of the code is very old and I couldn't figure out why it needs to
> >>> add the alarm at that point. Ian, could you help?
> >>
> >> But nevertheless there should be an alarm delete somewhere in there but
> >> there isn't and the same applies to the prune (SIGUSR1) at least.
> >
> > Ian,
> >
> > What about the patch below?
> >
> > Leonardo
> >
> > Don't schedule new alarms after handling SIGHUP and SIGUSR1
> >
> > Currently, a new alarm is scheduled every time the daemon receives
> > a SIGHUP (map re-read) or SIGUSR1 (forced expiration). Besides that,
> > map re-reads started on demand when a map is found to be outdated
> > also generate a new alarm.
> >
> > Once added, these alarms are never deleted and hence increase the
> > number of times the daemon wakes up to run the expiration procedure.
> > After a couple of months, in setups with many mount points, it's
> > normal to see automount waking up every second to handle the
> > expiration timer.
> >
> > This patch removes the alarm scheduling from the readmap cleanup
> > routine and makes sure the alarm is re-added after the expiration
> > process only when it was not triggered by SIGUSR1.
> >
> > I couldn't think of any use case to justify keeping these alarms:
> > it's critical to have the alarm ticking every timeout/4 seconds,
> > but more than one periodic alarm running doesn't seem to make
> > sense.
> >
> > Index: autofs-5.0.7/daemon/state.c
> > ===================================================================
> > --- autofs-5.0.7.orig/daemon/state.c
> > +++ autofs-5.0.7/daemon/state.c
> > @@ -144,7 +144,7 @@ void expire_cleanup(void *arg)
> >                                         ap->submount = 2;
> >                         }
> >
> > -                       if (!ap->submount)
> > +                       if (ap->state == ST_EXPIRE && !ap->submount)
> >                                 alarm_add(ap, ap->exp_runfreq);
> >
> >                         /* FALLTHROUGH */
> > @@ -326,17 +326,12 @@ static void do_readmap_cleanup(void *arg
> >         struct autofs_point *ap;
> >
> >         ra = (struct readmap_args *) arg;
> > -
> >         ap = ra->ap;
> >
> >         st_mutex_lock();
> > -
> >         ap->readmap_thread = 0;
> >         st_set_done(ap);
> > -       if (!ap->submount)
> > -               alarm_add(ap, ap->exp_runfreq);
> >         st_ready(ap);
> > -
> >         st_mutex_unlock();
> >
> >         free(ra);
> > Index: autofs-5.0.7/CHANGELOG
> > ===================================================================
> > --- autofs-5.0.7.orig/CHANGELOG
> > +++ autofs-5.0.7/CHANGELOG
> > @@ -15,6 +15,7 @@
> >  - fix recursive mount deadlock.
> >  - increase file map read buffer size.
> >  - handle new location of systemd.
> > +- don't reschedule alarm after signals.
> >
> >  25/07/2012 autofs-5.0.7
> >  =======================
> 
> Hello Ian,
> 
> Any comment on this patch?

Thought about this again and came to the same conclusion.

On one hand I don't think we want an asynchronous expires during the
readmap but on the other hand the expire might block for a lengthy time
so deleting them for the duration of the readmap probably isn't the way
to go either.

Maybe, for now, it is best to just fix the problem of the extra expires
and worry about the above later.

I'll add this patch to the queue and have a think about it before I
commit it and push it.

Ian

--
To unsubscribe from this list: send the line "unsubscribe autofs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html