Re: Restarting AutoFS when a direct mount (with offsets) is busy

Leonardo Chiquitto <leonardo.lists@xxxxxxxxx> · Tue, 29 May 2012 12:57:47 -0400

On Tue, May 29, 2012 at 12:10 AM, Ian Kent <raven@xxxxxxxxxx> wrote:
> On Mon, 2012-01-16 at 16:56 -0300, Leonardo Chiquitto wrote:
>> On Thu, Dec 15, 2011 at 12:46 PM, Ian Kent <raven@xxxxxxxxxx> wrote:
>> > On Thu, 2011-12-15 at 23:43 +0800, Ian Kent wrote:
>> >> >
>> >> > So -- if I understood correctly the case you wanted to test -- the answer
>> >> > is yes: "s1" is unmounted when "v1" is in use.
>> >>
>> >> OK, thanks, I'll check this out when I get to it.
>> >
>> > Although that might actually be the correct behavior for deeper levels
>> > of nesting. I've suspected for a while that the handling of nesting may
>> > have been broken at some point, but hadn't had cause to look closely at
>> > it.
>>
>> Hello Ian,
>
> First let me apologize for not finding time to get to this sooner.
>
> TBH I thought it was going to be quite difficult and so let it slip
> because I had some other time critical things that needed to be done.
>
> But the analysis here is great and I believe it points at a fairly
> simple solution.
>
>>
>> I've returned to this issue today, to investigate why the mount triggers
>> on busy containers are being unmounted when AutoFS is stopped. I've
>> found the following code in umount_multi_triggers():
>>
>> 1560         /*
>> 1561          * Special case.
>> 1562          * If we can't umount the root container then we can't
>> 1563          * delete the offsets from the cache and we need to put
>> 1564          * the offset triggers back.
>> 1565          */
>> 1566         if (is_mounted(_PATH_MOUNTED, root, MNTS_REAL)) {
>> 1567             info(ap->logopt, "unmounting dir = %s", root);
>> 1568             if (umount_ent(ap, root)) {
>> 1569                 if (mount_multi_triggers(ap, me, root, strlen(root), "/") < 0)
>> 1570                     warn(ap->logopt,
>> 1571                          "failed to remount offset triggers");
>> 1572                 return left++;
>> 1573             }
>> 1574         }
>>
>> So the automounter really tries to recreate the triggers, but this
>> call to mount_multi_triggers() in line 1569 is failing. As far as I can
>> see, the shut down sequence when a busy container exists is:
>
> Yes, it does try and re-mount the offsets so that automount can
> re-connect to them at startup and as you have seen they need to be
> present for the re-construction.
>
>>
>> - <state changes to ST_SHUTDOWN>
>> - umount_autofs()
>> - umount_autofs_direct() => closes ap->kpipefd and sets it to -1
>> - do_umount_autofs_direct()
>> - umount_multi()
>> - umount_subtree_mounts()
>> - umount_multi_triggers() => successfully umounts all offsets (/nfs/{v1,v2})
>>     but fails to unmount /nfs because it is busy. Calls mount_multi_triggers()
>>     to recreate the offsets.
>> - mount_multi_triggers()
>>
>> Here are the (instrumented) logs of the attempt to create the /nfs/v1 trigger:
>>
>> Jan 16 16:28:38 n1 automount[14190]: mount_multi_triggers root: /nfs start: 4 base: /
>> Jan 16 16:28:38 n1 automount[14190]: calling cache_get_offset()
>> Jan 16 16:28:38 n1 automount[14190]: mount_multi_triggers: mount offset /nfs/v1 at /nfs
>> Jan 16 16:28:38 n1 automount[14190]: calling mount_autofs_offset()
>> Jan 16 16:28:38 n1 automount[14190]: entering mount_autofs_offset()
>> Jan 16 16:28:38 n1 automount[14190]: ops->version: 1317877376, ap->flags: 1, ap->kpipefd: -1
>> Jan 16 16:28:38 n1 automount[14190]: mountpoint: /nfs/v1 options: fd=-1,pgrp=14190,minproto=5,maxproto=5,offset
>> Jan 16 16:28:38 n1 automount[14190]: calling mkdir_path(/nfs/v1)
>> Jan 16 16:28:38 n1 automount[14190]: errno: 17
>> Jan 16 16:28:38 n1 automount[14190]: mount_autofs_offset: calling mount -t autofs -s  -o fd=-1,pgrp=14190,minproto=5,maxproto=5,offset automount /nfs/v1
>> Jan 16 16:28:38 n1 kernel: [22093.824826] autofs: called with bogus options
>> Jan 16 16:28:38 n1 automount[14190]: mount_autofs_offset: failed to mount offset trigger /nfs/v1 at /nfs/v1
>> Jan 16 16:28:38 n1 automount[14190]: failed to mount offset
>>
>> The important error here is from the kernel: "autofs: called with bogus
>> options". The bogus option it's complaining about is "fd=-1". So it looks
>> like we closed the kpipefd too soon, and it's needed to recreate the mount
>> triggers.
>
> Yep, and that's all it is, the kernel pipe is closed too soon.
>
> I think it's better to just leave the close until later.
> This patch should do the trick; can you test it, please?
>
>
> autofs-5.0.6 - fix umount recovery of busy direct mount
>
> From: Ian Kent <raven@xxxxxxxxxx>
>
> Reported by Leonardo Chiquitto (along with a problem analysis that led
> to the resolution). Thanks for the effort Leonardo.
>
> When umounting direct mounts at exit, if any are busy and contain offset
> trigger mounts, automount will try to re-mount them when the umount fails
> so they can be used to re-construct the mount tree at restart. But this
> fails because the kernel communication pipe, which is used as a parameter
> when mounting the offsets, has already been closed. To fix this, all we
> need to do is delay closing the kernel pipe file handle until after the
> direct mounts have been umounted, since this doesn't affect the in-use
> status of the mounts.
> ---
>
>  daemon/direct.c |   18 +++++++++---------
>  1 files changed, 9 insertions(+), 9 deletions(-)
>
>
> diff --git a/daemon/direct.c b/daemon/direct.c
> index b4c6d16..fd43b9a 100644
> --- a/daemon/direct.c
> +++ b/daemon/direct.c
> @@ -193,15 +193,6 @@ int umount_autofs_direct(struct autofs_point *ap)
>        struct mnt_list *mnts;
>        struct mapent *me, *ne;
>
> -       close(ap->state_pipe[0]);
> -       close(ap->state_pipe[1]);
> -       if (ap->pipefd >= 0)
> -               close(ap->pipefd);
> -       if (ap->kpipefd >= 0) {
> -               close(ap->kpipefd);
> -               ap->kpipefd = -1;
> -       }
> -
>        mnts = tree_make_mnt_tree(_PROC_MOUNTS, "/");
>        pthread_cleanup_push(mnts_cleanup, mnts);
>        nc = ap->entry->master->nc;
> @@ -231,6 +222,15 @@ int umount_autofs_direct(struct autofs_point *ap)
>        pthread_cleanup_pop(1);
>        pthread_cleanup_pop(1);
>
> +       close(ap->state_pipe[0]);
> +       close(ap->state_pipe[1]);
> +       if (ap->pipefd >= 0)
> +               close(ap->pipefd);
> +       if (ap->kpipefd >= 0) {
> +               close(ap->kpipefd);
> +               ap->kpipefd = -1;
> +       }
> +
>        return 0;
>  }
>

Hello Ian,

I tested the patch here and it works perfectly. Thanks a lot!
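
For anyone finding this thread in the archives, here is a minimal
standalone sketch of why the ordering matters. It is not autofs code and
it never calls mount(2): struct fake_point and the fake_point_*(),
build_offset_options() and recovery_has_valid_fd() helpers are
hypothetical names made up for illustration. Only the shape of the
option string is taken from the log above.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for the one autofs_point field involved here. */
struct fake_point {
	int kpipefd;	/* write end of the pipe, -1 once closed */
};

/* Open a pipe and keep its write end, standing in for ap->kpipefd. */
static int fake_point_init(struct fake_point *fp, int *read_end)
{
	int pipefd[2];

	if (pipe(pipefd) < 0)
		return -1;
	*read_end = pipefd[0];
	fp->kpipefd = pipefd[1];
	return 0;
}

/* Same close-and-invalidate pattern as the lines moved by the patch. */
static void fake_point_close(struct fake_point *fp)
{
	if (fp->kpipefd >= 0) {
		close(fp->kpipefd);
		fp->kpipefd = -1;
	}
}

/* Build an option string shaped like the one in the log above. */
static int build_offset_options(const struct fake_point *fp,
				char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len,
			"fd=%d,pgrp=%u,minproto=5,maxproto=5,offset",
			fp->kpipefd, (unsigned int) getpgrp());
}

/*
 * Simulate the recovery step. With close_first set, the pipe is closed
 * before the options are built (old order); otherwise it is closed
 * afterwards (patched order). Returns 1 if the options carry a usable fd.
 */
static int recovery_has_valid_fd(int close_first)
{
	struct fake_point fp;
	char opts[128];
	int read_end, fd = -1;

	if (fake_point_init(&fp, &read_end) < 0)
		return 0;
	if (close_first)
		fake_point_close(&fp);
	build_offset_options(&fp, opts, sizeof(opts));
	printf("%s order: %s\n", close_first ? "old" : "new", opts);
	sscanf(opts, "fd=%d", &fd);
	fake_point_close(&fp);	/* no-op if already closed */
	close(read_end);
	return fd >= 0;
}
```

Calling recovery_has_valid_fd(1) from a small main() builds the options
with fd=-1, which is the value the kernel rejected above, while
recovery_has_valid_fd(0) builds them with a descriptor that is still
open at that point.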

Leonardo