Re: File descriptor leak when reloading the daemon

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2012-08-21 at 00:03 -0300, Leonardo Chiquitto wrote:
> On Wed, Aug 15, 2012 at 9:46 PM, Ian Kent <raven@xxxxxxxxxx> wrote:
> > On Tue, 2012-08-14 at 22:28 -0300, Leonardo Chiquitto wrote:
> >> Hello,
> >>
> >> A customer reported that AutoFS may leak file descriptors when some
> >> maps are modified and the daemon reloaded. I'm able to reproduce the
> >> problem on 5.0.7 by following these steps:
> >>
> >> 1. Configure a simple direct mount:
> >>
> >> # cat /etc/auto.master
> >> /-    /etc/auto.direct
> >>
> >> # cat /etc/auto.direct
> >> /nfs   server:/nfs
> >>
> >> 2. Start the automounter and do NOT trigger the mount
> >>
> >> 3. Replace /etc/auto.direct with:
> >>
> >> # cat /etc/auto.direct
> >> /nfs/1  server:/nfs
> >> /nfs/2  server:/nfs
> >>
> >> 4. Reload:
> >>
> >> # kill -HUP $(pidof automount)
> >>
> >> From now on, every reload will leak a file descriptor:
> >>
> >> # ls -la /proc/$(pidof automount)/fd | grep /nfs
> >> lr-x------ 1 root root 64 Aug 14 22:08 11 -> /nfs
> >> lr-x------ 1 root root 64 Aug 14 22:08 12 -> /nfs
> >> lr-x------ 1 root root 64 Aug 14 22:08 13 -> /nfs
> >> lr-x------ 1 root root 64 Aug 14 22:08 14 -> /nfs
> >> lr-x------ 1 root root 64 Aug 14 22:08 5 -> /nfs
> >>
> >> I've investigated the problem and discovered that the leak happens in
> >> do_umount_autofs_direct():
> >>
> >> int do_umount_autofs_direct(struct autofs_point *ap, struct mnt_list
> >> *mnts, struct mapent *me)
> >> {
> >> (...)
> >>       if (me->ioctlfd != -1) {
> >>               if (tree_is_mounted(mnts, me->key, MNTS_REAL)) {
> >>                       error(ap->logopt,
> >>                             "attempt to umount busy direct mount %s",
> >>                             me->key);
> >>                       return 1;
> >>               }
> >>               ioctlfd = me->ioctlfd;
> >>       } else  // ioctlfd == -1
> >>               ops->open(ap->logopt, &ioctlfd, me->dev, me->key);  <= we open it here
> >>
> >>       if (ioctlfd >= 0) {
> >>               unsigned int status = 1;
> >>
> >>               rv = ops->askumount(ap->logopt, ioctlfd, &status);
> >>                               /// at this point, rv == 0 and status == 0
> >>               if (rv) {
> >>                       char *estr = strerror_r(errno, buf, MAX_ERR_BUF);
> >>                       error(ap->logopt, "ioctl failed: %s", estr);
> >>                       return 1;
> >>               } else if (!status) {
> >>                               /// at this point, ap->state == ST_READMAP
> >>                       if (ap->state != ST_SHUTDOWN_FORCE) {
> >>                               error(ap->logopt,
> >>                                     "ask umount returned busy for %s",
> >>                                     me->key);
> >>                               return 1;                       <= we return here, without closing the fd
> >>                       } else {
> >>                               me->ioctlfd = -1;
> >>                               ops->catatonic(ap->logopt, ioctlfd);
> >>                               ops->close(ap->logopt, ioctlfd);
> >>                               goto force_umount;
> >>                       }
> >> (...)
> >
> > Yes, that does leak descriptors for the case where it was opened in the
> > function. I see the same problem in umount_autofs_offset() too.
> 
> Indeed, thanks for the pointer.
> 
> >> I wrote a patch to workaround the leak because I still don't know how
> >> to fix the root cause, which I believe is to remove all references to the
> >> old map entry and unmount everything before creating the new entries.
> >
> > That would be best but the old entry might have an active mount which
> > means it needs to be kept around.
> 
> By active mount you mean a real NFS mount or just the AutoFS trigger?

Yep.

> 
> In the example I used above, the problem happens when there's only the
> AutoFS trigger in /nfs (no real NFS volume mounted). When there is
> an NFS volume mounted, everything seems to work fine and there's no
> leak.

Yeah, that is a curious symptom.

When there is no active NFS (or other) mount present the ->askumount()
call should return a status of one so this shouldn't happen. But that
may depend on the kernel version since older kernels might allow path
walks into the mount when the map read is going on in some cases.

But there's also the shared subtree code which might be causing the
mount to appear busy for a while after it is actually not. That's why
you see the umount retry a bit later in the function, umount(2) often
returns EBUSY for a while after the last file handle is closed and that
started happening following the shared subtree changes. I could never
track down why it does that so couldn't do anything about it.

> 
> >> There's also some open points in the workaround:
> >> - should we call ops->catatonic() on the ioctlfd like it's done in the other
> >>   error paths?
> >
> > Probably should do that before the umount since it prevents any further
> > requests from being done and completes any pending requests.
> >
> >> - I've opted to close ioctlfd only if it was opened in this function. Is
> >>   this correct or should we close it regardless if it was opened before
> >>   the call?
> >
> > Not sure, I'll need to have a closer look.
> > If it has an open file handle it probably has an active mount which
> > means the descriptor should remain open for subsequent expires to work.
> 
> I've found another collateral effect of the workaround patch (not sure
> if this is exactly what you said above): when I simulate the problem with
> the patched daemon and then stop it, the AutoFS trigger won't be
> unmounted.
> 
> I'll try to make more progress on this.
> 
> Thanks,
> Leonardo


--
To unsubscribe from this list: send the line "unsubscribe autofs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Filesystem Development]     [Linux Ext4]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux