Re: Help with autofs hang

"NeilBrown" <neilb@xxxxxxx> · Fri, 03 Mar 2023 07:34:46 +1100

On Thu, 02 Mar 2023, Ian Kent wrote:
> On 20/2/23 12:49, NeilBrown wrote:
> > On Mon, 20 Feb 2023, Ian Kent wrote:
> >> On 20/2/23 06:40, NeilBrown wrote:
> >>> Hi,
> >>>    I have a customer who is experiencing problems with automountd.  I
> >>>    think I know what is happening, but I'm not sure if what I imagine is
> >>>    possible, or what the best solution is.
> >>>
> >>>    The kernel is 4.12 and automountd is 5.1.3 - so not the newest, but not
> >>>    ancient.  I cannot see any changes since that look like they might be
> >>>    relevant.
> >>>
> >>>    The problem is that after a while automountd stops expiring direct
> >>>    mounts, and doesn't mount any new direct mounts that are added to the
> >>>    map.
> >>>    
> >>>    When this happens an automountd thread has sent an
> >>>    AUTOFS_IOC_EXPIRE_MULTI ioctl to the kernel, the kernel has sent a
> >>>    NFY_EXPIRE back up to automountd.  automountd reported
> >>>
> >>>      handle_packet_expire_direct: can't find map entry for ....
> >>>
> >>>    and the kernel never gets an ACK for the message and things hang.
> >> Yes, that case is fatal.
> >>
> >> Because the kernel communications pipe might not be able to convey
> >> the direct mount path, a bogus value is encoded into the packet and
> >> an inode-number-to-path index is used to look up the path. Without
> >> the path we can't continue.
> >>
> >> But this hasn't happened to me for a long time.
> >>
> >>>    When I look, the mount point that the kernel is asking automountd to
> >>>    expire has already been unmounted.
> >> That's not right ...
> >>
> >>>    The mount map uses LDAP and changes quite often.  My guess is that
> >>>    automountd notices that some directory has been removed from the map,
> >>>    and so removes the map entry.  This presumably races with the expiry
> >>>    process.  The mount gets unmounted because it is removed from the map
> >>>    at the same time that expiry wants to remove it, and confusion results.
> >> That sounds different to the terminology I'd use but I think I get what
> >> you're saying.
> >>
> >> I would describe it as, a map entry has been removed from the map when
> >> it's in use causing expires for that map entry to be done on an entry
> >> that's been removed from the index we need for the map entry lookup.
> >> This map entry shouldn't be removed in this case.
> >>
> >>>    
> >>>    My current thought for a solution is to change the way the kernel waits
> >>>    for NFY_EXPIRE replies.  Instead of waiting indefinitely it waits with
> >>>    a timeout.  If the wait times out and the filesystem is still mounted,
> >>>    it just loops around and waits again.  If after the timeout the
> >>>    filesystem has been unmounted it waits one more time (just in case
> >>>    automountd is about to reply) and then aborts the wait with -EAGAIN.
> >>>    I've provided the customer with a patch to do this using a 5 second
> >>>    wait.  I don't have test results yet.
> >> I really don't think this is a kernel problem, it's a user space problem.
> >>
> >> Some time ago there was a weird case where an active map entry was being
> >> removed from the map entry cache. I had a little trouble even working out
> >> what I had done when I came across it in a clean up a while ago. So if
> >> this is what you're seeing we'll need to do some work to work out what
> >> I saw and what I was doing to fix it.
> >>
> >> Let me check 5.1.3 and get back to you.
> >>
> >>>    So my questions are:
> >>>     - is this race really possible? Can removal-from-map race with expiry?
> >> Well, maybe but it shouldn't because walking into an expiring mount
> >> or one that's being mounted shouldn't be possible and I haven't seen
> >> symptoms of that happening for a very long time, certainly not with
> >> a kernel as recent as 4.12.
> >>
> >> I really think it's a mistake I'm making in the user space code.
> >>
> >>>     - is my timeout fix reasonable?  Might it cause other problems?  Is
> >>>       there a better way to fix this inside automountd?
> >> Probably and don't know.
> >>
> >> I think user space is the problem here and I suspect trying to change
> >> the kernel won't actually fix the problem because it's a user space
> >> mistake that could still happen.
> >>
> >> I'm not sure about the wisdom of my not trying to recover from this
> >> either. Originally it was done because if this happened things would
> >> only get worse and the problem would become hidden. So I made the fail
> >> fatal so I could get a core of the state at the time it happened and
> >> that would be more likely to yield information about the cause. And
> >> this should never happen so the only choice is to fix it.
> >>
> > Thanks - you've given me some useful pointers.  I'll look some more.
> >
> > I have a core of automountd while it is hanging (so after the initial
> > problem) and also a core of the kernel.  So if you do find more time to
> > look and want me to find something in a core file, just let me know.
> 
> Umm ... sounds like you didn't see my second reply to this.
>
> It refers to a commit that resolves a problem that sounds a lot like
> what you're seeing?
>
> https://www.spinics.net/lists/autofs/msg02557.html
>

Hi Ian,
 I did see that - I should have replied.

 I agree the issue that patch addresses is superficially similar.
 It involves direct mounts not being expired.
 However I have other symptoms that don't match.  Specifically:
 1/ new direct mounts that appear in the map don't take effect
 2/ an automount thread is blocked on an EXPIRE ioctl and when
    woken up (e.g. just running strace of the process can do that)
    the symptoms disappear.

 The fix in the patch is to mark a cache entry as stale so future
 lookups won't find it.  This is almost exactly the reverse of what I
 think I need.  When automount receives a NFY_EXPIRE from the kernel it
 fails to find any matching entry in the cache.

 I've gone over the core files I have and still have not found anything
 definitive.

 The vmcore of the kernel makes it clear that the dev/ino in the
 NFY_EXPIRE request correctly match the dev/ino of the fd that is passed
 down in the EXPIRE ioctl.
 The core of automount (taken at a different instance of the bug so I
 cannot compare dev/ino between core and vmcore) shows that the cache
 entry which contains that fd is still in the cache and a lookup by
 dev/ino would find it.

 That seems to suggest that the dev/ino in a cache entry do not match
 the ioctl_fd.  However I've hunted in the code and cannot find any way
 that would happen.
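
 One way to test that hypothesis directly would be to fstat() each
 cached ioctl fd and compare the result with the dev/ino recorded in
 the entry.  What follows is only a sketch of that check, written in
 Python rather than automount's C.  CacheEntry, its field names and
 find_mismatches() are hypothetical stand-ins, not automount's actual
 structures.

```python
import os
from dataclasses import dataclass


@dataclass
class CacheEntry:
    """Hypothetical stand-in for a direct-mount map entry."""
    key: str        # mount point path
    dev: int        # device number recorded when the entry was indexed
    ino: int        # inode number recorded when the entry was indexed
    ioctl_fd: int   # fd that would be passed down in the EXPIRE ioctl


def find_mismatches(entries):
    """Return (key, recorded, actual) for entries whose recorded dev/ino
    differ from what fstat() reports for their ioctl fd."""
    bad = []
    for e in entries:
        st = os.fstat(e.ioctl_fd)
        actual = (st.st_dev, st.st_ino)
        if actual != (e.dev, e.ino):
            bad.append((e.key, (e.dev, e.ino), actual))
    return bad
```

 An empty result would mean every entry agrees with its fd, which
 would point away from this explanation.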

 I've provided the customer with an 'automount' package which calls
 abort() when the lookup fails in the NFY_EXPIRE handler.  Hopefully
 this will provide more clues.
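
 As a minimal model of that debugging change (not the actual patch):
 the lookup is keyed by dev/ino, and a miss aborts the process at once
 so the core reflects the state at the moment of failure.  The
 function name, the dict-based index and the on_miss hook are all
 hypothetical; the real handler is C code inside automount.

```python
import os


def handle_expire_direct(index, dev, ino, on_miss=os.abort):
    """Look up the map entry for an expire packet by (dev, ino).

    index is a dict mapping (dev, ino) to a map entry (hypothetical).
    On a miss, on_miss() is called; the default, os.abort(), raises
    SIGABRT, which normally produces a core dump if core dumps are
    enabled for the process.
    """
    entry = index.get((dev, ino))
    if entry is None:
        on_miss()
        return None  # only reached when on_miss is overridden
    return entry
```

 The on_miss parameter exists only so the model can be exercised
 without killing the interpreter.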

Thanks,
NeilBrown