On Mon, 20 Feb 2023, Ian Kent wrote:
> On 20/2/23 06:40, NeilBrown wrote:
> > Hi,
> >  I have a customer who is experiencing problems with automountd.  I
> >  think I know what is happening, but I'm not sure if what I imagine is
> >  possible, or what the best solution is.
> >
> > The kernel is 4.12 and automountd is 5.1.3 - so not the newest, but not
> > ancient.  I cannot see any changes since that look like they might be
> > relevant.
> >
> > The problem is that after a while automountd stops expiring direct
> > mounts, and doesn't mount any new direct mounts that are added to the
> > map.
> >
> > When this happens an automountd thread has sent an
> > AUTOFS_IOC_EXPIRE_MULTI ioctl to the kernel, the kernel has sent a
> > NFY_EXPIRE back up to automountd.  automountd reported
> >
> >    handle_packet_expire_direct: can't find map entry for ....
> >
> > and the kernel never gets an ACK for the message and things hang.
>
> Yes, that case is fatal.
>
> Because the kernel communications pipe might not be able to convey
> the direct mount path, a bogus value is encoded into the packet and
> an inode-number-to-path index is used to look up the path. Without
> the path we can't continue.
>
> But this hasn't happened to me for a long time.
>
> >
> > When I look, the mount point that the kernel is asking automountd to
> > expire has already been unmounted.
>
> That's not right ...
>
> >
> > The mount map uses LDAP and changes quite often.  My guess is that
> > automountd notices that some directory has been removed from the map,
> > and so removes the map entry.  This presumably races with the expiry
> > process.  The mount gets unmounted because it is removed from the map
> > at the same time that expiry wants to remove it, and confusion results.
>
> That sounds different to the terminology I'd use but I think I get what
> you're saying.
>
> I would describe it as: a map entry has been removed from the map while
> it's in use, causing expires for that map entry to be done on an entry
> that's been removed from the index we need for the map entry lookup.
> This map entry shouldn't be removed in this case.
>
> >
> > My current thought for a solution is to change the way the kernel waits
> > for NFY_EXPIRE replies.  Instead of waiting indefinitely it waits with
> > a timeout.  If the wait times out and the filesystem is still mounted,
> > it just loops around and waits again.  If after the timeout the
> > filesystem has been unmounted it waits one more time (just in case
> > automountd is about to reply) and then aborts the wait with -EAGAIN.
> > I've provided the customer with a patch to do this using a 5 second
> > wait.  I don't have test results yet.
>
> I really don't think this is a kernel problem, it's a user space problem.
>
> Some time ago there was a weird case where an active map entry was being
> removed from the map entry cache. I had a little trouble even working out
> what I had done when I came across it in a cleanup a while ago. So if
> this is what you're seeing we'll need to do some work to work out what
> I saw and what I was doing to fix it.
>
> Let me check 5.1.3 and get back to you.
>
> >
> > So my questions are:
> >  - is this race really possible?  Can removal-from-map race with expiry?
>
> Well, maybe, but it shouldn't, because walking into an expiring mount
> or one that's being mounted shouldn't be possible and I haven't seen
> symptoms of that happening for a very long time, certainly not with
> a kernel as recent as 4.12.
>
> I really think it's a mistake I'm making in the user space code.
>
> >  - is my timeout fix reasonable?  Might it cause other problems?  Is
> >    there a better way to fix this inside automountd?
>
> Probably and don't know.
>
> I think user space is the problem here and I suspect trying to change
> the kernel won't actually fix the problem because it's a user space
> mistake that could still happen.
>
> I'm not sure about the wisdom of my not trying to recover from this
> either. Originally it was done because if this happened things would
> only get worse and the problem would become hidden. So I made the failure
> fatal so I could get a core of the state at the time it happened and
> that would be more likely to yield information about the cause. And
> this should never happen so the only choice is to fix it.

Thanks - you've given me some useful pointers.  I'll look some more.

I have a core of automountd while it is hanging (so after the initial
problem) and also a core of the kernel.  So if you do find more time to
look and want me to find something in a core file, just let me know.

Thanks,
NeilBrown
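
For reference, the timed wait proposed in the quoted text (wait with a
timeout, loop while the filesystem is still mounted, wait once more after
it has been unmounted, then give up with -EAGAIN) can be sketched in
user-space C roughly as below.  This is only an illustration of the
control flow, not the patch given to the customer.  All names here
(wait_for_expire_reply, struct expire_wait_ops, the sim_* helpers and
simulate) are hypothetical, and the two callbacks stand in for kernel
internals that are not shown in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of one bounded wait for the daemon's reply. */
enum wait_result { WAIT_ACKED, WAIT_TIMED_OUT };

/* Hypothetical callbacks standing in for kernel internals. */
struct expire_wait_ops {
	enum wait_result (*wait_for_ack)(void *ctx, unsigned int timeout_secs);
	bool (*still_mounted)(void *ctx);
};

/*
 * Wait for the reply to an expire notification.
 * Returns 0 once acknowledged, or -EAGAIN if the filesystem was found
 * unmounted after a timeout and one further wait also timed out.
 */
static int wait_for_expire_reply(const struct expire_wait_ops *ops,
				 void *ctx, unsigned int timeout_secs)
{
	assert(ops != NULL && timeout_secs > 0);

	for (;;) {
		if (ops->wait_for_ack(ctx, timeout_secs) == WAIT_ACKED)
			return 0;
		/* Timed out: keep looping only while still mounted. */
		if (!ops->still_mounted(ctx))
			break;
	}

	/* Unmounted: wait one more time in case a reply is on its way. */
	if (ops->wait_for_ack(ctx, timeout_secs) == WAIT_ACKED)
		return 0;
	return -EAGAIN;
}

/* Simulation state, for illustration only (no real waiting happens). */
struct sim {
	int ack_on_wait;    /* wait number on which the ACK arrives; 0 = never */
	int mounted_checks; /* mount checks that still report "mounted" */
	int waits;          /* waits performed so far */
};

static enum wait_result sim_wait(void *ctx, unsigned int timeout_secs)
{
	struct sim *s = ctx;

	(void)timeout_secs; /* the simulation does not actually sleep */
	s->waits++;
	if (s->ack_on_wait != 0 && s->waits >= s->ack_on_wait)
		return WAIT_ACKED;
	return WAIT_TIMED_OUT;
}

static bool sim_mounted(void *ctx)
{
	struct sim *s = ctx;

	if (s->mounted_checks > 0) {
		s->mounted_checks--;
		return true;
	}
	return false;
}

/* Run one simulated scenario with a 5 second timeout. */
static int simulate(int ack_on_wait, int mounted_checks)
{
	static const struct expire_wait_ops ops = { sim_wait, sim_mounted };
	struct sim s = { ack_on_wait, mounted_checks, 0 };

	return wait_for_expire_reply(&ops, &s, 5);
}
```

In this sketch simulate(0, 2) models a daemon that never replies while
the mount goes away after two checks, and returns -EAGAIN; simulate(2, 0)
models a reply that arrives during the one extra wait, and returns 0.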