Re: Regular deadlocks

On 06/27/2016 02:26 AM, Ian Kent wrote:
>>> How is autofs configured.
>>>
>>> If --disable-mount-locking is not used then any mount can block all other
>>> mounts, if it is used then there can be mtab corruption if still using a text
>>> based mtab.
>>
>> I use --disable-mount-locking.
>
> I always use --disable-mount-locking and nowadays the mtab is usually a symlink
> into the proc file system so corruption isn't a problem.

/etc/mtab is actually not a symlink on my systems.


Anyway, I have more details for you, as the issue appeared again today and I could investigate some more. This is on a server that mounts only a single NFS server (http12), so the multi-server blocking issue is irrelevant here.

A few minutes before the "deadlock" occurred, /nfs/http12 was unmounted by autofs, I assume because it was idle. I have TIMEOUT=600. That explains why the issue appears much more frequently on a server which is way less busy (and usually in the middle of the night): the NFS server needs to be idle enough to be unmounted.

However, I still had many /home/userX mounted (by autofs), which point to /nfs/http12/userX. Shouldn't autofs refrain from unmounting /nfs/http12 while at least one /home/userX is mounted? To be clear, here's an extract from my /proc/mounts BEFORE the NFS server is unmounted by autofs:

http12:/ /nfs/http12 nfs4 rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,addr=2a00:42:1:20:1::1 0 0
http12://user1 /home/user1 nfs4 rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,addr=2a00:42:1:20:1::1 0 0
http12://user2 /home/user2 nfs4 rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,addr=2a00:42:1:20:1::1 0 0
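
To check which mount points still depend on a given NFS server before autofs expires the parent, the extract above can be grouped by server. This is only a rough sketch: the helper names are made up for illustration, the sample options are shortened, and it assumes the source field has the form host:/path with a plain hostname (not a bracketed IPv6 literal).

```python
# Sketch: group NFS mount points by server from /proc/mounts-style text.
# Each line has six whitespace-separated fields:
#   source  mountpoint  fstype  options  dump  pass
# parse_mounts and nfs_mounts_by_server are hypothetical helper names.

def parse_mounts(text):
    """Return (source, mountpoint, fstype) for every well-formed line."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        entries.append((fields[0], fields[1], fields[2]))
    return entries


def nfs_mounts_by_server(text):
    """Map each NFS server name to the list of its mount points."""
    by_server = {}
    for source, mountpoint, fstype in parse_mounts(text):
        if fstype not in ("nfs", "nfs4"):
            continue
        # Assumption: source looks like "host:/path" with a plain hostname.
        server = source.split(":", 1)[0]
        by_server.setdefault(server, []).append(mountpoint)
    return by_server


# Sample modelled on the extract above, with the option list shortened.
SAMPLE = """\
http12:/ /nfs/http12 nfs4 rw,nosuid,noatime 0 0
http12://user1 /home/user1 nfs4 rw,nosuid,noatime 0 0
http12://user2 /home/user2 nfs4 rw,nosuid,noatime 0 0
proc /proc proc rw 0 0
"""

if __name__ == "__main__":
    for server, mountpoints in nfs_mounts_by_server(SAMPLE).items():
        print(server, mountpoints)
```

On a live system the same function could be fed the contents of /proc/mounts instead of SAMPLE.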


Also, I couldn't find any blocked mount process that would explain the "deadlock". I had a 'ps aux|grep mount' done every 10 seconds:

Mon Jun 27 05:00:00 CEST 2016
root 3437 0.0 0.0 218676 5500 ? Ssl Jun24 0:26 /usr/sbin/automount --pid-file /var/run/autofs.pid

Mon Jun 27 05:00:10 CEST 2016
root 3437 0.0 0.0 218676 5500 ? Ssl Jun24 0:27 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618146 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618214 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618215 0.0 0.0 0 0 ? Z 05:00 0:00 [umount] <defunct>
root 2618224 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618227 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618230 0.0 0.0 0 0 ? Z 05:00 0:00 [umount] <defunct>
root 2618240 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618248 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618250 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618252 0.1 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid

Mon Jun 27 05:00:20 CEST 2016
root 3437 0.0 0.0 218676 5500 ? Ssl Jun24 0:27 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618146 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618214 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618224 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618227 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618240 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618248 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618250 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618252 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618701 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618702 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid

And it remained in that state afterwards. I don't know whether the defunct umount processes are suspicious; I guess not.
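
For going through such snapshots, a small filter on the STAT column may help: Z marks a defunct (zombie) process, while D marks uninterruptible sleep, which is the state a process genuinely stuck on NFS I/O would usually show. The following is just a sketch, assuming the standard `ps aux` column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND); the function names are invented for this example.

```python
# Sketch: pick processes in a given state out of captured `ps aux` output.
# parse_ps_aux and pids_in_state are hypothetical helper names.

def parse_ps_aux(text):
    """Parse `ps aux` lines into dicts; non-process lines are skipped."""
    procs = []
    for line in text.splitlines():
        # COMMAND may contain spaces, so split into at most 11 fields.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header, blank line, or timestamp line
        procs.append({
            "pid": int(fields[1]),
            "stat": fields[7],
            "command": fields[10],
        })
    return procs


def pids_in_state(procs, state):
    """Return PIDs whose STAT starts with the given letter (e.g. 'Z', 'D')."""
    return [p["pid"] for p in procs if p["stat"].startswith(state)]


# A few lines taken from the 05:00:10 snapshot above.
SAMPLE = """\
Mon Jun 27 05:00:10 CEST 2016
root 3437 0.0 0.0 218676 5500 ? Ssl Jun24 0:27 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618214 0.0 0.0 218676 2168 ? S 05:00 0:00 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618215 0.0 0.0 0 0 ? Z 05:00 0:00 [umount] <defunct>
"""

if __name__ == "__main__":
    procs = parse_ps_aux(SAMPLE)
    print("zombies:", pids_in_state(procs, "Z"))
    print("uninterruptible:", pids_in_state(procs, "D"))
```

In the lines sampled here the only non-sleeping entry is the defunct umount; nothing is in D state.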

One last thing: a manual umount of /home/userY was done by a script at 6:26 (/home/userY was NOT mounted though), and it remained blocked. I'm not sure if it's a consequence of autofs being blocked or something else.

--
Cyril B.