Concurrency problem when triggering many mounts simultaneously

Leonardo Chiquitto <leonardo.lists@xxxxxxxxx> · Sun, 28 Oct 2012 20:37:44 -0200

Hello,

This is a follow-up to my last email. The fact that AutoFS now probes the
proximity of single servers exposed an interesting problem.

One way to reproduce it is to set up a map with 10 or more direct mounts.
These volumes must be hosted on the same NFS server.

So let's say we have /autofs-race/dir{1,2,3,4...} and each volume has a
file named 'file'. If you trigger the mount of the 10 volumes simultaneously,
you'll notice that some of them fail to mount:

n43:~ # for i in $(seq 1 10); do stat /autofs-race/dir$i/file > /dev/null & done
(.. shell prints 10 pids that are running in background ..)
n43:~ # stat: cannot stat ‘/autofs-race/dir4/file’: No such file or directory
stat: cannot stat ‘/autofs-race/dir10/file’: No such file or directory

[1]   Done                    stat /autofs-race/dir$i/file > /dev/null
[2]   Done                    stat /autofs-race/dir$i/file > /dev/null
[3]   Done                    stat /autofs-race/dir$i/file > /dev/null
[4]   Exit 1                  stat /autofs-race/dir$i/file > /dev/null
[5]   Done                    stat /autofs-race/dir$i/file > /dev/null
[6]   Done                    stat /autofs-race/dir$i/file > /dev/null
[7]   Done                    stat /autofs-race/dir$i/file > /dev/null
[8]   Done                    stat /autofs-race/dir$i/file > /dev/null
[9]-  Done                    stat /autofs-race/dir$i/file > /dev/null
[10]+  Exit 1                  stat /autofs-race/dir$i/file > /dev/null

Here it failed to mount /autofs-race/dir4 and /autofs-race/dir10.
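
The shell loop above can also be approximated in C when tighter control over
the start time is wanted. The following is a minimal sketch, not part of
autofs: stat_concurrently(), MAX_JOBS and the pipe-based start gate are
hypothetical names and choices made for illustration. Each child process
blocks on a pipe until the parent closes it, so the stat() calls are issued
close together.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_JOBS 64   /* arbitrary cap chosen for this sketch */

/*
 * Stat <prefix>1/<leaf> .. <prefix><count>/<leaf> from separate processes
 * that are all released at the same moment. Returns the number of paths
 * that could not be stat'ed, or -1 on a setup error.
 */
int stat_concurrently(const char *prefix, const char *leaf, int count)
{
    pid_t pids[MAX_JOBS];
    int gate[2];
    int i, started = 0, failures = 0;

    assert(prefix != NULL && leaf != NULL);
    if (count <= 0 || count > MAX_JOBS || pipe(gate) != 0)
        return -1;

    for (i = 0; i < count; i++) {
        pid_t pid = fork();

        if (pid < 0)
            break;
        if (pid == 0) {
            char path[4096];
            char token;
            struct stat st;
            int n;

            /* Child: block until the parent closes the write end. */
            close(gate[1]);
            while (read(gate[0], &token, 1) > 0)
                ;
            n = snprintf(path, sizeof(path), "%s%d/%s", prefix, i + 1, leaf);
            if (n < 0 || (size_t)n >= sizeof(path))
                _exit(2);
            _exit(stat(path, &st) == 0 ? 0 : 1);
        }
        pids[started++] = pid;
    }

    /* Closing both ends in the parent releases every waiting child. */
    close(gate[0]);
    close(gate[1]);

    for (i = 0; i < started; i++) {
        int status;

        if (waitpid(pids[i], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
            failures++;
            fprintf(stderr, "stat failed: %s%d/%s\n", prefix, i + 1, leaf);
        }
    }
    return started == count ? failures : -1;
}
```

For the layout in this report, stat_concurrently("/autofs-race/dir", "file", 10)
would stat /autofs-race/dir1/file through /autofs-race/dir10/file and return
the number of failures.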

I've investigated this and discovered that:

* The problem happens because prune_host_list() removes the only host
  from the host list. get_nfs_info() succeeds for some protocols but
  eventually receives an ETIMEDOUT and reports that the host doesn't support
  any protocol, hence get_vers_and_cost() fails.
* It only happens when the RPC clients are created by clnt_dg_create()
  or clnt_vc_create() (the default when libtirpc is used). If I keep
  building with libtirpc and change only these calls to clntudp_bufcreate()
  and clnttcp_create() respectively, the problem doesn't happen.
* The transport protocol in the conn_info structure seems to change
  underneath us. This is very strange, and I may be doing something wrong
  here, but debug code such as the snippet below in get_nfs_info() can
  demonstrate it:

+        x = rpc_info->proto->p_proto;
        if (rpc_info->proto->p_proto == IPPROTO_UDP)
                status = rpc_udp_getclient(rpc_info, NFS_PROGRAM, NFS3_VERSION);
        else
                status = rpc_tcp_getclient(rpc_info, NFS_PROGRAM, NFS3_VERSION);
+        if (x != rpc_info->proto->p_proto)
+                logmsg("%lu p_proto changed (%d -> %d)",
+                       (unsigned long) pthread_self(),
+                       x, rpc_info->proto->p_proto);

* Triggering the mounts simultaneously is required to reproduce the problem,
  which makes me wonder whether something here (libtirpc, for example) is not
  really thread safe, or whether the RPC clients must be destroyed after every
  use to avoid such issues.
* Another theory I haven't been able to test yet is that, due to some
  build/link issue, some RPC functions from glibc are still being used even
  when libtirpc is available. Is that possible? Could the mix cause the
  problem?
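
The snapshot-and-compare check from the debug snippet can be written as a
small standalone helper. This is only a sketch under stated assumptions:
struct conn_sketch is a hypothetical stand-in for the protocol field of
conn_info, and proto_changed_during() and the two stand-in callees are names
invented for illustration, not autofs or libtirpc functions.

```c
#include <assert.h>
#include <netinet/in.h>   /* IPPROTO_UDP, IPPROTO_TCP */
#include <stdio.h>

/* Hypothetical stand-in for the protocol field reachable from conn_info. */
struct conn_sketch {
    int p_proto;
};

typedef int (*conn_call)(struct conn_sketch *);

/*
 * Run call(conn) and report whether p_proto differs afterwards.
 * Returns 1 if the field changed, 0 otherwise. The callee's own return
 * value is stored in *status when status is not NULL.
 */
int proto_changed_during(struct conn_sketch *conn, conn_call call, int *status)
{
    int before, rc;

    assert(conn != NULL && call != NULL);
    before = conn->p_proto;          /* snapshot before the call */
    rc = call(conn);
    if (status != NULL)
        *status = rc;
    if (before != conn->p_proto) {   /* compare after the call */
        fprintf(stderr, "p_proto changed (%d -> %d)\n",
                before, conn->p_proto);
        return 1;
    }
    return 0;
}

/* Stand-in callee that leaves the protocol field alone. */
int call_keeps_proto(struct conn_sketch *conn)
{
    (void) conn;
    return 0;
}

/* Stand-in callee that overwrites the protocol field. */
int call_clobbers_proto(struct conn_sketch *conn)
{
    conn->p_proto = IPPROTO_TCP;
    return 0;
}
```

A change detected this way may come from the callee itself or from another
thread writing to the same structure; the check alone cannot tell the two
apart.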

I planned to investigate further to provide a better report and perhaps a fix,
but as I haven't made much progress in the last couple of days, I'm reporting
it now.

Thanks,
Leonardo