When start_statd figures out that statd is not yet running it starts it, waits for the invoked process to complete, and finally verifies that statd is working. This approach works for serially mounting NFS file systems but has a race condition for parallel mounting. In the parallel case it can happen that two mount commands A and B both decide that statd needs to be started. Both of them try to start statd. Obviously only one of them can successfully do so, let's assume this is command A in our case. The statd invoked by B terminates because the resource is already claimed by the statd invoked by A. The termination of B's statd though is before the statd of A has completely set up all things. This causes the check for a working statd of command B to fail and terminate the mount request with an error. To prevent this we define a timeout value. In case the initial check after invoking statd fails we try again in a loop 10 times a second until the timeout is reached. In our tests when the race occurred we typically were successful already on the first retry within the loop. --- utils/mount/network.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/utils/mount/network.c b/utils/mount/network.c index e2cdcaf..670767d 100644 --- a/utils/mount/network.c +++ b/utils/mount/network.c @@ -58,6 +58,7 @@ #define PMAP_TIMEOUT (10) #define CONNECT_TIMEOUT (20) #define MOUNT_TIMEOUT (30) +#define STATD_TIMEOUT (10) #define SAFE_SOCKADDR(x) (struct sockaddr *)(char *)(x) @@ -773,6 +774,11 @@ int start_statd(void) #ifdef START_STATD if (stat(START_STATD, &stb) == 0) { if (S_ISREG(stb.st_mode) && (stb.st_mode & S_IXUSR)) { + int cnt = STATD_TIMEOUT * 10; + struct timespec ts = { + .tv_sec = 0, + .tv_nsec = 100000000, + }; pid_t pid = fork(); switch (pid) { case 0: /* child */ @@ -788,6 +794,11 @@ int start_statd(void) } if (nfs_probe_statd()) return 1; + while (cnt--) { + nanosleep(&ts, NULL); + if (nfs_probe_statd()) + return 1; + } } } #endif -- 1.8.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html