Re: Massive NFS problems on large cluster with large number of mounts

On Wed, Jul 02, 2008 at 04:00:21PM +0200, Carsten Aulbert wrote:
> Hi all,
> 
> 
> J. Bruce Fields wrote:
> > 
> > I'm slightly confused--the above is all about server configuration, but
> > the below seems to describe only client problems?
> 
> Well, yes and no. All our servers are clients as well. That is, we have
> ~1340 nodes, each of which exports a local directory to be cross-mounted.
> 
> >> (1) All our mounts use nfsvers=3 why is rpc.idmapd involved at all?
> > 
> > Are there actually files named "idmap" in those directories?  (Looks to
> > me like they're only created in the v4 case, so I assume those open
> > calls would return ENOENT if they didn't return ENFILE....)
> 
> No, there is not, and since we are not running v4 yet, we have now
> disabled starting it on all nodes.
> 
> 
> > 
> >> (2) Why is this daemon growing so extremely large?
> >> # ps aux|grep rpc.idmapd
> >> root      2309  0.1 16.2 2037152 1326944 ?     Ss   Jun30   1:24
> >> /usr/sbin/rpc.idmapd
> > 
> > I think rpc.idmapd has some state for each directory whether they're for
> > a v4 client or not, since it's using dnotify to watch for an "idmap"
> > file to appear in each one.  The above shows about 2k per mount?
> 
> As you wrote in your other email, yes, that's about 2 GByte, and I've
> seen boxes where more than 500 mounts hung and the process was using all
> of the 8 GByte. So I do think there is a bug.
> 
> OTOH, we still have the problem that we can only mount up to ~350
> remote directories. We think we have tracked this down to the fact that
> the NFS clients refuse to use ports above 1023, even though the servers
> are exporting with the "insecure" option. Is there a way to force this?
> Right now the NFS clients use ports 665-1023 (except for a few odd ports
> which were already in use).
> 
> Any hint on how we should proceed, and maybe force the clients to also
> use ports above 1023? I think that would solve our problems.
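[As an aside, one way to confirm which source ports the clients are
actually using is to filter the TCP connection table for the NFS port;
this is a generic netstat/awk sketch, not something from the thread:]

```shell
# Show the local (client-side) address:port of every TCP connection to
# an NFS server (port 2049); ports below 1024 are privileged.
netstat -tn | awk '$5 ~ /:2049$/ { print $4 }'
```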

I think the below (untested) would tell the client to stop demanding a
privileged port.

Then you may find you run into other problems, I don't know.  Sounds
like nobody's using this many mounts, so you get to find out what the
next limit is....  But if it works, then maybe someday we should add a
mount option to control this.
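If such a mount option existed, usage might look like the fstab fragment
below. The option name "noresvport" is purely hypothetical here (no such
client option is in mainline at the time of writing); the effect assumed
is the same as the patch below, i.e. clearing xprt->resvport:

```
# Hypothetical /etc/fstab entry with an assumed "noresvport" option that
# would let the client bind a source port above 1023:
node0042:/export  /mnt/node0042  nfs  nfsvers=3,noresvport  0  0
```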

--b.


diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 8945307..51f68cc 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -300,9 +300,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
 	 * but it is always enabled for rpciod, which handles the connect
 	 * operation.
 	 */
-	xprt->resvport = 1;
-	if (args->flags & RPC_CLNT_CREATE_NONPRIVPORT)
-		xprt->resvport = 0;
+	xprt->resvport = 0;
 
 	clnt = rpc_new_client(args, xprt);
 	if (IS_ERR(clnt))
--
