Re: Massive NFS problems on large cluster with large number of mounts

On Jul 30, 2008, at 6:13 PM, J. Bruce Fields wrote:
On Wed, Jul 30, 2008 at 03:33:38PM -0400, Chuck Lever wrote:
On Wed, Jul 30, 2008 at 1:53 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
In any case, this all seems a bit orthogonal to the problem of what
ports the rpcbind client uses, right?

No, this is exactly the original problem.  The reason xprt_max_resvport
is allowed to go larger than 1023 is to permit more NFS mounts.  There
really is no other reason for it I can think of.

But it's broken (or at least inconsistent) behavior that max_resvport
can go past 1023 in the first place.  The name is "max_resvport" --
Maximum Reserved Port.  A port value of more than 1023 is not a
reserved port.  These sysctls are designed to restrict the range of
ports used when a _reserved_ port is requested, not when _any_ source
port is requested. Trond's suggestion is an "off label" use of this
facility.
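
To make the boundary concrete, here is a minimal user-space sketch of
the check being argued for. It is not kernel code: RESVPORT_LIMIT,
port_is_reserved() and clamp_max_resvport() are hypothetical names
invented for illustration, and the only assumption carried over from
the discussion is that reserved ports end at 1023.

```c
#include <assert.h>
#include <stdbool.h>

/* Highest reserved (privileged) port; anything above it is not reserved. */
#define RESVPORT_LIMIT 1023u

/* True when binding to this source port requires privilege.
 * Port 0 means "any port" and is therefore not treated as reserved. */
static bool port_is_reserved(unsigned int port)
{
    return port >= 1 && port <= RESVPORT_LIMIT;
}

/* Hypothetical helper: clamp a requested maximum so that the range
 * [min_port, result] never leaves the reserved range. */
static unsigned int clamp_max_resvport(unsigned int min_port,
                                       unsigned int requested_max)
{
    assert(port_is_reserved(min_port));

    if (requested_max > RESVPORT_LIMIT)
        return RESVPORT_LIMIT;
    if (requested_max < min_port)
        return min_port;
    return requested_max;
}
```

With a clamp like this, a setting above 1023 would be reduced to 1023
rather than silently turning "reserved port" requests into requests for
unprivileged ports.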

We could do a better job of communicating what is and isn't a documented
usage, in that case.

Once people are already using an interface a certain way (and because
we told them to) discussions about whether it's really a correct use
start to seem a little academic.

It's not at all academic.

We _must_ revisit interface design whenever a design results in a
kernel paging exception, a privilege escalation, or a denial of
service, or when it is simply confusing or uses standard terminology
incorrectly.  It is always appropriate to talk about it.

What we need to be careful about when people are already using an
interface is how we go about changing it.

And rpcbind isn't the only kernel-level RPC service that requires a
reserved port.  The kernel-level NSM code that calls user space, for
example, is one such service.  In other words, rpcbind isn't the only
service that could potentially hit this issue, so an rpcbind-only fix
would be incomplete.

We already have an appropriate interface for kernel RPC services to
request a non-privileged port.  The NFS client should use that
interface.

I admit that would be nicer.

--b.

Now, we don't have to change both at the same time.  We can introduce
the mount option now; the default reserved port range is still good.
And eventually folks using the sysctl will hit the rpcbind bug (or a
lock recovery problem), trace it back to this issue, and change their
mount options and reset their resvport sysctls.

At some later point, though, the maximum should be restricted to 1023.

Such an "insecure" mount option would then set
RPC_CLNT_CREATE_NONPRIVPORT on rpc_clnt's created on behalf of the NFS
client.
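
A rough user-space sketch of that idea follows. It only models the
decision, it is not the kernel implementation: SKETCH_CREATE_NONPRIVPORT
is a stand-in bit (not the kernel's actual value for
RPC_CLNT_CREATE_NONPRIVPORT), and struct sketch_mount_opts,
sketch_create_flags() and sketch_source_port() are hypothetical names.
The way the reserved range is walked is also just an assumption made
for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in bit for this sketch only; the real definition of
 * RPC_CLNT_CREATE_NONPRIVPORT lives in the kernel's sunrpc headers. */
#define SKETCH_CREATE_NONPRIVPORT (1UL << 0)

/* Hypothetical parsed mount options. */
struct sketch_mount_opts {
    bool insecure;  /* true when the proposed "insecure" option is given */
};

/* Translate mount options into RPC client creation flags. */
static unsigned long sketch_create_flags(const struct sketch_mount_opts *opts)
{
    assert(opts != NULL);
    return opts->insecure ? SKETCH_CREATE_NONPRIVPORT : 0UL;
}

/* Choose the source port to bind. Returning 0 asks the network stack
 * for any unprivileged port; otherwise the result stays inside the
 * reserved range [min_resv, max_resv], walking down from the top. */
static unsigned int sketch_source_port(unsigned long flags,
                                       unsigned int min_resv,
                                       unsigned int max_resv,
                                       unsigned int attempt)
{
    if (flags & SKETCH_CREATE_NONPRIVPORT)
        return 0;

    assert(min_resv <= max_resv);
    return max_resv - (attempt % (max_resv - min_resv + 1));
}
```

The point of the split is that only mounts that ask for it leave the
reserved range; other kernel RPC consumers such as rpcbind and NSM keep
drawing from the reserved range regardless of how many NFS mounts exist.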

I'm not married to the names of the options, or even using a mount
option at all (although that seems like a natural place to put such a
feature).

Thoughts?

--
Chuck Lever
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com


