Re: vfs_listxattr and the NFS server: namespaces

Chuck Lever <chuck.lever@xxxxxxxxxx> · Sun, 29 Mar 2020 17:54:38 -0400

> On Mar 29, 2020, at 4:25 PM, Frank van der Linden <fllinden@xxxxxxxxxx> wrote:
> 
> On Fri, Mar 27, 2020 at 11:27:16PM +0000, Frank van der Linden wrote:
>> Implement the main entry points for the *XATTR operations.
>> 
>> Add functions to calculate the reply size for the user extended attribute
>> operations, and implement the XDR encode / decode logic for these
>> operations.
>> 
>> Add the user extended attributes operations to nfsd4_ops.
> 
> The reason I Cc-ed this one to linux-fsdevel, is the following piece of
> code:
>> +	/*
>> +	 * Unfortunately, there is no interface to only list xattrs for
>> +	 * one prefix. So there is no good way to convert maxcount to
>> +	 * a maximum value to pass to vfs_listxattr, as we don't know
>> +	 * how many of the returned attributes will be user attributes.
>> +	 *
>> +	 * So, always ask vfs_listxattr for the maximum size, and encode
>> +	 * as many as possible.
>> +	 */
>> +	listxattrs->lsxa_buf = svcxdr_tmpalloc(argp, XATTR_LIST_MAX);
>> +	if (!listxattrs->lsxa_buf)
>> +		status = nfserr_jukebox;
>> +
>> +	DECODE_TAIL;
>> +}
> 
> Naturally, it is a waste to always (temporarily) allocate XATTR_LIST_MAX
> (currently 64k) bytes for every listxattr request, when in reality you
> probably won't need anywhere near that.
> 
> The reason for doing that is that, while the NFS request comes with a 
> maximum size, you can't translate that in to a maximum size to pass
> to vfs_listxattr, since you are specifying the max size for "user."
> extended attributes only, since that is the namespace that the NFS
> implementation is restricted to. But vfs_listxattr always retrieves
> all extended attributes.
> 
> The question is then: how to avoid doing that? It might be a good
> idea to be able to specify a namespace to vfs_listxattr. This isn't
> terribly hard to implement, and can be phased in. E.g:
> 
> * Add a "const char *namespace" argument to the listxattr inode op
> * For the normal use case it will be NULL, meaning all xattrs. This
>  includes use in the system call path (a new system call that
>  includes a namespace argument could be added).
> * A filesystem that does not support only returning xattrs for one
>  namespace will return a specified error if @namespace != NULL, after
>  which the fs-indepdent code in fs/xattr.c falls back to pre-allocating
>  an XATTR_LIST_MAX buffer and then filtering out all attributes that
>  are not in the specified namespace.
> * Since the initial use is from the NFS code, converting XFS and ext4
>  to support the @namespace argument should catch most use cases of
>  the @namespace argument, the rest can be converted over time.
> 
> Does this sound reasonable? Or is it overkill?
> 
> Another way to do it is to add a vfs_listxattr_alloc variant, and
> have the NFS server code filter out the user. xattrs as now. There is
> one problem with that: it's possible for the size of listxattr result
> to increase between the first null (length probe) call and the second call,
> in which case the NFS server would return a spurious error. A small chance,
> but it's there nonetheless. And it would be hard to distinguish from
> other errors. So I think I still would prefer using an extra namespace
> argument.
> 
> Alternatively, we could decide that the temporarily allocation of 64k
> is no big deal.

An order-4 allocation is not something that can be relied upon. I would
rather find a way to avoid the need for it if at all possible.

--
Chuck Lever