Re: [PATCH v2 3/3] NFSD: Clean up symlink argument XDR decoders

On Mon, Jan 22, 2018 at 05:09:35PM -0800, Chuck Lever wrote:
> 
> 
> > On Jan 22, 2018, at 2:00 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> > 
> > On Wed, Jan 03, 2018 at 03:42:35PM -0500, Chuck Lever wrote:
> >> Move common code in NFSD's symlink arg decoders into a helper. The
> >> immediate benefits include:
> >> 
> >> - one fewer data copy on transports that support DDP
> >> - no memory allocation in the NFSv4 XDR decoder
> >> - consistent error checking across all versions
> >> - reduction of code duplication
> >> - support for both legal forms of SYMLINK requests on RDMA
> >>   transports for all versions of NFS (in particular, NFSv2, for
> >>   completeness)
> >> 
> >> In the long term, this helper is an appropriate spot to perform a
> >> per-transport call-out to fill the pathname argument using, say,
> >> RDMA Reads.
> >> 
> >> Filling the pathname in the proc function also means that eventually
> >> the incoming filehandle can be interpreted so that filesystem-
> >> specific memory can be allocated as a sink for the pathname
> >> argument, rather than using anonymous pages.
> >> 
> >> Wondering why the current code punts a zero-length SYMLINK. Is it
> >> illegal to create a zero-length SYMLINK on Linux?
> > 
> > SYMLINK(2) says
> > 
> > 	ENOENT A directory component in linkpath does not exist or is a
> > 	dangling symbolic link, or target or linkpath is an empty
> > 	string.
> > 
> > That doesn't explain the INVAL, or why this is the right place to be
> > checking it.
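
For reference, the man page behaviour quoted above can be checked from
user space. This is only a sketch, assuming a Linux host; the helper name
empty_target_errno and the link path passed to it are made up for
illustration and have nothing to do with the nfsd code paths under review.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to create a symlink at linkpath whose target
 * is the empty string. Returns the resulting errno, or 0 if the call
 * unexpectedly succeeds (in which case the link is removed again).
 */
static int empty_target_errno(const char *linkpath)
{
	if (symlink("", linkpath) == 0) {
		unlink(linkpath);	/* clean up on systems that allow it */
		return 0;
	}
	return errno;
}
```

On Linux this should report ENOENT, matching SYMLINK(2). It says nothing
about which NFS status code the server ought to return.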
> 
> RFC 1813:
> 
>       NFS3ERR_IO
>       NFS3ERR_ACCES
>       NFS3ERR_EXIST
>       NFS3ERR_NOTDIR
>       NFS3ERR_NOSPC
>       NFS3ERR_ROFS
>       NFS3ERR_NAMETOOLONG
>       NFS3ERR_DQUOT
>       NFS3ERR_STALE
>       NFS3ERR_BADHANDLE
>       NFS3ERR_NOTSUPP
>       NFS3ERR_SERVERFAULT
> 
> Interestingly, neither INVAL nor NOENT are valid
> status codes for NFSv3 SYMLINK. NFS3ERR_NOTSUPP
> might be closest, I suppose.
> 
> RFC 5661 says explicitly:
>  
> If the objname has a length of zero, or if objname does not obey the
> UTF-8 definition, the error NFS4ERR_INVAL will be returned.
> 
> And lists these as valid status codes for CREATE(NF4LNK):
>  
> | NFS4ERR_ACCESS, NFS4ERR_ATTRNOTSUPP,       |
> | NFS4ERR_BADCHAR, NFS4ERR_BADNAME,          |
> | NFS4ERR_BADOWNER, NFS4ERR_BADTYPE,         |
> | NFS4ERR_BADXDR, NFS4ERR_DEADSESSION,       |
> | NFS4ERR_DELAY, NFS4ERR_DQUOT,              |
> | NFS4ERR_EXIST, NFS4ERR_FHEXPIRED,          |
> | NFS4ERR_INVAL, NFS4ERR_IO, NFS4ERR_MLINK,  |
> | NFS4ERR_MOVED, NFS4ERR_NAMETOOLONG,        |
> | NFS4ERR_NOFILEHANDLE, NFS4ERR_NOSPC,       |
> | NFS4ERR_NOTDIR, NFS4ERR_OP_NOT_IN_SESSION, |
> | NFS4ERR_PERM, NFS4ERR_REP_TOO_BIG,         |
> | NFS4ERR_REP_TOO_BIG_TO_CACHE,              |
> | NFS4ERR_REQ_TOO_BIG,                       |
> | NFS4ERR_RETRY_UNCACHED_REP, NFS4ERR_ROFS,  |
> | NFS4ERR_SERVERFAULT, NFS4ERR_STALE,        |
> | NFS4ERR_TOO_MANY_OPS,                      |
> | NFS4ERR_UNSAFE_COMPOUND                    |
> 
> 
> > I'm a little nervous about the NULL termination in
> > svc_fill_symlink_pathname; how do we know it's safe to write a zero
> > there?  I haven't checked it carefully yet.
> 
> svc_fill_symlink_pathname grabs a whole fresh page
> from @rqstp. It is safe to write bytes anywhere in
> that page.

How do we know it's safe to grab another page?  (Could we run out of
pages?  I'd probably know the answer if I'd rewritten this code
recently, but I haven't and the details are swapped out....)

Also I don't think that's true in the first->iov_base == 0 case.
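
To make both concerns concrete, here is a rough user-space model of the
two branches in the quoted helper. It is a sketch under assumptions, not
kernel code: model_kvec, model_rqst, page_end and model_fill_pathname are
all hypothetical names, and the explicit spare-page check is something
the quoted patch does not have.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for struct kvec. */
struct model_kvec {
	void *base;
	size_t len;
};

/* Hypothetical stand-in for the few svc_rqst fields the helper touches. */
struct model_rqst {
	char *arg_page;		/* models rq_arg.pages[0]: holds received data */
	char **next_page;	/* models rq_next_page */
	char **page_end;	/* assumed bound: one past the last spare page */
};

/* Returns 0 and sets *out on success, or a negative errno. */
static int model_fill_pathname(struct model_rqst *rq,
			       const struct model_kvec *first,
			       size_t total, char **out)
{
	char *result;

	if (total > MODEL_PAGE_SIZE - 1)
		return -ENAMETOOLONG;

	if (first->base == NULL) {
		/* Not a fresh page: the NUL overwrites whatever the
		 * transport placed at offset total in the received page.
		 */
		result = rq->arg_page;
		result[total] = '\0';
	} else {
		size_t len, remaining = total;

		/* Assumed check: refuse when no spare page is left. */
		if (rq->next_page >= rq->page_end)
			return -ENOMEM;
		result = *rq->next_page++;

		len = total < first->len ? total : first->len;
		memcpy(result, first->base, len);
		remaining -= len;
		if (remaining)
			memcpy(result + len, rq->arg_page, remaining);
		result[total] = '\0';
	}

	/* The pathname must not contain an embedded NUL. */
	if (strlen(result) != total)
		return -EINVAL;
	*out = result;
	return 0;
}
```

In the copy branch the terminator lands in a page the request did not
receive data into, so the only question is whether a spare page exists.
In the base == NULL branch the result is the received page itself, and
the byte at offset total is whatever the transport put there. Whether
that byte is always padding or unused space in the kernel is exactly what
the model cannot answer.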

--b.

> 
> 
> > --b.
> > 
> >> 
> >> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> >> ---
> >> fs/nfsd/nfs3proc.c         |   10 +++++++
> >> fs/nfsd/nfs3xdr.c          |   51 ++++++++-------------------------
> >> fs/nfsd/nfs4proc.c         |    7 +++++
> >> fs/nfsd/nfs4xdr.c          |   10 +++++--
> >> fs/nfsd/nfsproc.c          |   14 +++++----
> >> fs/nfsd/nfsxdr.c           |   49 +++++++++++++++++++-------------
> >> fs/nfsd/xdr.h              |    1 +
> >> fs/nfsd/xdr3.h             |    1 +
> >> fs/nfsd/xdr4.h             |    2 +
> >> include/linux/sunrpc/svc.h |    2 +
> >> net/sunrpc/svc.c           |   67 ++++++++++++++++++++++++++++++++++++++++++++
> >> 11 files changed, 146 insertions(+), 68 deletions(-)
> >> 
> >> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> >> index 2dd95eb..6259a4b 100644
> >> --- a/fs/nfsd/nfs3proc.c
> >> +++ b/fs/nfsd/nfs3proc.c
> >> @@ -283,6 +283,16 @@
> >> 	struct nfsd3_diropres *resp = rqstp->rq_resp;
> >> 	__be32	nfserr;
> >> 
> >> +	if (argp->tlen == 0)
> >> +		RETURN_STATUS(nfserr_inval);
> >> +	if (argp->tlen > NFS3_MAXPATHLEN)
> >> +		RETURN_STATUS(nfserr_nametoolong);
> >> +
> >> +	argp->tname = svc_fill_symlink_pathname(rqstp, &argp->first,
> >> +						argp->tlen);
> >> +	if (IS_ERR(argp->tname))
> >> +		RETURN_STATUS(nfserrno(PTR_ERR(argp->tname)));
> >> +
> >> 	dprintk("nfsd: SYMLINK(3)  %s %.*s -> %.*s\n",
> >> 				SVCFH_fmt(&argp->ffh),
> >> 				argp->flen, argp->fname,
> >> diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
> >> index 240cdb0e..78b555b 100644
> >> --- a/fs/nfsd/nfs3xdr.c
> >> +++ b/fs/nfsd/nfs3xdr.c
> >> @@ -452,51 +452,24 @@ void fill_post_wcc(struct svc_fh *fhp)
> >> nfs3svc_decode_symlinkargs(struct svc_rqst *rqstp, __be32 *p)
> >> {
> >> 	struct nfsd3_symlinkargs *args = rqstp->rq_argp;
> >> -	unsigned int len, avail;
> >> -	char *old, *new;
> >> -	struct kvec *vec;
> >> +	char *base = (char *)p;
> >> +	size_t dlen;
> >> 
> >> 	if (!(p = decode_fh(p, &args->ffh)) ||
> >> -	    !(p = decode_filename(p, &args->fname, &args->flen))
> >> -		)
> >> +	    !(p = decode_filename(p, &args->fname, &args->flen)))
> >> 		return 0;
> >> 	p = decode_sattr3(p, &args->attrs);
> >> 
> >> -	/* now decode the pathname, which might be larger than the first page.
> >> -	 * As we have to check for nul's anyway, we copy it into a new page
> >> -	 * This page appears in the rq_res.pages list, but as pages_len is always
> >> -	 * 0, it won't get in the way
> >> -	 */
> >> -	len = ntohl(*p++);
> >> -	if (len == 0 || len > NFS3_MAXPATHLEN || len >= PAGE_SIZE)
> >> -		return 0;
> >> -	args->tname = new = page_address(*(rqstp->rq_next_page++));
> >> -	args->tlen = len;
> >> -	/* first copy and check from the first page */
> >> -	old = (char*)p;
> >> -	vec = &rqstp->rq_arg.head[0];
> >> -	if ((void *)old > vec->iov_base + vec->iov_len)
> >> -		return 0;
> >> -	avail = vec->iov_len - (old - (char*)vec->iov_base);
> >> -	while (len && avail && *old) {
> >> -		*new++ = *old++;
> >> -		len--;
> >> -		avail--;
> >> -	}
> >> -	/* now copy next page if there is one */
> >> -	if (len && !avail && rqstp->rq_arg.page_len) {
> >> -		avail = min_t(unsigned int, rqstp->rq_arg.page_len, PAGE_SIZE);
> >> -		old = page_address(rqstp->rq_arg.pages[0]);
> >> -	}
> >> -	while (len && avail && *old) {
> >> -		*new++ = *old++;
> >> -		len--;
> >> -		avail--;
> >> -	}
> >> -	*new = '\0';
> >> -	if (len)
> >> -		return 0;
> >> +	args->tlen = ntohl(*p++);
> >> 
> >> +	args->first.iov_base = p;
> >> +	args->first.iov_len = rqstp->rq_arg.head[0].iov_len;
> >> +	args->first.iov_len -= (char *)p - base;
> >> +
> >> +	dlen = args->first.iov_len + rqstp->rq_arg.page_len +
> >> +	       rqstp->rq_arg.tail[0].iov_len;
> >> +	if (dlen < XDR_QUADLEN(args->tlen) << 2)
> >> +		return 0;
> >> 	return 1;
> >> }
> >> 
> >> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> >> index 5029b96..36bd1f7 100644
> >> --- a/fs/nfsd/nfs4proc.c
> >> +++ b/fs/nfsd/nfs4proc.c
> >> @@ -605,6 +605,13 @@ static void gen_boot_verifier(nfs4_verifier *verifier, struct net *net)
> >> 
> >> 	switch (create->cr_type) {
> >> 	case NF4LNK:
> >> +		if (create->cr_datalen > NFS4_MAXPATHLEN)
> >> +			return nfserr_nametoolong;
> >> +		create->cr_data =
> >> +			svc_fill_symlink_pathname(rqstp, &create->cr_first,
> >> +						  create->cr_datalen);
> >> +		if (IS_ERR(create->cr_data))
> >> +			return nfserrno(PTR_ERR(create->cr_data));
> >> 		status = nfsd_symlink(rqstp, &cstate->current_fh,
> >> 				      create->cr_name, create->cr_namelen,
> >> 				      create->cr_data, &resfh);
> >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> >> index bd25230..d05384e 100644
> >> --- a/fs/nfsd/nfs4xdr.c
> >> +++ b/fs/nfsd/nfs4xdr.c
> >> @@ -648,6 +648,7 @@ static __be32 nfsd4_decode_bind_conn_to_session(struct nfsd4_compoundargs *argp,
> >> static __be32
> >> nfsd4_decode_create(struct nfsd4_compoundargs *argp, struct nfsd4_create *create)
> >> {
> >> +	struct kvec *head;
> >> 	DECODE_HEAD;
> >> 
> >> 	READ_BUF(4);
> >> @@ -656,10 +657,13 @@ static __be32 nfsd4_decode_bind_conn_to_session(struct nfsd4_compoundargs *argp,
> >> 	case NF4LNK:
> >> 		READ_BUF(4);
> >> 		create->cr_datalen = be32_to_cpup(p++);
> >> +		if (create->cr_datalen == 0)
> >> +			return nfserr_inval;
> >> +		head = argp->rqstp->rq_arg.head;
> >> +		create->cr_first.iov_base = p;
> >> +		create->cr_first.iov_len = head->iov_len;
> >> +		create->cr_first.iov_len -= (char *)p - (char *)head->iov_base;
> >> 		READ_BUF(create->cr_datalen);
> >> -		create->cr_data = svcxdr_dupstr(argp, p, create->cr_datalen);
> >> -		if (!create->cr_data)
> >> -			return nfserr_jukebox;
> >> 		break;
> >> 	case NF4BLK:
> >> 	case NF4CHR:
> >> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> >> index 1995ea6..f107f9f 100644
> >> --- a/fs/nfsd/nfsproc.c
> >> +++ b/fs/nfsd/nfsproc.c
> >> @@ -449,17 +449,19 @@
> >> 	struct svc_fh	newfh;
> >> 	__be32		nfserr;
> >> 
> >> +	if (argp->tlen > NFS_MAXPATHLEN)
> >> +		return nfserr_nametoolong;
> >> +
> >> +	argp->tname = svc_fill_symlink_pathname(rqstp, &argp->first,
> >> +						argp->tlen);
> >> +	if (IS_ERR(argp->tname))
> >> +		return nfserrno(PTR_ERR(argp->tname));
> >> +
> >> 	dprintk("nfsd: SYMLINK  %s %.*s -> %.*s\n",
> >> 		SVCFH_fmt(&argp->ffh), argp->flen, argp->fname,
> >> 		argp->tlen, argp->tname);
> >> 
> >> 	fh_init(&newfh, NFS_FHSIZE);
> >> -	/*
> >> -	 * Crazy hack: the request fits in a page, and already-decoded
> >> -	 * attributes follow argp->tname, so it's safe to just write a
> >> -	 * null to ensure it's null-terminated:
> >> -	 */
> >> -	argp->tname[argp->tlen] = '\0';
> >> 	nfserr = nfsd_symlink(rqstp, &argp->ffh, argp->fname, argp->flen,
> >> 						 argp->tname, &newfh);
> >> 
> >> diff --git a/fs/nfsd/nfsxdr.c b/fs/nfsd/nfsxdr.c
> >> index 165e25e..8fcd047 100644
> >> --- a/fs/nfsd/nfsxdr.c
> >> +++ b/fs/nfsd/nfsxdr.c
> >> @@ -71,22 +71,6 @@ __be32 *nfs2svc_decode_fh(__be32 *p, struct svc_fh *fhp)
> >> }
> >> 
> >> static __be32 *
> >> -decode_pathname(__be32 *p, char **namp, unsigned int *lenp)
> >> -{
> >> -	char		*name;
> >> -	unsigned int	i;
> >> -
> >> -	if ((p = xdr_decode_string_inplace(p, namp, lenp, NFS_MAXPATHLEN)) != NULL) {
> >> -		for (i = 0, name = *namp; i < *lenp; i++, name++) {
> >> -			if (*name == '\0')
> >> -				return NULL;
> >> -		}
> >> -	}
> >> -
> >> -	return p;
> >> -}
> >> -
> >> -static __be32 *
> >> decode_sattr(__be32 *p, struct iattr *iap)
> >> {
> >> 	u32	tmp, tmp1;
> >> @@ -383,14 +367,39 @@ __be32 *nfs2svc_encode_fattr(struct svc_rqst *rqstp, __be32 *p, struct svc_fh *f
> >> nfssvc_decode_symlinkargs(struct svc_rqst *rqstp, __be32 *p)
> >> {
> >> 	struct nfsd_symlinkargs *args = rqstp->rq_argp;
> >> +	char *base = (char *)p;
> >> +	size_t xdrlen;
> >> 
> >> 	if (   !(p = decode_fh(p, &args->ffh))
> >> -	    || !(p = decode_filename(p, &args->fname, &args->flen))
> >> -	    || !(p = decode_pathname(p, &args->tname, &args->tlen)))
> >> +	    || !(p = decode_filename(p, &args->fname, &args->flen)))
> >> 		return 0;
> >> -	p = decode_sattr(p, &args->attrs);
> >> 
> >> -	return xdr_argsize_check(rqstp, p);
> >> +	args->tlen = ntohl(*p++);
> >> +	if (args->tlen == 0)
> >> +		return 0;
> >> +
> >> +	args->first.iov_base = p;
> >> +	args->first.iov_len = rqstp->rq_arg.head[0].iov_len;
> >> +	args->first.iov_len -= (char *)p - base;
> >> +
> >> +	/* This request is never larger than a page. Therefore,
> >> +	 * transport will deliver either:
> >> +	 * 1. pathname in the pagelist -> sattr is in the tail.
> >> +	 * 2. everything in the head buffer -> sattr is in the head.
> >> +	 */
> >> +	if (rqstp->rq_arg.page_len) {
> >> +		if (args->tlen != rqstp->rq_arg.page_len)
> >> +			return 0;
> >> +		p = rqstp->rq_arg.tail[0].iov_base;
> >> +	} else {
> >> +		xdrlen = XDR_QUADLEN(args->tlen);
> >> +		if (xdrlen > args->first.iov_len - (8 * sizeof(__be32)))
> >> +			return 0;
> >> +		p += xdrlen;
> >> +	}
> >> +	decode_sattr(p, &args->attrs);
> >> +
> >> +	return 1;
> >> }
> >> 
> >> int
> >> diff --git a/fs/nfsd/xdr.h b/fs/nfsd/xdr.h
> >> index a765c41..ea7cca3 100644
> >> --- a/fs/nfsd/xdr.h
> >> +++ b/fs/nfsd/xdr.h
> >> @@ -72,6 +72,7 @@ struct nfsd_symlinkargs {
> >> 	char *			tname;
> >> 	unsigned int		tlen;
> >> 	struct iattr		attrs;
> >> +	struct kvec		first;
> >> };
> >> 
> >> struct nfsd_readdirargs {
> >> diff --git a/fs/nfsd/xdr3.h b/fs/nfsd/xdr3.h
> >> index deccf7f..2cb29e9 100644
> >> --- a/fs/nfsd/xdr3.h
> >> +++ b/fs/nfsd/xdr3.h
> >> @@ -90,6 +90,7 @@ struct nfsd3_symlinkargs {
> >> 	char *			tname;
> >> 	unsigned int		tlen;
> >> 	struct iattr		attrs;
> >> +	struct kvec		first;
> >> };
> >> 
> >> struct nfsd3_readdirargs {
> >> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> >> index d56219d..b485cd1 100644
> >> --- a/fs/nfsd/xdr4.h
> >> +++ b/fs/nfsd/xdr4.h
> >> @@ -110,6 +110,7 @@ struct nfsd4_create {
> >> 		struct {
> >> 			u32 datalen;
> >> 			char *data;
> >> +			struct kvec first;
> >> 		} link;   /* NF4LNK */
> >> 		struct {
> >> 			u32 specdata1;
> >> @@ -124,6 +125,7 @@ struct nfsd4_create {
> >> };
> >> #define cr_datalen	u.link.datalen
> >> #define cr_data		u.link.data
> >> +#define cr_first	u.link.first
> >> #define cr_specdata1	u.dev.specdata1
> >> #define cr_specdata2	u.dev.specdata2
> >> 
> >> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> >> index 238b9ae..fd5846e 100644
> >> --- a/include/linux/sunrpc/svc.h
> >> +++ b/include/linux/sunrpc/svc.h
> >> @@ -495,6 +495,8 @@ int		   svc_register(const struct svc_serv *, struct net *, const int,
> >> char *		   svc_print_addr(struct svc_rqst *, char *, size_t);
> >> unsigned int	   svc_fill_write_vector(struct svc_rqst *rqstp,
> >> 					 struct kvec *first, size_t total);
> >> +char		  *svc_fill_symlink_pathname(struct svc_rqst *rqstp,
> >> +					     struct kvec *first, size_t total);
> >> 
> >> #define	RPC_MAX_ADDRBUFLEN	(63U)
> >> 
> >> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> >> index 759b668..fc93406 100644
> >> --- a/net/sunrpc/svc.c
> >> +++ b/net/sunrpc/svc.c
> >> @@ -1578,3 +1578,70 @@ unsigned int svc_fill_write_vector(struct svc_rqst *rqstp, struct kvec *first,
> >> 	return i;
> >> }
> >> EXPORT_SYMBOL_GPL(svc_fill_write_vector);
> >> +
> >> +/**
> >> + * svc_fill_symlink_pathname - Construct pathname argument for VFS symlink call
> >> + * @rqstp: svc_rqst to operate on
> >> + * @first: buffer containing first section of pathname
> >> + * @total: total length of the pathname argument
> >> + *
> >> + * Returns pointer to a NUL-terminated string, or an ERR_PTR. The buffer is
> >> + * released automatically when @rqstp is recycled.
> >> + */
> >> +char *svc_fill_symlink_pathname(struct svc_rqst *rqstp, struct kvec *first,
> >> +				size_t total)
> >> +{
> >> +	struct xdr_buf *arg = &rqstp->rq_arg;
> >> +	struct page **pages;
> >> +	char *result;
> >> +
> >> +	/* VFS API demands a NUL-terminated pathname. This function
> >> +	 * uses a page from @rqstp as the pathname buffer, to enable
> >> +	 * direct placement. Thus the total buffer size is PAGE_SIZE.
> >> +	 * Space in this buffer for NUL-termination requires that we
> >> +	 * cap the size of the returned symlink pathname just a
> >> +	 * little early.
> >> +	 */
> >> +	if (total > PAGE_SIZE - 1)
> >> +		return ERR_PTR(-ENAMETOOLONG);
> >> +
> >> +	/* Some types of transport can present the pathname entirely
> >> +	 * in rq_arg.pages. If not, then copy the pathname into one
> >> +	 * page.
> >> +	 */
> >> +	pages = arg->pages;
> >> +	WARN_ON_ONCE(arg->page_base != 0);
> >> +	if (first->iov_base == 0) {
> >> +		result = page_address(*pages);
> >> +		result[total] = '\0';
> >> +	} else {
> >> +		size_t len, remaining;
> >> +		char *dst;
> >> +
> >> +		result = page_address(*(rqstp->rq_next_page++));
> >> +		dst = result;
> >> +		remaining = total;
> >> +
> >> +		len = min_t(size_t, total, first->iov_len);
> >> +		memcpy(dst, first->iov_base, len);
> >> +		dst += len;
> >> +		remaining -= len;
> >> +
> >> +		/* No more than one page left */
> >> +		if (remaining) {
> >> +			len = min_t(size_t, remaining, PAGE_SIZE);
> >> +			memcpy(dst, page_address(*pages), len);
> >> +			dst += len;
> >> +		}
> >> +
> >> +		*dst = '\0';
> >> +	}
> >> +
> >> +	/* Sanity check: we don't allow the pathname argument to
> >> +	 * contain a NUL byte.
> >> +	 */
> >> +	if (strlen(result) != total)
> >> +		return ERR_PTR(-EINVAL);
> >> +	return result;
> >> +}
> >> +EXPORT_SYMBOL_GPL(svc_fill_symlink_pathname);
> 
> --
> Chuck Lever
> 
> 