On Fri, Mar 29 2019, Murphy Zhou wrote: > After this commit > f875a79 nfsd: allow nfsv3 readdir request to be larger. > nfsv3 readdir request size can be larger than PAGE_SIZE. So if the > request and the directory are large enough, we can run out of pages > in rq_respages. Then the temporary page we are encoding on is a > random page, Oops happen. > > Listing a directory with 30000 files in it can trigger the panic. > > Fixing this by ensuring the temporary page resides in rq_respages. > And when counting how many bytes left currently from buffer to the > end of the page which buffer is pointing to, aka len1, do not > assume that curr_page_addr is at the beginning of the same page. > Also, update resp->count by going through all pages. Because the > pages may not be sequential, the old way is not safe. > > Fixes: f875a79 "nfsd: allow nfsv3 readdir request to be larger" > Signed-off-by: Murphy Zhou <jencce.kernel@xxxxxxxxx> Hi, thanks for finding these problems and submitting a patch. Clearly I should have tested better :-( > --- > fs/nfsd/nfs3proc.c | 17 +++++++++++++++-- > fs/nfsd/nfs3xdr.c | 8 +++++--- > 2 files changed, 20 insertions(+), 5 deletions(-) > > diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c > index 8f933e84cec1..9bc32af4e2da 100644 > --- a/fs/nfsd/nfs3proc.c > +++ b/fs/nfsd/nfs3proc.c > @@ -442,7 +442,9 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp) > struct nfsd3_readdirargs *argp = rqstp->rq_argp; > struct nfsd3_readdirres *resp = rqstp->rq_resp; > __be32 nfserr; > - int count; > + int count = 0; > + struct page **p; > + caddr_t page_addr = NULL; > > dprintk("nfsd: READDIR(3) %s %d bytes at %d\n", > SVCFH_fmt(&argp->fh), > @@ -462,7 +464,18 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp) > nfserr = nfsd_readdir(rqstp, &resp->fh, (loff_t*) &argp->cookie, > &resp->common, nfs3svc_encode_entry); > memcpy(resp->verf, argp->verf, 8); > - resp->count = resp->buffer - argp->buffer; > + count = 0; > + for (p = rqstp->rq_respages + 1; p < rqstp->rq_next_page; p++) { > + page_addr = page_address(*p); > + > + if (((caddr_t)resp->buffer >= page_addr) && > + ((caddr_t)resp->buffer < page_addr + PAGE_SIZE)) { > + count += (caddr_t)resp->buffer - page_addr; > + break; > + } > + count += PAGE_SIZE; > + } > + resp->count = count >> 2; > if (resp->offset) { > loff_t offset = argp->cookie; This section makes perfect sense. You have copied code from nfsd3_proc_readdirplus() int nfsd3_proc_readdir(). This is needed because readdir doesn't limit replies to PAGE_SIZE any more. > > diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c > index 93fea246f676..1fabf1952bdb 100644 > --- a/fs/nfsd/nfs3xdr.c > +++ b/fs/nfsd/nfs3xdr.c > @@ -953,6 +953,7 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > break; > } > > + Adding blank lines is best avoided. > if ((caddr_t)(cd->buffer + elen) < (curr_page_addr + PAGE_SIZE)) { > /* encode entry in current page */ > > @@ -961,11 +962,12 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > if (plus) > p = encode_entryplus_baggage(cd, p, name, namlen, ino); > num_entry_words = p - cd->buffer; > - } else if (*(page+1) != NULL) { > + } else if (*(page+1) != NULL && (page+1) < cd->rqstp->rq_next_page) { Adding the test against rq_next_page looks correct. nfs3svc_decode_readdirplusargs increments ->rq_next_page over enough pages to cover the request count. However nfs4svc_decode_readdirargs doesn't! So it will not advance rq_next_page properly. We need to fix that to make it like nfs3svc_decode_readdirplusargs(). But I don't think this test should be needed. svc_alloc_arg() NULL- terminates the ->rq_pages array which ->rq_next_page points in to. So I think this test is correct as it stands, but nfs4svc_decode_readdirargs() needs to be fixed. > /* temporarily encode entry into next page, then move back to > * current and next page in rq_respages[] */ > __be32 *p1, *tmp; > int len1, len2; > + caddr_t tmp_page_addr = NULL; > > /* grab next page for temporary storage of entry */ > p1 = tmp = page_address(*(page+1)); > @@ -977,7 +979,8 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > > /* determine entry word length and lengths to go in pages */ > num_entry_words = p1 - tmp; > - len1 = curr_page_addr + PAGE_SIZE - (caddr_t)cd->buffer; > + tmp_page_addr = (caddr_t)((unsigned long)cd->buffer & PAGE_MASK); > + len1 = tmp_page_addr + PAGE_SIZE - (caddr_t)cd->buffer; I think tmp_page_addr will always be the same as curr_page_addr. At least, it will when nfs4svc_decode_readdirargs() is fixed. Could you please revert the changed you've made to nfs3xdr.c (keeping the change to nfsd3_proc_readdir()), and fix nfs4svc_decode_readdirargs to be more like nfs4svc_decode_readdirplusargs, and see if that fixed the problem? Thanks a lot, NeilBrown > if ((num_entry_words << 2) < len1) { > /* the actual number of words in the entry is less > * than elen and can still fit in the current page > @@ -1026,7 +1029,6 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > cd->buffer = p; > cd->common.err = nfs_ok; > return 0; > - > } > > int > -- > 2.21.0
Attachment:
signature.asc
Description: PGP signature