Hi, Thanks for the review! On Mon, Apr 1, 2019 at 9:20 AM NeilBrown <neilb@xxxxxxxx> wrote: > > On Fri, Mar 29 2019, Murphy Zhou wrote: > > > After this commit > > f875a79 nfsd: allow nfsv3 readdir request to be larger. > > nfsv3 readdir request size can be larger than PAGE_SIZE. So if the > > request and the directory are large enough, we can run out of pages > > in rq_respages. Then the temporary page we are encoding on is a > > random page, Oops happen. > > > > Listing a directory with 30000 files in it can trigger the panic. > > > > Fixing this by ensuring the temporary page resides in rq_respages. > > And when counting how many bytes left currently from buffer to the > > end of the page which buffer is pointing to, aka len1, do not > > assume that curr_page_addr is at the beginning of the same page. > > Also, update resp->count by going through all pages. Because the > > pages may not be sequential, the old way is not safe. > > > > Fixes: f875a79 "nfsd: allow nfsv3 readdir request to be larger" > > Signed-off-by: Murphy Zhou <jencce.kernel@xxxxxxxxx> > > Hi, > thanks for finding these problems and submitting a patch. Clearly I > should have tested better :-( > > > --- > > fs/nfsd/nfs3proc.c | 17 +++++++++++++++-- > > fs/nfsd/nfs3xdr.c | 8 +++++--- > > 2 files changed, 20 insertions(+), 5 deletions(-) > > > > diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c > > index 8f933e84cec1..9bc32af4e2da 100644 > > --- a/fs/nfsd/nfs3proc.c > > +++ b/fs/nfsd/nfs3proc.c > > @@ -442,7 +442,9 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp) > > struct nfsd3_readdirargs *argp = rqstp->rq_argp; > > struct nfsd3_readdirres *resp = rqstp->rq_resp; > > __be32 nfserr; > > - int count; > > + int count = 0; > > + struct page **p; > > + caddr_t page_addr = NULL; > > > > dprintk("nfsd: READDIR(3) %s %d bytes at %d\n", > > SVCFH_fmt(&argp->fh), > > @@ -462,7 +464,18 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp) > > nfserr = nfsd_readdir(rqstp, &resp->fh, (loff_t*) &argp->cookie, > > &resp->common, nfs3svc_encode_entry); > > memcpy(resp->verf, argp->verf, 8); > > - resp->count = resp->buffer - argp->buffer; > > + count = 0; > > + for (p = rqstp->rq_respages + 1; p < rqstp->rq_next_page; p++) { > > + page_addr = page_address(*p); > > + > > + if (((caddr_t)resp->buffer >= page_addr) && > > + ((caddr_t)resp->buffer < page_addr + PAGE_SIZE)) { > > + count += (caddr_t)resp->buffer - page_addr; > > + break; > > + } > > + count += PAGE_SIZE; > > + } > > + resp->count = count >> 2; > > if (resp->offset) { > > loff_t offset = argp->cookie; > > This section makes perfect sense. You have copied code from > nfsd3_proc_readdirplus() int nfsd3_proc_readdir(). This is needed > because readdir doesn't limit replies to PAGE_SIZE any more. > > > > > diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c > > index 93fea246f676..1fabf1952bdb 100644 > > --- a/fs/nfsd/nfs3xdr.c > > +++ b/fs/nfsd/nfs3xdr.c > > @@ -953,6 +953,7 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > > break; > > } > > > > + > > Adding blank lines is best avoided. > > > if ((caddr_t)(cd->buffer + elen) < (curr_page_addr + PAGE_SIZE)) { > > /* encode entry in current page */ > > > > @@ -961,11 +962,12 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > > if (plus) > > p = encode_entryplus_baggage(cd, p, name, namlen, ino); > > num_entry_words = p - cd->buffer; > > - } else if (*(page+1) != NULL) { > > + } else if (*(page+1) != NULL && (page+1) < cd->rqstp->rq_next_page) { > > Adding the test against rq_next_page looks correct. > nfs3svc_decode_readdirplusargs increments ->rq_next_page over enough > pages to cover the request count. > However nfs4svc_decode_readdirargs doesn't! So it will not advance > rq_next_page properly. We need to fix that to make it like > nfs3svc_decode_readdirplusargs(). > > But I don't think this test should be needed. svc_alloc_arg() NULL- > terminates the ->rq_pages array which ->rq_next_page points in to. > > So I think this test is correct as it stands, but > nfs4svc_decode_readdirargs() needs to be fixed. Agreed. If we set the page pointers right. These checks wont be necessary. > > > /* temporarily encode entry into next page, then move back to > > * current and next page in rq_respages[] */ > > __be32 *p1, *tmp; > > int len1, len2; > > + caddr_t tmp_page_addr = NULL; > > > > /* grab next page for temporary storage of entry */ > > p1 = tmp = page_address(*(page+1)); > > @@ -977,7 +979,8 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > > > > /* determine entry word length and lengths to go in pages */ > > num_entry_words = p1 - tmp; > > - len1 = curr_page_addr + PAGE_SIZE - (caddr_t)cd->buffer; > > + tmp_page_addr = (caddr_t)((unsigned long)cd->buffer & PAGE_MASK); > > + len1 = tmp_page_addr + PAGE_SIZE - (caddr_t)cd->buffer; > > I think tmp_page_addr will always be the same as curr_page_addr. > At least, it will when nfs4svc_decode_readdirargs() is fixed. > > Could you please revert the changed you've made to nfs3xdr.c (keeping > the change to nfsd3_proc_readdir()), and fix nfs4svc_decode_readdirargs > to be more like nfs4svc_decode_readdirplusargs, and see if that fixed > the problem? Sure. A quick test on 3000 files passed. I'll post v2 if test on 30,000 files pass. Thanks very much for reviewing! M > > Thanks a lot, > NeilBrown > > > > if ((num_entry_words << 2) < len1) { > > /* the actual number of words in the entry is less > > * than elen and can still fit in the current page > > @@ -1026,7 +1029,6 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > > cd->buffer = p; > > cd->common.err = nfs_ok; > > return 0; > > - > > } > > > > int > > -- > > 2.21.0