> On Mar 3, 2021, at 11:52 AM, Bruce Fields <bfields@xxxxxxxxxxxx> wrote: > > On Wed, Mar 03, 2021 at 03:43:28PM +0000, Chuck Lever wrote: >> Hi Bruce- >> >> Thanks for your careful review of this series! >> >> >>> On Mar 2, 2021, at 5:11 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote: >>> >>> On Mon, Mar 01, 2021 at 10:17:13AM -0500, Chuck Lever wrote: >>>> The description of commit 2825a7f90753 ("nfsd4: allow encoding >>>> across page boundaries") states: >>>> >>>>> Also we can't handle a new operation starting close to the end of >>>>> a page. >>>> >>>> But does not detail why this is the case. >>> >>> That wasn't every helpful of me, sorry. >>> >>>> Subtracting the scratch buffer's "shift" value from the remaining >>>> stream space seems to make reserving space close to the end of the >>>> buf->pages array reliable. >>> >>> So, why is that? >>> >>> Thinking this through: >>> >>> When somebody asks for a buffer that would straddle a page boundary, >>> with frag1bytes at the end of this page and frag2bytes at the start of >>> the next page, we instead give them a buffer starting at start of the >>> next page. That gives them a nice contiguous buffer to write into. >>> When they're done using it, we fix things up by copying what they wrote >>> back to where it should be. >>> >>> That means we're temporarily wasting frag1bytes of space. So, I don't >>> know, maybe that's the logic behind subtracing frag1bytes from >>> space_left. >>> >>> It means you may end up with xdr->end frag1bytes short of the next page. > > Wait, let me try that again: > > p = page_address(*xdr->page_ptr); > xdr->p = (void *)p + frag2bytes; > space_left = xdr->buf->buflen - xdr->buf->len - frag1bytes; > xdr->end = (void *)p + min_t(int, space_left, PAGE_SIZE); > > If you've still got a lot of buffer space left, then that'll put > xdr->end frag2bytes past the end of a page, won't it? So far I haven't heard any complaints from the kernel. I'll instrument the code and see what comes up. >>> I'm not sure that's right. >> >> Why would that not be OK? the next call to xdr_get_next_encode_buffer() >> should do the right thing and bounce the new encoded data from the >> next page into this one again. >> >> So far I have not encountered any problems. Would such a problem show >> up with some frequency under normal use, or would it be especially >> subtle? > > I mainly just want to make sure we've got a coherent idea what this code > is doing.... Agreed, that's a good thing. > For testing: large replies that aren't just read data are readdir and > getacl. So reading large directories with lots of variably-named files > might be good. Also pynfs could easily send a single compound with lots > of variable-sized reads, that might be interesting. I've run the full git regression suite over NFSv3 many many times with this patch applied. That includes unpacking via tar, a full build from scratch, and then running thousands of individual tests. So that doesn't exercise a particular corner case, but it does reflect a broad variety of directory operations. > Constructing a compound that will result in xdr_reserve_space calls that > exactly hit the various corner cases may be hard. But maybe if we just > send a bunch of compounds that vary over some range we can still > guarantee hitting those cases. > > --b. -- Chuck Lever