> On Sep 4, 2020, at 10:03 AM, Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>
> On Fri, Sep 04, 2020 at 09:56:19AM -0400, Chuck Lever wrote:
>>
>>
>>> On Sep 4, 2020, at 9:52 AM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>>>
>>> On Tue, Sep 01, 2020 at 03:18:54PM -0400, J. Bruce Fields wrote:
>>>> On Tue, Sep 01, 2020 at 01:40:16PM -0400, Anna Schumaker wrote:
>>>>> On Tue, Sep 1, 2020 at 12:49 PM J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> On Mon, Aug 31, 2020 at 02:16:26PM -0400, Anna Schumaker wrote:
>>>>>>> On Fri, Aug 28, 2020 at 5:56 PM J. Bruce Fields <bfields@xxxxxxxxxx> wrote:
>>>>>>>> We really don't want to bother encoding small holes. I doubt
>>>>>>>> filesystems want to bother with them either. Do they give us any
>>>>>>>> guarantees as to the minimum size of a hole?
>>>>>>>
>>>>>>> The minimum size seems to be PAGE_SIZE from everything I've seen.
>>>>>>
>>>>>> OK, can we make that assumption explicit? It'd simplify stuff like
>>>>>> this.
>>>>>
>>>>> I'm okay with that, but it's technically up to the underlying filesystem.
>>>>
>>>> Maybe we should ask on linux-fsdevel.
>>>>
>>>> Maybe minimum hole length isn't the right question: suppose at time 1 a
>>>> file has a single hole at bytes 100-200, then it's modified so at time 2
>>>> it has a hole at bytes 50-150. If you lseek(fd, 0, SEEK_HOLE) at time
>>>> 1, you'll get 100. Then if you lseek(fd, 100, SEEK_DATA) at time 2,
>>>> you'll get 150. So you'll encode a 50-byte hole in the READ_PLUS reply
>>>> even though the file never had a hole smaller than 100 bytes.
>>>>
>>>> Minimum hole alignment might be the right idea.
>>>>
>>>> If we can't get that: maybe just teach encode_read to stop when it
>>>> *either* returns maxcount worth of file data (and holes) *or* maxcount
>>>> of encoded xdr data, just to prevent a weird filesystem from triggering
>>>> a bug.
>>>
>>> Alternatively, if it's easier, we could enforce a minimum alignment by
>>> rounding up the result of SEEK_HOLE to the nearest multiple of (say) 512
>>> bytes, and rounding down the result of SEEK_DATA.
>>
>> Perhaps it goes without saying, but is there an effort to
>> ensure that the set of holes is represented in exactly the
>> same way when accessing a file via READ_PLUS and
>> SEEK_DATA/HOLE?
>
> So you're thinking of something like a pynfs test that creates a file
> with holes and then tries reading through it with READ_PLUS and SEEK and
> comparing the results?

I hadn't considered a particular test platform, but yes, a regression
test like that would be appropriate.

> There are lots of legitimate reasons that test might "fail"--servers
> aren't required to support holes at all, and have a lot of latitude
> about how to report them.

Agreed that the test would need to account for server support for holes.

> But it might be a good idea to test anyway.

My primary concern is that the result of a file copy operation should
look the same on NFS/TCP (with READ_PLUS) and NFS/RDMA (with
SEEK_DATA/HOLE).

--
Chuck Lever
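
For reference, the rounding idea quoted above (round the SEEK_HOLE result
up, round the SEEK_DATA result down) could be sketched roughly as below.
This is only an illustration in Python, not the nfsd code: the 512-byte
alignment is the "(say) 512" figure from the thread, and the names ALIGN,
round_up, round_down and aligned_hole are made up for the example. It
assumes that a hole which shrinks to nothing after rounding is simply
reported as data.

```python
# Hypothetical sketch of "minimum hole alignment"; names are illustrative.
ALIGN = 512  # assumed alignment, taken from the "(say) 512 bytes" suggestion


def round_up(offset, align=ALIGN):
    """Round offset up to the next multiple of align."""
    return -(-offset // align) * align


def round_down(offset, align=ALIGN):
    """Round offset down to the previous multiple of align."""
    return (offset // align) * align


def aligned_hole(hole_start, data_start, align=ALIGN):
    """Shrink the hole [hole_start, data_start) to alignment boundaries.

    hole_start stands in for an lseek(fd, pos, SEEK_HOLE) result and
    data_start for the following lseek(fd, hole_start, SEEK_DATA) result.
    Returns (start, end), or None when no aligned hole is left, in which
    case the caller would treat the whole range as data (an assumption).
    """
    start = round_up(hole_start, align)
    end = round_down(data_start, align)
    if end <= start:
        return None
    return (start, end)


# The 50-byte hole from the example earlier in the thread disappears:
print(aligned_hole(100, 150))   # None
# A larger unaligned hole is trimmed at both ends:
print(aligned_hole(100, 2000))  # (512, 1536)
```

Since rounding only ever shrinks a hole, any hole that survives starts and
ends on a multiple of the alignment and is therefore at least that long.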