On Thu, Nov 23, 2023 at 2:14 PM Cedric Blancher <cedric.blancher@xxxxxxxxx> wrote:
>
> On Thu, 23 Nov 2023 at 00:19, Rick Macklem <rick.macklem@xxxxxxxxx> wrote:
> >
> > On Wed, Nov 22, 2023 at 2:48 PM Cedric Blancher
> > <cedric.blancher@xxxxxxxxx> wrote:
> > >
> > > On Sun, 19 Nov 2023 at 19:02, Anna Schumaker <schumaker.anna@xxxxxxxxx> wrote:
> > > >
> > > > On Sun, Nov 19, 2023 at 12:59 PM Cedric Blancher
> > > > <cedric.blancher@xxxxxxxxx> wrote:
> > > > >
> > > > > On Sun, 19 Nov 2023 at 18:48, Anna Schumaker <schumaker.anna@xxxxxxxxx> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Sun, Nov 19, 2023 at 12:38 PM Cedric Blancher
> > > > > > <cedric.blancher@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > Good evening!
> > > > > > >
> > > > > > > How does READ_PLUS differ from READ? Has anyone made a simpler
> > > > > > > presentation (PowerPoint slides) than the RFCs?
> > > > > >
> > > > > > No slides, but at a high level READ_PLUS can compress out long ranges
> > > > > > of zeroes in a read reply by returning a HOLE segment instead of the
> > > > > > actual zeroes. It's perfectly valid for the server to skip the zero
> > > > > > detection and return everything as a data segment, however.
> > > > >
> > > > > So how do you differ between
> > > > > 1. a hole, aka no filesystem blocks allocated
> > > > > 2. a long sequence of valid data with all zero bytes in them
> > > >
> > > > That's up to the server! It could use something like fiemap or lseek
> > > > with SEEK_HOLE or SEEK_DATA. It could also scan the data to see if
> > > > there are any zeroes that could be compressed out.
> > >
> > > How can the client figure out whether the data in a READ_PLUS reply
> > > are zeros of data, or zeros from a hole?
> > As I understand the RFC, it cannot. Or put another way "a hole is a
> > region that reads as all 0s, which may or may not have allocated blocks
> > on the server file system".
> >
> > Although SEEK_HOLE typically returns the offset of an unallocated
> > region, I don't think either the POSIX draft (was it ever ratified?) nor
> > RFC7862 actually define a "hole" as an unallocated region.
>
> Opengroup ratified that one. See https://austingroupbugs.net/view.php?id=415
>
> >
> > On a similar vein, Deallocate can simply write 0s to the region.
> > (It does not actually have to "deallocate data blocks".)
> >
> > At least that is my understanding of POSIX and RFC7862, rick
>
> Can anyone please confirm that RFC7862 and READPLUS cannot distinguish
> between allocated and unallocated regions in a file?

The best place to ask this is the nfsv4@xxxxxxxx mailing list.
Alternatively, you could just read the words yourself...

Having said that, here are a few snippets of RFC7862 (neither of which
is in the READ_PLUS section):

In definitions...
   Hole: A byte range within a sparse file that contains all zeros. A
   hole might or might not have space allocated or reserved to it.

And in the section on DEALLOCATE...
   All further READs from the region passed to DEALLOCATE MUST return
   zeros until overwritten.
[irrelevant stuff snipped]
   Situations may arise where da_offset and/or da_offset + da_length
   will not be aligned to a boundary for which the server does
   allocations or deallocations. For most file systems, this is the
   block size of the file system. In such a case, the server can
   deallocate as many bytes as it can in the region. The blocks that
   cannot be deallocated MUST be zeroed.

Now, if the above is not enough to convince you that "hole" does not
necessarily imply "unallocated", then I suggest you read the RFC and
then ask on nfsv4@xxxxxxxx.
(Btw, the DEALLOCATE section uses the term "unreserved" and not "unallocated".)

I'll also admit I do not understand why you care.
Is there a Windows API that specifically returns unallocated regions of files?
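
To make the point concrete, here is a rough Python sketch. It is a toy model only, not NFS code and not how any particular server does it: the function names, the tuple layout for segments, and the min_hole threshold are all made up for illustration. It shows a server that scans for zero runs and one that skips zero detection, and that the client ends up with the same bytes either way, with nothing in the reply saying whether blocks were allocated.

```python
# Toy model of READ_PLUS-style segments (hypothetical names, not an NFS API).
# A segment is ("DATA", offset, bytes) or ("HOLE", offset, length).


def encode_by_scanning(buf, min_hole=4):
    """Server option 1: scan the bytes and report zero runs of at least
    min_hole bytes as HOLE segments. Allocation is never consulted."""
    segments = []
    data_start = 0  # start of the DATA segment not yet emitted
    i = 0
    n = len(buf)
    while i < n:
        if buf[i] == 0:
            j = i
            while j < n and buf[j] == 0:
                j += 1
            if j - i >= min_hole:
                if i > data_start:
                    segments.append(("DATA", data_start, buf[data_start:i]))
                segments.append(("HOLE", i, j - i))
                data_start = j
            i = j
        else:
            i += 1
    if n > data_start:
        segments.append(("DATA", data_start, buf[data_start:n]))
    return segments


def encode_as_data(buf):
    """Server option 2: skip zero detection, return one DATA segment."""
    return [("DATA", 0, buf)] if buf else []


def decode(segments):
    """Client side: a HOLE expands to zeros, whatever the server's reason
    for sending it (unallocated blocks or written zeros)."""
    out = bytearray()
    for kind, offset, payload in segments:
        assert offset == len(out), "segments must be contiguous"
        if kind == "HOLE":
            out += bytes(payload)  # payload is a length: that many zero bytes
        else:
            out += payload
    return bytes(out)
```

For example, with buf = b"ab" + bytes(8) + b"cd", decode(encode_by_scanning(buf)) and decode(encode_as_data(buf)) both give back buf, so the client cannot tell which strategy the server used.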

rick

>
> Ced
> --
> Cedric Blancher <cedric.blancher@xxxxxxxxx>
> [https://plus.google.com/u/0/+CedricBlancher/]
> Institute Pasteur
>