On Wed, Dec 06, 2023 at 11:00:32AM +0000, David Howells wrote: > Hi, > > I've been debugging apparent cifs failures with xfstests, in particular > generic/009, and I'm finding that the tests are failing because FIEMAP is not > returning exactly the expected extent map. > > The problem is that the FSCTL_QUERY_ALLOCATED_RANGES smb RPC op can only > return a list of ranges that are allocated and does not return any other > information about those allocations or the gaps between them - and thus FIEMAP > cannot express this information to the extent that the test expects. <shrug> Perhaps that simply makes FSCTL_QUERY_ALLOCATED_RANGES -> FIEMAP translation a poor choice? FIEMAP doesn't have a way to say "written status unknown". > Further, as Steve also observed, the expectation that the individual subtests > should return exactly those ranges is flawed. The filesystem is at liberty to > split extents, round up extents, bridge extents and automatically punch out > blocks of zeros. xfstests/common/punch allows for some of this, but I wonder > if it needs to be more fuzzy. > > I wonder if the best xfstests can be expected to check is that the data we > have written is within the allocated regions. I think the only expectation that generic/shared tests can have is that file ranges they've written must not be reported as SEEK_HOLE. The ranges reported by SEEK_DATA must include the file ranges written by application software, but the data ranges can be encompass more range than that. > Which brings me on to FALLOC_FL_ZERO_RANGE - is this guaranteed to result in > an allocated region (if successful)? Yes, that's the distinction between ZERO and PUNCH. > Samba is translating FSCTL_SET_ZERO_DATA > to FALLOC_FL_PUNCH_HOLE, as is ksmbd, and then there is no allocated range to What does the FSCTL_SET_ZERO_DATA documentation say about the state of the file range after a successful operation? Oh. Heh. According to: https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ni-winioctl-fsctl_set_zero_data "If you use the FSCTL_SET_ZERO_DATA control code to write zeros (0) to a sparse file and the zero (0) region is large enough, the file system may not allocate disk space. "If you use the FSCTL_SET_ZERO_DATA control code to write zeros (0) to a non-sparse file, zeros (0) are written to the file. The system allocates disk storage for all of the zero (0) range, which is equivalent to using the WriteFile function to write zeros (0) to a file. > report back (Samba and ksmbd use SEEK_HOLE/SEEK_DATA rather than FIEMAP - > would a ZERO_RANGE even show up with that?). That depends on the local disk's implementation of lseek and ZERO_RANGE. XFS, for example, implements ZERO_RANGE by unmapping the entire range and then reallocating it with an unwritten extent. There's no reason why it couldn't also issue a WRITE_SAME to storage and change the mapping state to written. The user-visible behavior would be the same (reads return zeroes, space is allocated). However. XFS' SEEK_DATA implementation (aka iomap's) skips over parts of unwritten extents if there isn't a folio in the page cache. If some day the implementation were adjusted to do that WRITE_SAME thing I mentioned, then SEEK_DATA would return the entire range as data regardless of pagecache state. This difference between SEEK_DATA and FIEMAP has led to data corruption problems in the past, because unwritten extents as reported by FIEMAP can have dirty page cache sitting around. SEEK_DATA reports the dirty pages as data; FIEMAP is silent. > Finally, should the Linux cifs filesystem translate gaps in the result of > FSCTL_QUERY_ALLOCATED_RANGES into 'unwritten' extents rather than leaving them > as gaps in the list (to be reported as holes by xfs_io)? This smacks a bit of > adjusting things for the sake of making the testsuite work when the testsuite > isn't quite compatible with the thing being tested. That doesn't make sense to me. > So: > > - Should Samba and ksmbd be using FALLOC_FL_ZERO_RANGE rather than > PUNCH_HOLE? Probably depends on whether or not they present unix files as sparse or non-sparse to Windows? > - Should Samba and ksmbd be using FIEMAP rather than SEEK_DATA/HOLE? No. > - Should xfstests be less exacting in its FIEMAP analysis - or should this be > skipped for cifs? I don't want to skip generic/009 as it checks some > corner cases that need testing, but it may not be possible to make the > exact extent matching work. It's a big lift but I think the generic fstests need to be reworked to FIEMAP-check only the file ranges that it actually wrote. Those can't be SEEK_HOLEs. --D > > Thanks, > David > > >