Re: [PATCH] exfat: fix file not locking when writing zeros in exfat_file_mmap()

Dave Chinner <david@xxxxxxxxxxxxx> · Sat, 27 Jan 2024 09:32:42 +1100

On Fri, Jan 26, 2024 at 02:54:24AM +0000, Matthew Wilcox wrote:
> On Fri, Jan 26, 2024 at 12:22:32PM +1100, Dave Chinner wrote:
> > On Thu, Jan 25, 2024 at 07:19:45PM +0900, Namjae Jeon wrote:
> > > We need to consider the case that mmap against files with different
> > > valid size and size created from Windows. So it needed to zero out in mmap.
> > 
> > That's a different case - that's a "read from a hole" case, not a
> > "extending truncate" case. i.e. the range from 'valid size' to EOF
> > is a range where no data has been written and so contains zeros.
> > It is equivalent to either a hole in the file (no backing store) or
> > an unwritten range (backing store instantiated but marked as
> > containing no valid data).
> > 
> > When we consider this range as "reading from a hole/unwritten
> > range", it should become obvious the correct way to handle this case
> > is the same as every other filesystem that supports holes and/or
> > unwritten extents: the page cache page gets zeroed in the
> > readahead/readpage paths when it maps to a hole/unwritten range in
> > the file.
> > 
> > There's no special locking needed if it is done this way, and
> > there's no need for special hooks anywhere to zero data beyond valid
> > size because it is already guaranteed to be zeroed in memory if the
> > range is cached in the page cache.....
> 
> but the problem is that Microsoft half-arsed their support for holes.
> See my other mail in this thread.

Why does that matter?  It's exactly the same problem with any other
filesystem that doesn't support sparse files.

All I said is that IO operations beyond the "valid size" should
be treated like operating in a hole - I pass no judgement on the
filesystem design, implementation or level of sparse file support
it has. All it needs to do is treat the "not valid" size range as if
it was a hole or unwritten, regardless of whether the file is sparse
or not....
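
To illustrate the read side, here is a minimal userspace sketch of
"treat everything past valid size as a hole" when filling a page. It
is a toy model, not exfat or page cache code: struct toy_file,
toy_read_page() and PAGE_SZ are hypothetical names made up for this
example, and the "disk" is just a byte array that may hold stale data
beyond valid_size.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 4096

/* Hypothetical toy model of a file; not exfat's real structures. */
struct toy_file {
	const unsigned char *disk;	/* backing store, stale past valid_size */
	size_t valid_size;		/* bytes that have actually been written */
	size_t size;			/* EOF, always >= valid_size */
};

/*
 * Fill the in-memory page for page index idx. Bytes below valid_size
 * are copied from the backing store; everything else in the page is
 * zeroed, exactly as if the page mapped to a hole or unwritten range.
 */
void toy_read_page(const struct toy_file *f, size_t idx, unsigned char *page)
{
	size_t start = idx * PAGE_SZ;
	size_t valid = 0;

	if (start < f->valid_size) {
		valid = f->valid_size - start;
		if (valid > PAGE_SZ)
			valid = PAGE_SZ;
		memcpy(page, f->disk + start, valid);
	}

	/* Never expose stale on-disk bytes beyond valid_size. */
	memset(page + valid, 0, PAGE_SZ - valid);
}
```

Because the zeroing happens while the page is being populated, nothing
in this model has to run at mmap() time and no extra locking is
involved beyond whatever protects the page fill itself.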

> truncate the file up to 4TB
> write a byte at offset 3TB
> 
> ... now we have to stream 3TB of zeroes through the page cache so that
> we can write the byte at 3TB.

This behaviour cannot be avoided on filesystems without sparse file
support - the hit of writing zeroes has to be taken somewhere. We
can handle this in truncate(), the write() path or in ->page_mkwrite
*if* the zeroing condition is hit.  There's no need to do it at
mmap() time if that range of the file is not actually written to by
the application...
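
As a rough sketch of taking the zeroing hit only when a write actually
lands beyond valid size, consider the toy model below. Again this is
userspace illustration only: struct lazy_file and lazy_write() are
hypothetical names, the backing store is a plain buffer, and writes
that would extend the file are simply rejected to keep it short.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of a file; not exfat's real structures. */
struct lazy_file {
	unsigned char *disk;	/* backing store, stale past valid_size */
	size_t valid_size;	/* bytes that have actually been written */
	size_t size;		/* EOF, always >= valid_size */
};

/*
 * Write len bytes at pos. If the write starts beyond valid_size, the
 * gap [valid_size, pos) is zeroed first. Returns the number of bytes
 * zeroed, or -1 if the write does not fit inside the file.
 */
long lazy_write(struct lazy_file *f, size_t pos, const void *buf, size_t len)
{
	long zeroed = 0;

	if (pos > f->size || len > f->size - pos)
		return -1;	/* toy model: no extending writes */

	if (pos > f->valid_size) {
		/* The zeroing condition is hit: pay for the gap now. */
		memset(f->disk + f->valid_size, 0, pos - f->valid_size);
		zeroed = (long)(pos - f->valid_size);
	}

	memcpy(f->disk + pos, buf, len);
	if (pos + len > f->valid_size)
		f->valid_size = pos + len;

	return zeroed;
}
```

A write inside the valid range zeroes nothing, and a file that is
mapped but never written beyond valid size never pays the cost at all.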

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx