Re: [PATCH 4/7] xfs: buffered write failure should not truncate the page cache

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 3 Nov 2022 07:57:07 +1100

On Wed, Nov 02, 2022 at 12:24:42AM -0700, Christoph Hellwig wrote:
> Just a little dumb question while I'm wrapping my head around the change
> here - why do we even punch the pagecache to start with?

We don't. It's just wrong because it assumes that the write() owns
the range of the page cache and it only contains no non-zero data
because the write allocated the delalloc range and therefore the
write was into a hole and therefore, by definition, it contains no
non-zero data.

Hence if the write is short, we punched out the page cache under the
assumption that it will only remove cached zeroes from the cache. If
those zeroes are dirty for some reason (zeroing prior to the iomap
hole/unwritten detection?) we don't need to write them and have to
be removed from the page caceh before we punch out the underlying
delalloc extent.

Unfortunately, this assumption has always been compeltely invalid
because both writeback and mmap page faults access to the page cache
can race with write()...

> As long as the
> regions that we failed to write to aren't marked uptdate (at the page
> or block level for sub-block writes), who cares if they remain in the
> page cache for now?

Exactly - that's the premise this patchset is based on - we only
need to care about dirty pages across the range of the delalloc
extent, and nothing else matters as it will be properly instantiated
with new delalloc backing if it gets dirtied in future...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx