On Sat, Nov 05, 2022 at 10:10:36AM +1100, Dave Chinner wrote:
> On Fri, Nov 04, 2022 at 01:08:50AM -0700, Christoph Hellwig wrote:
> > So, the whole scan for delalloc logic seems pretty generic, I think
> > it can and should be lifted to iomap, with
> > xfs_buffered_write_delalloc_punch provided as a callback.
>
> Maybe. When we get another filesystem that has the same problem with
> short writes needing to punch delalloc extents, we can look at
> lifting it into the generic code. But until then, it is exclusively
> an XFS issue...
>
> > As for the reuse of the seek hole / data helpers, I'm not sure
> > this actually helps all that much, and it certainly is not super
> > efficient. I don't want to directly talk you into rewriting this
> > once again, but a simple
>
> [snip]
>
> I started with the method you are suggesting, and it took me 4 weeks
> of fighting with boundary condition bugs before I realised there was
> a better way.
>
> Searching for sub-folio discontiguities is highly inefficient
> however you look at it - we have to scan dirty folios block by block
> to determine the uptodate state of each block. We can't do a range
> scan because is_partially_uptodate() will return false if any block
> within the range is not up to date. Hence we have to iterate one
> block at a time to determine the state of each block, and that
> greatly complicates things.

This sounds like a neat optimization for seek hole/data, but it's one
that can be deferred to a later cleanup.  As it is, this fix patchset
already introduces plenty to think about.

--D

> i.e. we now have range boundaries at the edges of the write() op,
> range boundaries at the edges of filesystem blocks, and range
> boundaries at unpredictable folio_size() edges. I couldn't keep all
> this straight in my head - I have to be able to maintain and debug
> this code, so if I can't track all the edge cases in my head, I sure
> as hell can't debug the code, nor expect to understand it when I
> next look at it in a few months' time.
>
> Just because one person is smart enough to be able to write code
> that uses multiply-nested range iterations full of boundary
> conditions that have to be handled correctly, it doesn't mean that
> it is the best way to write slow-path/error-handling code that *must
> be correct*. The best code is the code that anyone can understand
> and say "yes, that is correct".
>
> So, yes, using the seek hole / data helpers might be a tiny bit more
> code, but compactness, efficiency and speed really don't matter.
> What matters is that the code is correct and that the people who
> need to track down the bugs and data corruptions in this code are
> able to understand and debug it. i.e. to make the code maintainable
> we need to break the complex problems down into algorithms and code
> that can be understood and debugged by anyone, not just the smartest
> person in the room.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
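
For readers following along, here is a minimal sketch (kernel-style C,
not code from this series) of the block-at-a-time walk Dave describes
above.  Because a ranged uptodate query such as ->is_partially_uptodate()
only answers "are *all* blocks in this range uptodate?", finding where
uptodate data starts or ends inside a dirty folio means asking about one
filesystem block at a time; range_is_uptodate() below is an assumed
stand-in for such a ranged check, called here with a single-block range.

/*
 * Illustrative sketch only.  Walk [pos, end) one filesystem block at a
 * time and return the offset of the first block that is not uptodate,
 * i.e. the first position at which delalloc blocks would need punching.
 */
static loff_t next_non_uptodate_block(struct folio *folio, loff_t pos,
				      loff_t end, unsigned int blocksize)
{
	while (pos < end) {
		size_t off = offset_in_folio(folio, pos);

		/* range_is_uptodate() is a hypothetical single-block check */
		if (!range_is_uptodate(folio, off, blocksize))
			return pos;
		pos += blocksize;
	}
	return end;	/* every block in the range is uptodate */
}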