On Sat, Nov 05, 2022 at 10:10:36AM +1100, Dave Chinner wrote:
> On Fri, Nov 04, 2022 at 01:08:50AM -0700, Christoph Hellwig wrote:
> > So, the whole scan for delalloc logic seems pretty generic, I think
> > it can and should be lifted to iomap, with
> > xfs_buffered_write_delalloc_punch provided as a callback.
>
> Maybe. When we get another filesystem that has the same problem with
> short writes needing to punch delalloc extents, we can look at
> lifting it into the generic code. But until then, it is exclusively
> an XFS issue...
>
> > As for the reuse of the seek hole / data helpers, I'm not sure
> > this actually helps all that much, and it certainly is not super
> > efficient. I don't want to directly talk you into rewriting this
> > once again, but a simple
>
> [snip]
>
> I started with the method you are suggesting, and it took me 4 weeks
> of fighting with boundary condition bugs before I realised there was
> a better way.
>
> Searching for sub-folio discontiguities is highly inefficient
> however you look at it - we have to scan dirty folios block by block
> to determine the uptodate state of each block. We can't do a range
> scan because is_partially_uptodate() will return false if any block
> within the range is not up to date. Hence we have to iterate one
> block at a time to determine the state of each block, and that
> greatly complicates things.

This sounds like a neat optimization for seek hole/data, but it's one
that can be deferred to a later cleanup.  As it is, this fix patchset
already introduces plenty to think about.

--D

> i.e. we now have range boundaries at the edges of the write() op,
> range boundaries at the edges of filesystem blocks, and range
> boundaries at unpredictable folio_size() edges. I couldn't keep all
> this straight in my head - I have to be able to maintain and debug
> this code, so if I can't track all the edge cases in my head, I sure
> as hell can't debug the code, nor expect to understand it when I
> next look at it in a few months' time.
>
> Just because one person is smart enough to be able to write code
> that uses multiply-nested range iterations full of boundary
> conditions that have to be handled correctly, it doesn't mean that
> it is the best way to write slow-path/error-handling code that *must
> be correct*. The best code is the code that anyone can understand
> and say "yes, that is correct".
>
> So, yes, using the seek hole / data helpers might be a tiny bit more
> code, but compactness, efficiency and speed really don't matter.
> What matters is that the code is correct and that the people who
> need to track down the bugs and data corruptions in this code are
> able to understand and debug it. i.e. to make the code maintainable
> we need to break the complex problems down into algorithms and code
> that can be understood and debugged by anyone, not just the smartest
> person in the room.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
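
For readers following along, here is a minimal sketch (kernel-style C,
not code from this series) of the block-at-a-time walk Dave describes
above.  Because a ranged uptodate query such as ->is_partially_uptodate()
only answers "are *all* blocks in this range uptodate?", finding where
uptodate data starts or ends inside a dirty folio means asking about one
filesystem block at a time; range_is_uptodate() below is an assumed
stand-in for such a ranged check, called here with a single-block range.

/*
 * Illustrative sketch only.  Walk [pos, end) one filesystem block at a
 * time and return the offset of the first block that is not uptodate,
 * i.e. the first position at which delalloc blocks would need punching.
 */
static loff_t next_non_uptodate_block(struct folio *folio, loff_t pos,
				      loff_t end, unsigned int blocksize)
{
	while (pos < end) {
		size_t off = offset_in_folio(folio, pos);

		/* range_is_uptodate() is a hypothetical single-block check */
		if (!range_is_uptodate(folio, off, blocksize))
			return pos;
		pos += blocksize;
	}
	return end;	/* every block in the range is uptodate */
}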