On Fri, Nov 15, 2024 at 07:01:58AM +1100, Dave Chinner wrote:
> On Thu, Nov 14, 2024 at 01:04:31PM -0500, Brian Foster wrote:
> > On Thu, Nov 14, 2024 at 10:34:26AM +0800, Long Li wrote:
> > > On Wed, Nov 13, 2024 at 11:13:49AM -0500, Brian Foster wrote:
> > ISTM that for the above merge scenario to happen we'd either need
> > writeback of the thread 1 write to race just right with the thread 2
> > write, or have two writeback cycles where the completion of the first is
> > still pending by the time the second completes. Either of those seem far
> > less likely than either writeback seeing i_size == 8k from the start, or
> > the thread 2 write completing sometime after the thread 1 ioend has
> > already been completed. Hm?
> 
> I think that this should be fairly easy to trigger with concurrent
> sub-block buffered writes to O_DSYNC|O_APPEND opened files. The fact
> we drop the IOLOCK before calling generic_write_sync() to flush the
> data pretty much guarantees that there will be IO in flight whilst
> other write() calls have extended the file in memory and are then
> waiting for the current writeback on the folio to complete before
> submitting their own writeback IO.
> 

Hmmm.. that kind of sounds like the opposite to me. Note that the
example given was distinctly not append-only, since that is what allows
multiple technically "merge-worthy" ioends to be in completion at the
same time. If you're doing O_DSYNC|O_APPEND sub-block buffered writes,
then by definition there is going to be folio overlap across writes,
and we don't submit a dirty && writeback folio for writeback until the
preexisting writeback state clears. So ISTM that creates a situation
where even if the append I/O is multithreaded, you'll just be more
likely to lockstep across writebacks and never submit the second ioend
before the first completes. Hm?

That said, we're kind of getting in the weeds here.. I don't think it
matters that much whether these ioends merge. I'm basically throwing
out the proposition that maybe it's not worth the additional code to
ensure that they always do. I.e., suppose we opted to trim the io_size
when appropriate based on i_size so the completion-side size update is
accurate, but otherwise just left off the rounding helper thing and let
the existing code work as it is.

Would that lack of guaranteed block alignment have any practical impact
on functionality or performance? ISTM the add_to_ioend() part is
superfluous, so it probably wouldn't have much effect on I/O sizes, if
any. The skipped merge example Long Li gives seems technically
possible, but I'm not aware of a workload where that would occur
frequently enough that failing to merge such an ioend would have a
noticeable impact.. hm?

Again.. not something I care tremendously about, just trying to keep
things simple if possible. If it were me and there's not something we
can point to that obviously breaks, I'd start with that, and then if it
does prove wrong, it's simple enough to fix by tacking on the rounding
thing later. Just my .02.

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 
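
For illustration, here is a minimal sketch of the "trim the io_size
based on i_size" idea referred to above. It is a sketch only, not the
actual patch: trim_ioend_to_isize() is a made-up helper name, and the
real change may clamp the size at a different point in the writeback
path, though io_offset, io_size and i_size_read() follow the existing
iomap/VFS naming.

#include <linux/fs.h>
#include <linux/iomap.h>

/*
 * Minimal, illustrative sketch (not the actual patch): clamp the ioend
 * size to the in-core i_size visible at submission time so the
 * completion-side size update cannot publish a size beyond the range
 * this ioend actually wrote.  Assumes io_offset < i_size, which holds
 * for an ioend submitted against data inside EOF.
 */
static void trim_ioend_to_isize(struct inode *inode,
				struct iomap_ioend *ioend)
{
	loff_t isize = i_size_read(inode);

	if (ioend->io_offset + ioend->io_size > isize)
		ioend->io_size = isize - ioend->io_offset;
}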
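
And for context on why the lack of block alignment affects merging at
all: completion-time merging requires adjacent ioends to be exactly
contiguous. The check below is a simplified sketch, roughly modeled on
the iomap_ioend_can_merge() logic in fs/iomap/buffered-io.c and
paraphrased from memory rather than quoted verbatim; the final
contiguity test is the one an i_size-trimmed, unaligned io_size would
fail against a block-aligned successor.

#include <linux/iomap.h>

/*
 * Simplified sketch of the completion-time merge check (paraphrased,
 * not verbatim).  Two ioends only merge if they have compatible flags
 * and type and if the first ends exactly where the second begins, so
 * an ioend whose io_size was trimmed to an unaligned i_size won't merge
 * with the next one.
 */
static bool ioend_can_merge_sketch(struct iomap_ioend *ioend,
				   struct iomap_ioend *next)
{
	if ((ioend->io_flags & IOMAP_F_SHARED) ^
	    (next->io_flags & IOMAP_F_SHARED))
		return false;
	if ((ioend->io_type == IOMAP_UNWRITTEN) ^
	    (next->io_type == IOMAP_UNWRITTEN))
		return false;
	if (ioend->io_offset + ioend->io_size != next->io_offset)
		return false;
	return true;
}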