Hi all, this series overhaults a large chunk of the iomap writeback code with the end result that ->map_blocks can now map multiple blocks at a time, at least as long as they are all inside the same folio. On a sufficiently large system (32 cores in my case) this significantly reduces CPU usage for buffered write workloads on xfs, with a very minor improvement in write bandwith that might be within the measurement tolerance. e.g. on a fio sequential write workload using io_uring I get these values (median out of 5 runs): before: cpu : usr=5.26%, sys=4.81%, ctx=4009750, majf=0, minf=13 WRITE: bw=1096MiB/s (1150MB/s), 1096MiB/s-1096MiB/s (1150MB/s-1150MB/s), io=970GiB (1042GB), run=906036-906036msec with this series: cpu : usr=4.95%, sys=2.72%, ctx=4084578, majf=0, minf=12 WRITE: bw=1111MiB/s (1165MB/s), 1111MiB/s-1111MiB/s (1165MB/s-1165MB/s), io=980GiB (1052GB), run=903234-903234msec On systems with a small number of cores the cpu usage reduction is much lower and barely visible. Changes since RFC: - various commit message typo fixes - minor formatting fixes - keep the PF_MEMALLOC check and move it to iomap_writepages - rename the offset argument to iomap_can_add_to_ioend to pos - fix missing error handling in an earlier patch (only required for bisection, no change to the end result) - remove a stray whitespace - refactor ifs_find_dirty_range a bit to make it more readable - add a patch to pass the dirty_len to the file system to make life for ext2 easier Diffstat: block/fops.c | 2 fs/gfs2/bmap.c | 2 fs/iomap/buffered-io.c | 576 +++++++++++++++++++++++-------------------------- fs/xfs/xfs_aops.c | 9 fs/zonefs/file.c | 3 include/linux/iomap.h | 19 + 6 files changed, 306 insertions(+), 305 deletions(-)