On Thu, Jun 22, 2023 at 09:59:29AM +1000, Dave Chinner wrote:
> Ah, you are testing pure overwrites, which means for ext4 the only
> thing it needs to care about is cached mappings. What happens when
> you add O_DSYNC here?

I think you mean O_SYNC, right? In a pure overwrite case, where all
of the extents are initialized and where the Oracle or DB2 server is
doing writes to preallocated, pre-initialized space in the tablespace
file followed by fdatasync(), there *are* no post-I/O data integrity
operations which are required.

If the file is opened O_SYNC or if the blocks were not preallocated
using fallocate(2) and not initialized ahead of time, then sure, we
can't use this optimization. However, the cases where database
workloads *are* doing overwrites and using fdatasync(2) most certainly
do exist, and the benefit of this optimization can be a 20% throughput
improvement. Which is nothing to sneeze at.

What we might do is to let the file system tell the iomap layer via a
flag that no post-I/O metadata operations are required, and then *if*
that flag is set, and *if* the inode has no pages in the page cache
(so there are no invalidate operations necessary), it should be safe
to skip using queue_work(). That way, the file system has to
affirmatively state that it is safe to skip the workqueue, so it
shouldn't do any harm to other file systems using the iomap DIO layer.

What am I missing?

Cheers,

					- Ted
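
For illustration only, here is a rough userspace model of the check
proposed above. It is a sketch, not kernel code: the flag name, the
struct, and both helper functions are hypothetical and do not exist in
the iomap layer. It only shows the intended logic, namely that the
workqueue is skipped when the file system has opted in *and* the inode
has no cached pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag (not an existing iomap flag): the file system sets
 * it to state that this write needs no post-I/O metadata operations. */
#define SKETCH_F_NO_POST_IO_WORK (1u << 0)

/* Toy stand-in for the state the DIO completion path would consult. */
struct sketch_dio {
	unsigned int map_flags;        /* flags reported by the file system */
	unsigned long nr_cached_pages; /* pages the inode has in the page cache */
};

/* Model of the file system's side: it opts in only for a pure
 * overwrite of already-initialized blocks on a file not opened O_SYNC. */
static unsigned int sketch_fs_map_flags(bool blocks_initialized, bool o_sync)
{
	if (blocks_initialized && !o_sync)
		return SKETCH_F_NO_POST_IO_WORK;
	return 0;
}

/* Model of the iomap side: true means the completion could run inline,
 * false means it would be deferred via queue_work(). */
static bool sketch_can_skip_workqueue(const struct sketch_dio *dio)
{
	assert(dio != NULL);

	/* The file system must affirmatively opt in. */
	if (!(dio->map_flags & SKETCH_F_NO_POST_IO_WORK))
		return false;

	/* Cached pages would need invalidation after the write. */
	if (dio->nr_cached_pages != 0)
		return false;

	return true;
}
```

Because the default (flag clear) always takes the workqueue path, a
file system that never sets the flag sees no change in behavior.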