Christoph Hellwig <hch@xxxxxxxxxxxxx> writes: > On Tue, Nov 21, 2023 at 12:35:20AM +0530, Ritesh Harjani (IBM) wrote: >> - mmap path of ext2 continues to use generic_file_vm_ops > > So this means there are not space reservations taken at write fault > time. Yes. > While iomap was written with the assumption those exist, I can't > instantly spot anything that relies on them - you are just a lot more > likely to hit an -ENOSPC from ->map_blocks now. Which is also true with existing code no? If the block reservation is not done at the write fault, writeback is likely to fail due to ENOSPC? > Maybe we should add > an xfstests that covers the case where we use up more then the existing > free space through writable mmaps to ensure it fully works (where works > means at least as good as the old behavior)? > Sure, I can try write an fstests for the same. Also I did find an old thread which tried implementing ->page_mkwrite in ext2 [1]. However, it was rejected with following reason - """ Allocating blocks on write fault has the unwelcome impact that the file layout is likely going to be much worse (much more fragmented) - I remember getting some reports about performance regressions from users back when I did a similar change for ext3. And so I'm reluctant to change behavior of such an old legacy filesystem as ext2... """ https://lore.kernel.org/linux-ext4/20210105175348.GE15080@xxxxxxxxxxxxxx/ >> +static ssize_t ext2_buffered_write_iter(struct kiocb *iocb, >> + struct iov_iter *from) >> +{ >> + ssize_t ret = 0; >> + struct inode *inode = file_inode(iocb->ki_filp); >> + >> + inode_lock(inode); >> + ret = generic_write_checks(iocb, from); >> + if (ret > 0) >> + ret = iomap_file_buffered_write(iocb, from, &ext2_iomap_ops); >> + inode_unlock(inode); >> + if (ret > 0) >> + ret = generic_write_sync(iocb, ret); >> + return ret; >> +} > > Not for now, but if we end up doing a bunch of conversation of trivial > file systems, it might be worth to eventually add a wrapper for the > above code with just the iomap_ops passed in. Only once we see a few > duplicates, though. > Sure, make sense. Thanks! I can try and check if the the wrapper helps. >> +static int ext2_write_map_blocks(struct iomap_writepage_ctx *wpc, >> + struct inode *inode, loff_t offset) >> +{ >> + if (offset >= wpc->iomap.offset && >> + offset < wpc->iomap.offset + wpc->iomap.length) >> + return 0; >> + >> + return ext2_iomap_begin(inode, offset, inode->i_sb->s_blocksize, >> + IOMAP_WRITE, &wpc->iomap, NULL); >> +} > > Looking at gfs2 this also might become a pattern. But I'm fine with > waiting for now. > ok. >> @@ -1372,7 +1428,7 @@ void ext2_set_file_ops(struct inode *inode) >> if (IS_DAX(inode)) >> inode->i_mapping->a_ops = &ext2_dax_aops; >> else >> - inode->i_mapping->a_ops = &ext2_aops; >> + inode->i_mapping->a_ops = &ext2_file_aops; >> } > > Did yo run into issues in using the iomap based aops for the other uses > of ext2_aops, or are just trying to address the users one at a time? There are problems for e.g. for dir type in ext2. It uses the pagecache for dir. It uses buffer_heads and attaches them to folio->private. ...it uses block_write_begin/block_write_end() calls. Look for ext4_make_empty() -> ext4_prepare_chunk -> block_write_begin(). Now during sync/writeback of the dirty pages (ext4_handle_dirsync()), we might take a iomap writeback path (if using ext2_file_aops for dir) which sees folio->private assuming it is "struct iomap_folio_state". And bad things will happen... Now we don't have an equivalent APIs in iomap for block_write_begin()/end() which the users can call for. Hence, Jan suggested to lets first convert ext2 regular file path to iomap as an RFC. I shall now check for the dir type to see what changes we might require in ext2/iomap code. Thanks again for your review! -ritesh