Some of the higher layers like iomap takes inode_lock() when calling generic_write_sync(). Also writeback already happens from other paths without inode lock, so it's difficult to say that we really need sync_mapping_buffers() to take any inode locking here. Having said that, let's add generic_buffer_fsync() implementation in buffer.c with no inode_lock/unlock() for now so that filesystems like ext2 and ext4's nojournal mode can use it. Ext4 when got converted to iomap for direct-io already copied it's own variant of __generic_file_fsync() without lock. Hence let's add a helper API and use it both in ext2 and ext4. Later we can review other filesystems as well to see if we can make generic_buffer_fsync() which does not take any inode_lock() as the default path. Tested-by: Disha Goel <disgoel@xxxxxxxxxxxxx> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@xxxxxxxxx> --- fs/buffer.c | 43 +++++++++++++++++++++++++++++++++++++ include/linux/buffer_head.h | 2 ++ 2 files changed, 45 insertions(+) diff --git a/fs/buffer.c b/fs/buffer.c index 9e1e2add541e..df98f1966a71 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -593,6 +593,49 @@ int sync_mapping_buffers(struct address_space *mapping) } EXPORT_SYMBOL(sync_mapping_buffers); +/** + * generic_buffer_fsync - generic buffer fsync implementation + * for simple filesystems with no inode lock + * + * @file: file to synchronize + * @start: start offset in bytes + * @end: end offset in bytes (inclusive) + * @datasync: only synchronize essential metadata if true + * + * This is a generic implementation of the fsync method for simple + * filesystems which track all non-inode metadata in the buffers list + * hanging off the address_space structure. + */ +int generic_buffer_fsync(struct file *file, loff_t start, loff_t end, + bool datasync) +{ + struct inode *inode = file->f_mapping->host; + int err; + int ret; + + err = file_write_and_wait_range(file, start, end); + if (err) + return err; + + ret = sync_mapping_buffers(inode->i_mapping); + if (!(inode->i_state & I_DIRTY_ALL)) + goto out; + if (datasync && !(inode->i_state & I_DIRTY_DATASYNC)) + goto out; + + err = sync_inode_metadata(inode, 1); + if (ret == 0) + ret = err; + +out: + /* check and advance again to catch errors after syncing out buffers */ + err = file_check_and_advance_wb_err(file); + if (ret == 0) + ret = err; + return ret; +} +EXPORT_SYMBOL(generic_buffer_fsync); + /* * Called when we've recently written block `bblock', and it is known that * `bblock' was for a buffer_boundary() buffer. This means that the block at diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h index 8f14dca5fed7..3170d0792d52 100644 --- a/include/linux/buffer_head.h +++ b/include/linux/buffer_head.h @@ -211,6 +211,8 @@ int inode_has_buffers(struct inode *); void invalidate_inode_buffers(struct inode *); int remove_inode_buffers(struct inode *inode); int sync_mapping_buffers(struct address_space *mapping); +int generic_buffer_fsync(struct file *file, loff_t start, loff_t end, + bool datasync); void clean_bdev_aliases(struct block_device *bdev, sector_t block, sector_t len); static inline void clean_bdev_bh_alias(struct buffer_head *bh) -- 2.39.2