On Thu, Nov 04, 2021 at 05:29:59PM +0800, Zhongwei Cai wrote: > On 11/3/21 8:28 AM, Dave Chinner wrote: > > IOWs, we expect the IOMAP_F_DIRTY flag to be set on all types of > > iomap mapping calls if the inode is dirty, not just IOMAP_WRITE > > calls. > > Thanks for correction! > > > /* > > * Flags reported by the file system from iomap_begin: > > * > > * IOMAP_F_NEW indicates that the blocks have been newly allocated > > * and need zeroing for areas that no data is copied to. > > * > > * IOMAP_F_DIRTY indicates the inode has uncommitted metadata needed > > * to access written data and requires fdatasync to commit them to > > * persistent storage. This needs to take into account metadata > > * changes that *may* be made at IO completion, such as file size > > * updates from direct IO. > > * > > * IOMAP_F_SHARED indicates that the blocks are shared, and will > > * need to be unshared as part a write. > > * > > * IOMAP_F_MERGED indicates that the iomap contains the merge of > > * multiple block mappings. > > * > > * IOMAP_F_BUFFER_HEAD indicates that the file system requires the > > * use of buffer heads for this mapping. > > */ > > According to the comments in include/linux/iomap.h, it seems other > flags in the iomap indicates the iomap-related status, but the > IOMAP_F_DIRTY flag only indicates the status of the inode. So can > I_DIRTY_INODE or I_DIRTY_PAGES flags in the inode replace it? No. Some filesystems don't track inode metadata dirty status using the VFS inode; instead they track it more efficiently in internal inode and/or journal based structures. Hence the only way to get "inode needs journal flush for data stability" information to generic IO code is to have a specific per-IO mapping flag for it. > And for ext4 at least we can do > > /* > * Writes that span EOF might trigger an I/O size update on completion, > * so consider them to be dirty for the purpose of O_DSYNC, even if > * there is no other metadata changes being made or are pending. > */ > if (ext4_inode_datasync_dirty(inode) || > - offset + length > i_size_read(inode)) > + ((flags & IOMAP_WRITE) && offset + length > i_size_read(inode))) > iomap->flags |= IOMAP_F_DIRTY; > > , since only writes that span EOF might trigger an update. How > do you feel about it? ext4 can do what it likes here, but given that the problem that was being addressed was avoiding lock contention in ext4_inode_datasync_dirty(), this appears to be a completely unnecessary change as it doesn't address the problem being reported. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx