On Thu, Nov 18, 2010 at 9:23 AM, Christoph Hellwig <hch@xxxxxx> wrote: > Avoid doing unessecary work in fsync. Do nothing unless the inode > was marked dirty, and only write the various metadata inodes out if > they contain any dirty state from this inode. This is archived by > adding three new dirty bits to the hfsplus-specific inode which are > set in the correct places. Hmm, do you have some i_state coherency problems here? > > int hfsplus_file_fsync(struct file *file, int datasync) > { > struct inode *inode = file->f_mapping->host; > + struct hfsplus_inode_info *hip = HFSPLUS_I(inode); > struct hfsplus_sb_info *sbi = HFSPLUS_SB(inode->i_sb); > - int error, error2; > + int error = 0, error2; > + > + if (!(inode->i_state & I_DIRTY)) > + return 0; Does this guy ever get cleared in the synchronous path? Or only in writeback? I couldn't see where it gets cleared from direct writeout, I wonder if we should import the pagecache tagged dirty check from writeback to keep this in sync better? (anyway, separate issue) Anyway, what if writeback noticed pagecache was cleaned at this point, and then clears I_DIRTY bits from inode before you test it above? Won't that leave your metadata not on disk? > > /* > * Sync inode metadata into the catalog and extent trees. > @@ -318,13 +322,21 @@ int hfsplus_file_fsync(struct file *file > /* > * And explicitly write out the btrees. > */ > - error = filemap_write_and_wait(sbi->cat_tree->inode->i_mapping); > - error2 = filemap_write_and_wait(sbi->ext_tree->inode->i_mapping); > - if (!error) > - error = error2; > - error2 = filemap_write_and_wait(sbi->alloc_file->i_mapping); > - if (!error) > - error = error2; > + if (test_and_clear_bit(HFSPLUS_I_CAT_DIRTY, &hip->flags)) > + error = filemap_write_and_wait(sbi->cat_tree->inode->i_mapping); > + > + if (test_and_clear_bit(HFSPLUS_I_EXT_DIRTY, &hip->flags)) { > + error2 = filemap_write_and_wait(sbi->ext_tree->inode->i_mapping); > + if (!error) > + error = error2; > + } > + > + if (test_and_clear_bit(HFSPLUS_I_ALLOC_DIRTY, &hip->flags)) { > + error2 = filemap_write_and_wait(sbi->alloc_file->i_mapping); > + if (!error) > + error = error2; > + } > + > return error; > } > > @@ -590,6 +602,8 @@ int hfsplus_cat_write_inode(struct inode > hfs_bnode_write(fd.bnode, &entry, fd.entryoffset, > sizeof(struct hfsplus_cat_file)); > } > + > + set_bit(HFSPLUS_I_CAT_DIRTY, &HFSPLUS_I(inode)->flags); > out: > hfs_find_exit(&fd); > return 0; So your I_CAT_DIRTY will be set, but not I_DIRTY, AFAIKS. > @@ -530,4 +532,5 @@ out: > hip->fs_blocks = (inode->i_size + sb->s_blocksize - 1) >> sb->s_blocksize_bits; > inode_set_bytes(inode, hip->fs_blocks << sb->s_blocksize_bits); > mark_inode_dirty(inode); > + set_bit(HFSPLUS_I_ALLOC_DIRTY, &HFSPLUS_I(inode)->flags); If you mark these sub-parts of the inode dirty after marking the inode dirty, and no locking, then AFAIKS you get no guarantee they will be noticed at the point that I_DIRTY bits are cleared. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html