On Thu, Sep 10, 2009 at 08:46:41AM -0700, Curt Wohlgemuth wrote:
> On Wed, Sep 9, 2009 at 11:54 PM, Aneesh Kumar K.V<aneesh.kumar@xxxxxxxxxxxxxxxxxx> wrote:
> > On Wed, Sep 09, 2009 at 09:35:40PM -0400, Theodore Tso wrote:
> >> On Wed, Sep 09, 2009 at 05:07:28PM -0700, Curt Wohlgemuth wrote:
> >> >
> >> > First, ext4_journal_forget() is called from ext4_forget() only when
> >> > we're journalling; without a journal, ext4_journal_forget() is only
> >> > called for various non-extent paths. ext4_forget() could be changed,
> >> > of course...
> >>
> >> Ext4_forget() calls either ext4_journal_forget() or
> >> ext4_journal_revoke(). So we need to fix up both functions.
> >>
> >> - Ted
> >>
> >> commit 4afdf0958f6f7b878e6d85cb4e0c0c12a0bd74e2
> >> Author: Theodore Ts'o <tytso@xxxxxxx>
> >> Date: Wed Sep 9 21:32:41 2009 -0400
> >>
> >> ext4: Use bforget() in no journal mode for ext4_journal_{forget,revoke}()
> >>
> >> When ext4 is using a journal, a metadata block which is deallocated
> >> must be passed into the journal layer so it can be dropped from the
> >> current transaction and/or revoked. This is done by calling the
> >> functions ext4_journal_forget() and ext4_journal_revoke(), which call
> >> jbd2_journal_forget(), and jbd2_journal_revoke(), respectively.
> >>
> >> Since the jbd2_journal_forget() and jbd2_journal_revoke() call
> >> bforget(), if ext4 is not using a journal, ext4_journal_forget() and
> >> ext4_journal_revoke() must call bforget() to avoid a dirty metadata
> >> block overwriting a block after it has been reallocated and reused for
> >> another inode's data block.
> >>
> >
> > I am sure i am missing something. But where are we adding the buffer_head
> > to the mapping->private_list ?. For ext2 when we allocate meta data blocks
> > we do mark_buffer_dirty_inode which add the buffer_head to the inodes
> > private_list. Shouldn't we do something similar with Ext4 without journal ?
>
> As Ted explained to me, all buffer heads pointing to metadata blocks
> are attached to the block device inode. So pdflush writes of these
> pages go through the block device address space ops. Explicit
> sync_dirty_buffer() calls for the metadata buffer heads still work, of
> course.

But how would it work for fsync? I would expect that in no-journal mode
ext4_sync_file() should be doing simple_fsync(). That should force out
the metadata buffer_heads via sync_mapping_buffers(). And if we reuse
these metadata buffers, we should drop them from the
inode->mapping->private_list using bforget(). But I don't see any of
the above in the code.

-aneesh