  Hello,

On Wed 11-01-12 08:45:17, Surbhi Palande wrote:
> Isn't dirty data flushed out in "ordered" mode? as
> ext4_jbd2_file_inode() will get called for ordered writes. Thus this
> inode's data is flushed at journal commit time through
> journal_submit_data_buffers()?
  Well, not with delayed allocation and also not for example for xfs. So in
some special cases it might happen but we cannot really depend on it.

> However I do see that we will still have a dirty data problem for
> "writeback" and "journalled" mode?
  For journalled mode, data is treated as metadata so it's the mode where
the problems are smallest (although we'd still have problems because even
though kjournald writes the data, it clears only buffer dirty bits but not
page dirty bits). For writeback mode you are correct.

								Honza

> On Wed, Jan 11, 2012 at 4:10 AM, Jan Kara <jack@xxxxxxx> wrote:
> > On Tue 10-01-12 21:38:29, Surbhi Palande wrote:
> >> On second thoughts, I fail to see why there is still a race window
> >> after this patch.
> >>
> >> Here are the reasons why I fail to see how the data can be dirtied
> >> when all the operations involve a journal:
> >>
> >> ----------
> >> So here is the problem that we see
> >>     CPU1                          CPU2
> >>     Task1 (write operation)       Task2
> >> ---------------------------------------------------------------------
> >> t1  ext4_journal_start()
> >> t2  ext4_journal_start_sb()
> >> t3  vfs_check_frozen              sb->frozen=SB_FREEZE_WRITE
> >> t4  jbd2_journal_start()          /* henceforth all processes calling
> >>                                      vfs_check_frozen will wait */
> >   Note that we call vfs_check_frozen(sb, SB_FREEZE_TRANS) in
> > ext4_journal_start_sb(). Thus we start blocking only when s_frozen ==
> > SB_FREEZE_TRANS and we just ignore s_frozen == SB_FREEZE_WRITE.
> >
> >> Now, our aim is to stop Task1 from dirtying the page cache ie in
> >> starting this transaction. However if it is successful in starting
> >> this transaction, then we want to make sure that this transaction is
> >> flushed out.
> >> Correct?
> >   Not quite. Flushing a journal will flush dirty metadata but we will still
> > have dirty pages (dirty data is not part of any transaction). So in the
> > scenario I describe in
> > http://marc.info/?l=linux-fsdevel&m=132585911925796&w=2
> > all metadata changes will be flushed inside ->freeze_fs (at least for
> > journalling filesystems) but pages will be left dirty. Is it clearer now?
> >
> > But your comment makes me realize that the situation is simpler than I
> > thought by the fact that we only have to protect paths that create dirty
> > data as dirty metadata can be handled by flushing a journal. And there are
> > only a few places creating dirty data. So a reasonably clean solution
> > shouldn't be that complicated after all. I'll tweak my patch and try it in
> > a moment.
> >
> > 								Honza
> > --
> > Jan Kara <jack@xxxxxxx>
> > SUSE Labs, CR
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
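
The point about vfs_check_frozen(sb, SB_FREEZE_TRANS) in the quoted
discussion may be easier to see with a small userspace model. This is only
a sketch, not kernel code: it assumes the freeze levels are ordered
SB_UNFROZEN < SB_FREEZE_WRITE < SB_FREEZE_TRANS and that a caller sleeps
while s_frozen is at or above the level it passes. The helpers
would_block() and freeze_level_examples() are hypothetical names made up
for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Freeze levels named in the thread. The numeric values are an assumption
 * of this sketch; only their relative order matters here.
 */
enum freeze_level {
	SB_UNFROZEN     = 0,
	SB_FREEZE_WRITE = 1,
	SB_FREEZE_TRANS = 2,
};

/*
 * Hypothetical helper: would a caller of vfs_check_frozen(sb, level)
 * sleep, given the current s_frozen value? Models the assumed wait
 * condition "sleep until s_frozen < level".
 */
static bool would_block(enum freeze_level s_frozen, enum freeze_level level)
{
	return s_frozen >= level;
}

/* Walks through the cases discussed in the thread. */
static void freeze_level_examples(void)
{
	/* A journal start checks against SB_FREEZE_TRANS, so a filesystem
	 * that is only at SB_FREEZE_WRITE does not stop it. */
	assert(!would_block(SB_FREEZE_WRITE, SB_FREEZE_TRANS));

	/* Once s_frozen reaches SB_FREEZE_TRANS, the journal start waits. */
	assert(would_block(SB_FREEZE_TRANS, SB_FREEZE_TRANS));

	/* A path that checks against SB_FREEZE_WRITE waits at both levels. */
	assert(would_block(SB_FREEZE_WRITE, SB_FREEZE_WRITE));
	assert(would_block(SB_FREEZE_TRANS, SB_FREEZE_WRITE));

	/* Nothing waits on an unfrozen filesystem. */
	assert(!would_block(SB_UNFROZEN, SB_FREEZE_WRITE));
}
```

Under these assumptions, Task1 in the timeline gets past the check at t3
even though Task2 has already set SB_FREEZE_WRITE, which is why starting a
transaction alone does not close the window for dirtying pages.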