Re: ext3 data=journal?

On Tue, Sep 29, 2009 at 6:17 AM, Peter Teoh <htmldeveloper@xxxxxxxxx> wrote:
> On Tue, Sep 29, 2009 at 2:51 AM, Venkatesh Srinivas <me@xxxxxxxxxxx> wrote:
>> Hi,
>>
>> Thanks for the reply!
>>
>> I was looking for more information on data=journal, not data=ordered
>> or data=writeback; I didn't see comments on it on the ext wiki page,
>> or two of the three links. Do you know anyplace I could look for more
>> on that?
>>
>> Thanks,
>> -- vs
>>
>
> I guess the comment in fs/ext3/fsync.c:ext3_sync_file() summarizes it best:
>
>         /*
>          * data=writeback:
>          *  The caller's filemap_fdatawrite()/wait will sync the data.
>          *  sync_inode() will sync the metadata
>          *
>          * data=ordered:
>          *  The caller's filemap_fdatawrite() will write the data and
>          *  sync_inode() will write the inode if it is dirty.  Then the caller's
>          *  filemap_fdatawait() will wait on the pages.
>          *
>          * data=journal:
>          *  filemap_fdatawrite won't do anything (the buffers are clean).
>          *  ext3_force_commit will write the file data into the journal and
>          *  will wait on that.
>          *  filemap_fdatawait() will encounter a ton of newly-dirtied pages
>          *  (they were dirtied by commit).  But that's OK - the blocks are
>          *  safe in-journal, which is all fsync() needs to ensure.
>          */
>         if (ext3_should_journal_data(inode)) {
>                 ret = ext3_force_commit(inode->i_sb);
>                 goto out;
>         }
>
>
> As shown above, ext3_force_commit() is called, which forces a journal
> commit and so writes the file data into the journal. And
> ext3_should_journal_data() returns true in data=journal mode.
>
> That is indeed extra overhead: the data is written twice, first into
> the journal and later to its final location on disk when the journal
> is checkpointed. (Note that the "goto out" skips the rest of the
> function, so sync_inode() is not called on this path.)
>
> Make sense?
>
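
To make the branch in the quoted ext3_sync_file() excerpt concrete, here is a rough userspace sketch of the decision it makes. This is not kernel code: the enums and the names should_journal_data() and fsync_action_for() are made up for illustration, and the predicate is simplified to look only at the mount mode, whereas the real ext3_should_journal_data() takes an inode.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the branch in ext3_sync_file(). */
enum data_mode { DATA_WRITEBACK, DATA_ORDERED, DATA_JOURNAL };
enum fsync_action { FORCE_JOURNAL_COMMIT, SYNC_INODE, NOTHING_TO_DO };

/* Simplified stand-in for ext3_should_journal_data(): mount mode only. */
bool should_journal_data(enum data_mode mode)
{
	return mode == DATA_JOURNAL;
}

/*
 * What fsync() still has to do after the caller's filemap_fdatawrite():
 *  - data=journal: force a journal commit and stop (the "goto out" path)
 *  - otherwise:    write the inode only if it is dirty
 */
enum fsync_action fsync_action_for(enum data_mode mode, bool inode_dirty)
{
	if (should_journal_data(mode))
		return FORCE_JOURNAL_COMMIT;
	return inode_dirty ? SYNC_INODE : NOTHING_TO_DO;
}
```

Under these assumptions, fsync_action_for(DATA_JOURNAL, ...) always yields FORCE_JOURNAL_COMMIT regardless of the inode state, while the other two modes only touch the inode when it is dirty.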

I should also add that many other parts of the ext3 code use the
ext3_should_journal_data() check in the same way, trading performance
for reliability.

e.g., ext3_forget() in inode.c:


        if (test_opt(inode->i_sb, DATA_FLAGS) == EXT3_MOUNT_JOURNAL_DATA ||
            (!is_metadata && !ext3_should_journal_data(inode))) {
                if (bh) {
                        BUFFER_TRACE(bh, "call journal_forget");
                        return ext3_journal_forget(handle, bh);
                }
                return 0;
        }

        /*
         * data!=journal && (is_metadata || should_journal_data(inode))
         */
        BUFFER_TRACE(bh, "call ext3_journal_revoke");
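
The condition in the inode.c excerpt above is easier to read as a small truth function. The sketch below is a userspace model only: forget_or_revoke() and the enum are hypothetical names, the three booleans stand in for the test_opt() check, is_metadata, and ext3_should_journal_data(inode), and the "no buffer head, just return 0" case is left out.

```c
#include <stdbool.h>

/* Hypothetical model of the forget-vs-revoke choice; not kernel code. */
enum forget_action { JOURNAL_FORGET, JOURNAL_REVOKE };

enum forget_action forget_or_revoke(bool mount_data_journal,
				    bool is_metadata,
				    bool inode_journals_data)
{
	/* Same shape as the kernel condition: forget when the mount is
	 * data=journal, or when the block is plain unjournaled data. */
	if (mount_data_journal || (!is_metadata && !inode_journals_data))
		return JOURNAL_FORGET;

	/* data!=journal && (is_metadata || should_journal_data(inode)) */
	return JOURNAL_REVOKE;
}
```

So on a non-data=journal mount, a block that is metadata (or belongs to an inode whose data is journaled) takes the revoke path, and everything else takes the forget path.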

and then super.c ...... etc etc.

-- 
Regards,
Peter Teoh
