Re: Ext3 Question re: Journal and data

JD <jd1008@xxxxxxxxx> · Wed, 19 Apr 2017 20:03:16 -0600

On 04/19/2017 05:07 PM, Rick Stevens wrote:
On 04/19/2017 12:53 PM, JD wrote:
On Tue, Apr 18, 2017 at 9:13 PM, Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:

     On a journaled filesystem, only the data and the journal are committed
     by sync(). You have to umount or remount read-only to get all filesystem
     metadata to commit.

     After sync() it's expected you can crash, and the filesystem will
     be made consistent at next remount when the journal is replayed.

     If anything, such as GRUB or debug tools, tries to find files whose
     data and journal are committed but whose fs metadata is not yet
     committed, it will fail.
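
As a rough illustration of the sync() call discussed above, the following Python sketch writes a file, flushes that one file with fsync, and then asks for a system-wide sync. The helper name `write_durably` is hypothetical and not from this thread, and the sketch makes no claim about when the filesystem metadata outside the journal is committed.

```python
import os
import tempfile


def write_durably(path, payload):
    """Write payload to path, then ask the kernel to push it to storage."""
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()             # drain Python's userspace buffer into the kernel
        os.fsync(f.fileno())  # fsync(2): flush this one file
    os.sync()                 # sync(2): system-wide request (Unix only)


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "demo.txt")
    write_durably(demo, b"hello\n")
    print("wrote", demo)
```

Calling `os.fsync` first limits the wait to one file; `os.sync` then covers everything else that is dirty.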

     Another option is to freeze/unfreeze. That was originally an XFS
     feature, but is now a generic capability. What I'm not totally sure
     about offhand is whether the XFS user space tools are what to use
     for any filesystem; I'm pretty sure they are.

     Chris Murphy
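
A minimal sketch of the freeze/unfreeze idea, assuming the `fsfreeze` utility from util-linux is installed (the message above mentions the XFS user space tools; this swaps in the generic command instead). The helper names `fsfreeze_argv` and `frozen` are hypothetical, and actually running the freeze needs root.

```python
import subprocess
from contextlib import contextmanager


def fsfreeze_argv(mountpoint, freeze):
    """Build the fsfreeze command line for freezing or thawing a mount point."""
    return ["fsfreeze", "--freeze" if freeze else "--unfreeze", mountpoint]


@contextmanager
def frozen(mountpoint):
    """Keep a filesystem frozen for the duration of a with-block (needs root)."""
    subprocess.run(fsfreeze_argv(mountpoint, True), check=True)
    try:
        yield
    finally:
        # Always thaw, even if the body raises: writers block while frozen.
        subprocess.run(fsfreeze_argv(mountpoint, False), check=True)
```

Usage would look like `with frozen("/sdd1"): ...`, run as root against a non-root filesystem, since any process writing to a frozen filesystem blocks until it is thawed.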

Could you explain what the journal is holding: The User Data, the
Metadata, or Both?
If both, should a sync not clear the contents of the journal once it
completes (assuming no other I/O operation was done after the
sync)?
If not both (i.e. ONLY metadata), then replaying the journal only
preserves the metadata that describe the files (name, mode, ...etc).

Another question: What if the FS is mounted with the SYNC option in
fstab, such as:
UUID=71af3828-c4cd-2d26-b1f7-8337def05b8c   /sdd1   ext3 sync,rw     0 0

Would that cause immediate commit of the DATA, or would that cause the
commit of the METADATA?

What the journal holds depends on how you mount the filesystem. The
default mode is called "ordered" mode, and data is written directly to
the filesystem BEFORE metadata is written to the journal. The journal is
flushed into the directory inodes and such periodically by the kernel
(or a sync call).

In "journal" mode, ALL data (both raw data and metadata) is written to
the journal and committed to the filesystem periodically or by a sync
call. This is the slowest mode but guarantees consistency and also
requires the biggest journal (since you're also journalling data).

In "writeback" mode, there's no guarantee when the raw data is written
to the filesystem (it could be after the metadata is put into the
journal). "writeback" is the fastest, but can cause old data to appear
after a crash and journal playback because the journal would have the
new metadata but the new data hadn't been written before the crash.
IMHO, the slight improvement in I/O using writeback mode isn't worth the
risk, but every application and environment is different.
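
To see which of the three modes a mounted filesystem is using, one possible approach is to read the options column of /proc/mounts. The sketch below is only an illustration: the function name `journal_data_modes` is hypothetical, and it assumes that an entry with no explicit `data=` option is running in the default "ordered" mode described above.

```python
def journal_data_modes(mounts_text, default="ordered"):
    """Map mount point -> data= mode for ext3/ext4 lines in /proc/mounts-style text."""
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("ext3", "ext4"):
            continue
        mount_point, options = fields[1], fields[3].split(",")
        mode = default  # assumption: no explicit data= means the default mode
        for opt in options:
            if opt.startswith("data="):  # does not match data_err=...
                mode = opt[len("data="):]
        modes[mount_point] = mode
    return modes


sample = "/dev/sdd1 /sdd1 ext3 rw,sync,data=journal 0 0\n"
print(journal_data_modes(sample))
```

On a live system the text would come from reading `/proc/mounts`; the device name in the sample line is made up for the example.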

You can specify how often the journal flush occurs using the "commit"
option of the mount command. The default interval is five seconds. You can
make it shorter at the expense of more CPU time being sucked up by the
"jbd2" and "kdmflush" threads.

Since I/O performance is not my primary goal for the disks in question,
what do you think of the following mount options, listed at the URL below?
https://unix.stackexchange.com/questions/78861/what-mount-option-to-use-for-ext3-file-system-to-minimise-data-loss-or-corruptio

auto,exec,relatime,sync,barrier=1,commit=1,data=ordered,data_err=abort,noatime
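
Before using a long option string like this one, it may help to check it mechanically for repeated keys or overlapping settings. The sketch below is a hypothetical helper (the name `check_mount_options` and the `ATIME_POLICIES` set are assumptions, not part of mount itself); it only flags options for review and does not say which one the kernel would honour.

```python
# Options that each select an access-time update policy.
ATIME_POLICIES = {"noatime", "relatime", "strictatime"}


def check_mount_options(option_string):
    """Return warnings for conflicting repeated keys or multiple atime policies."""
    warnings = []
    seen = {}
    atime_policies = []
    for opt in option_string.split(","):
        key, _, value = opt.partition("=")
        if key in seen and seen[key] != value:
            warnings.append(f"{key} given twice: {seen[key]!r} and {value!r}")
        seen[key] = value
        if key in ATIME_POLICIES:
            atime_policies.append(key)
    if len(atime_policies) > 1:
        warnings.append("multiple atime policies: " + ", ".join(atime_policies))
    return warnings


opts = "auto,exec,relatime,sync,barrier=1,commit=1,data=ordered,data_err=abort,noatime"
for warning in check_mount_options(opts):
    print(warning)
```

For the string above, the check reports that both relatime and noatime are present, so one of the two is presumably redundant.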

Thanx
