Re: [PATCH 0/2] e2fsprogs: update mkfs defaults

Eric Sandeen <sandeen@xxxxxxxxxx> · Wed, 16 Feb 2011 16:37:14 -0600

On 2/16/11 4:12 PM, Andreas Dilger wrote:
> On 2011-02-16, at 11:12, Eric Sandeen wrote:
>> Anaconda (the Fedora/RHEL installer) had been "fixing up" extN
>> filesystems it created by setting the max mount count and check
>> interval to 0, as well as adding user_xattr to filesystem mount
>> options.
>> 
>> As part of their efforts to stop special-casing around upstream
>> defaults, they've removed these changes upstream.
>> 
>> However, I'd like to at least propose that these changes be made
>> default.
> 
> I'd really prefer instead that the "lvcheck" script be included into
> the distro, instead of changing mke2fs.  That achieves the same end
> result (periodic scrubbing of the filesystem to look for hidden
> errors), without introducing boot-time delays.  Given the size of
> disks today and the undetected bit-error-rate (somewhere around
> 1/10^15 bits or 12TB), I think it is important that there be
> automated scrubbing of the filesystem.
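
(A rough check of the figure quoted above, not taken from Andreas's mail: one
error per 10^15 bits works out to roughly 125 TB read, while 12.5 TB
corresponds to a rate of one per 10^14 bits. The quoted text does not say
which was intended, so the 10^14 line below is an assumption added only for
comparison. A minimal sketch in plain sh arithmetic:)

```sh
# Data read per expected undetected bit error, in decimal terabytes.
# Only the 10^15 rate appears in the quoted text; 10^14 is for comparison.
bits_15=1000000000000000    # 10^15 bits
bits_14=100000000000000     # 10^14 bits
tb=1000000000000            # bytes per decimal TB

tb_15=$(( bits_15 / 8 / tb ))             # whole TB
tenths_14=$(( bits_14 / 8 * 10 / tb ))    # tenths of a TB, to keep one decimal

echo "1 error per 10^15 bits: about ${tb_15} TB read"
echo "1 error per 10^14 bits: about $(( tenths_14 / 10 )).$(( tenths_14 % 10 )) TB read"
```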

lvcheck is well and good, but it is not a panacea; it is useful only
for snapshottable volumes... and only LVM for now?

> I think the best place to put that script would be in the lvm tools
> (since it is applicable to multiple filesystems), which I think Eric
> has the most leverage in getting accepted (I've been but I'd be OK
> including it with e2fsprogs if there is pushback on that.

device-mapper utilities ended up being a black hole... a combination
of "the scripts don't conform to our style" or somesuch, and no real
interest in adopting & fixing them to do so, IIRC.

>> The forced fsck often comes at unexpected and inopportune moments,
>> and even enterprise customers are often caught by surprise when
>> this happens.  Because a filesystem with an error condition will be
>> marked as requiring fsck anyway,
> 
> Any decent RAID array does background scrubbing for integrity
> verification, it doesn't just wait until there is an uncorrectable
> error detected in the block device.  If we can do something proactive
> to prevent this (i.e. lvcheck run by cron.weekly), it is worthwhile.

If the RAID array went offline for a couple of hours at random times to do
this, users would scream too.  This is essentially what the forced fsck does
today.

> I think customers are equally surprised when their server fails
> (remount-ro/panic) due to the kernel detecting an error that might
> have been on disk for weeks or months.

If I were an administrator, I would schedule fscks to avoid this, rather
than rely on a "kludgy hack of using the UUID to derive a random" time
for this to hit...
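
For reference, this is roughly what turning the enforced checks off by hand
looks like with tune2fs: -c sets the max mount count, -i the check interval,
and -o the default mount options. A minimal sketch only; /dev/sdXN is a
placeholder device name, and the commands are printed for review rather than
run:

```sh
# DEV is a placeholder (assumption); point it at a real extN device before use.
DEV=${DEV:-/dev/sdXN}

# -c 0 disables the mount-count check, -i 0 disables the time-based check.
disable_cmd="tune2fs -c 0 -i 0 $DEV"
# -o user_xattr adds user_xattr to the filesystem's default mount options.
xattr_cmd="tune2fs -o user_xattr $DEV"
# -l lists the superblock fields, including the current count and interval.
inspect_cmd="tune2fs -l $DEV"

# Dry run: print the commands instead of executing them.
echo "$disable_cmd"
echo "$xattr_cmd"
echo "$inspect_cmd"
```

An fsck can then be run from whatever maintenance window suits the machine,
with the filesystem unmounted.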

>> I submit that the time-based and mount-based checks are not
>> particularly useful, and that administrators can schedule fscks on
>> their own time, or tune2fs the enforced intervals if they so
>> choose.
> 
> I think you are projecting your own self-enlightenment onto users
> ;-).  As we see on this list, there are many users that don't even
> back up their critical data, so IMHO taking out "safe by default"
> options is a step in the wrong direction.

Perhaps I'll whip up a s_last_backup_time patch, and refuse to mount if
the user hasn't conformed to our enlightened notions of how often is often
enough, as well.  I could integrate it with dumpe2fs.  ;)

There is "safe by default" and then there is "assuming administrator
responsibilities," IMHO.  I just personally think it's too much.

> Attached is my latest version of the lvcheck script, and a default
> /etc/lvcheck.conf script.  It's been enhanced to include a usage
> message, command-line option parsing to override default parameters,
> and the ability to check snapshots of ext3/4 filesystems with an
> external journal.
> 

The script is great, but has limited application.

Well, anyway, I knew this wouldn't be super popular with everyone,
but figured I'd put it out there for discussion.

-Eric

> Cheers, Andreas
