On Jan 25, 2008 21:02 -0500, Bryan Kadzban wrote:
> > I suspect that a nice email to the XFS and JFS folks would get them to add
> > some mechanism to force a filesystem check on the next reboot.
>
> Is the issue that those FSes don't have any such mechanism today, or is
> it just that I don't know how to do this on them?

I don't think they have any such mechanism (at least not one that I know
about), but I think they will find it useful to add.

> (Should fsck.xfs perhaps just exec xfs_check and pass it all the args?
> That's a whole separate discussion, probably.)

Right...

> Create a script to transparently run fsck in the background on any
> active LVM logical volumes, as long as the machine is on AC power, and
> that LV has been last checked more than a configurable number of days
> ago. Also create an optional configuration file to set various options
> in the script.
>
> Signed-Off-By: Bryan Kadzban <bryan@xxxxxxxxxxxxxxxxxxxxx>

> #!/bin/sh
> #
> # lvcheck
>
> # send $2 to syslog, with severity $1
> # severities are emerg/alert/crit/err/warning/notice/info/debug
> function log() {
>     local sev="$1"
>     local msg="$2"
>     local arg=
>
>     # log warning-or-higher messages to stderr as well
>     [ "$sev" == "emerg" || "$sev" == "alert" || "$sev" == "crit" || \
>       "$sev" == "err" || "$sev" == "warning" ] && arg=-s
>
>     logger $arg -p user."$sev" -- "$msg"
> }

This should use "-t lvcheck" so that it reports what program is generating
the message.

> # attempt to force a check of $1 on the next reboot
> function try_force_check() {
>     local dev="$1"
>     local fstype="$2"
>
>     case "$fstype" in
>         ext2|ext3)
>             tune2fs -C 16000 -T "19000101" "$dev"

I'm a tiny bit reluctant to overwrite the "last checked" date, since this
might be useful information for the administrator (i.e. it will tell the
interval wherein the corruption was detected). Setting the "mount count"
is enough to force a check, and the mount count itself can be reverse
engineered from "reboot" messages in the "last" log.
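For illustration, the two suggestions above (tagging syslog messages with
"-t lvcheck", and forcing a check through the mount count alone) might look
roughly like the sketch below. It is not the patch itself: it is written for
plain POSIX sh (so no "function" keyword or "local"), it replaces the
severity test with a case statement, the LOGTAG variable is a name made up
for this example, and it assumes mount-count-based checking is enabled on
the filesystem.

```sh
#!/bin/sh
# Hypothetical sketch, not the lvcheck patch under review.

# Tag attached to every syslog message (LOGTAG is an assumed name).
LOGTAG="lvcheck"

# send $2 to syslog with severity $1, tagged with $LOGTAG
log() {
    sev="$1"
    msg="$2"
    arg=

    # copy warning-or-higher messages to stderr as well
    case "$sev" in
    emerg|alert|crit|err|warning)
        arg=-s
        ;;
    esac

    logger $arg -t "$LOGTAG" -p "user.$sev" -- "$msg"
}

# attempt to force a check of $1 on the next reboot by raising only the
# mount count; the "last checked" time is left untouched
try_force_check() {
    dev="$1"
    fstype="$2"

    case "$fstype" in
    ext2|ext3|ext4)
        tune2fs -C 16000 "$dev"
        ;;
    *)
        log "warning" "Don't know how to force a check of $fstype on $dev"
        return 1
        ;;
    esac
}
```

Calling try_force_check /dev/vg0/home ext3 (a made-up device name) would
then run tune2fs with only the -C option, so "tune2fs -l" would still show
the real time of the last check.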
> # attempt to set the last-check time on $1 to now, and the mount count to 0.
> function try_delay_checks() {
>     local dev="$1"
>     local fstype="$2"
>
>     case "$fstype" in
>         ext2|ext3)

It is a lot clearer if the "cases" (ext2|ext3|ext4) are aligned with the
"case" statement, like below, since that provides a better separation:

    case "$fstype" in
    ext2|ext3|ext4)
        tune2fs -C 0 -T now "$dev"
        ;;

>         reiserfs)
>             # do nothing?
>             ;;

I thought you were going to remove the empty reiserfs cases?

> # check the FS on $1 passively, saving output to $3.
> function perform_check() {
>     local dev="$1"
>     local fstype="$2"
>     local tmpfile="$3"
>
>     case "$fstype" in
>         ext2|ext3)

Ditto on indenting the cases.

> # do everything needed to check and reset dates and counters on /dev/$1/$2.
> function check_fs() {
>     local vg="$1"
>     local lv="$2"
>     local fstype="$3"
>     local snapsize="$4"
>
>     local tmpfile=`mktemp -t e2fsck.log.XXXXXXXXXX`

This shouldn't be "e2fsck.log". Maybe "lvcheck.log.XXXXXXXXX"?

>     local errlog="/var/log/lvcheck-${vg}@${lv}-`date +'%Y%m%d'`"
>     local snaplvbase="${lv}-lvcheck-temp"
>     local snaplv="${snaplvbase}-`date +'%Y%m%d'`"
>
>     # clean up any left-over snapshot LVs
>     for lvtemp in /dev/${vg}/${snaplvbase}* ; do
>         if [ -e "$lvtemp" ] ; then
>             # Assume the script won't run more than one instance at a time?
>             lvremove -f "${lvtemp##/dev}"

This should check the error return and bail out of the script if there is
an error.

> # parse up lvscan output
> lvscan 2>&1 | grep ACTIVE | awk '{print $2;}' | \
> while read DEV ; do
>
>     if [ "$SNAPSIZE" -gt "$SPACE" ] ; then
>         log "err" "Can't take a snapshot of $DEV: not enough free space in the VG."
>         continue

Well, the 1/500 rule is only a guideline. For example, I have a huge
filesystem for TV shows, but it doesn't change that often, so it would
make more sense to just reduce $SNAPSIZE to $SPACE (assuming some minimum
amount of free space is available).
Make a default, that is settable in the .conf file:

MINFREE=0       # megabytes to leave free in each volume group
MINSNAP=256     # megabytes for minimum snapshot size.

    # make snapshot large enough to handle e.g. journal and other updates
    [ $SNAPSIZE -lt $MINSNAP ] && SNAPSIZE=$MINSNAP

    # limit snapshot to available space
    [ $SNAPSIZE -gt $((SPACE - MINFREE)) ] && SNAPSIZE=$((SPACE - MINFREE))

    # if we don't have enough space, skip this check
    if [ $SNAPSIZE -lt $MINSNAP ]; then
        log "warning" "Check of $LV can't get ${SNAPSIZE}MB, skipping"
        continue
    fi

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
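
As an editorial aside, the snapshot sizing rule with MINFREE and MINSNAP can
also be tried in isolation. The sketch below wraps the same three steps in a
function; the name snapshot_size and its two positional arguments are
assumptions made for this example and do not appear in the script under
review.

```sh
#!/bin/sh
# Hypothetical helper illustrating the MINFREE/MINSNAP rule.

MINFREE=${MINFREE:-0}      # megabytes to leave free in each volume group
MINSNAP=${MINSNAP:-256}    # megabytes for minimum snapshot size

# $1 = wanted snapshot size (MB), $2 = free space in the VG (MB)
# prints the size to use, or returns 1 if the minimum does not fit
snapshot_size() {
    size="$1"
    space="$2"
    avail=$((space - MINFREE))

    # make snapshot large enough to handle e.g. journal and other updates
    [ "$size" -lt "$MINSNAP" ] && size=$MINSNAP

    # limit snapshot to available space
    [ "$size" -gt "$avail" ] && size=$avail

    # if there is not enough space, tell the caller to skip this check
    [ "$size" -lt "$MINSNAP" ] && return 1

    echo "$size"
}

# Possible use inside the lvscan loop:
#   SNAPSIZE=$(snapshot_size "$SNAPSIZE" "$SPACE") || continue
```

With the defaults, a wanted size of 100MB is raised to 256MB, a wanted size
of 5000MB in a VG with 1000MB free is cut to 1000MB, and a VG with only
100MB free is skipped.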