Bryan Kadzban wrote:
> Maybe just using "all available space in the VG" is a better idea
> anyway.

That's what I did here, at least for now. There's a place in the script
where the available space in the VG can be checked, but I'm not sure how
to get that value out of lvs (or vgs) in a format that's easy to parse,
so I skipped it for now as well. (I could only get values like "250m",
which I assume means 250 megs, but how is the script supposed to handle
the suffixes?)

> I suppose it would work to leave the check interval set in the
> superblock, and avoid using fsck.* -f; that way each fsck would be
> able to determine if it should do a full check or not.

Turns out that will *not* work. fsck.* without -f will succeed even if
it doesn't check anything (or at least, e2fsck will). So every day, the
last-check day would get bumped, even though nothing actually got
checked. That defeats the purpose here.

I've split four operations out into their own functions: checking the
FS, setting the last-check time to now, setting the last-check time to
some time in the ancient past (if the check fails -- this forces the
next-reboot check to be a full one), and getting the last-check time.
Each one takes a device name and a filesystem type argument, and splits
execution paths depending on the FS type. Adding support for a new FS
(e.g. better support for reiser) should be as easy as modifying the
case statements in those four functions.

> It would probably be possible; I'll see what I can find out later
> today. I have a QEMU VM set up whose root FS is on LVM...

Well, it was set up. I seem to have somehow nuked the md-raid layer, so
the LVM stuff isn't available anymore. (A qemu bug was involved: the VM
was running and suddenly died; when I started it back up, the md-raid
code began a "background rebuild" and ended up locking up qemu. I'll
probably have to start over with a new set of image files.)

> We'd still need to find the FS type, although I believe udev provides
> some programs that may be helpful (if we want to rely on them being
> installed). volume_id, in particular, should provide that info.

I'm running /lib/udev/vol_id here to get the FS type. I'm not sure
whether that's the best solution, but it does work (at least for now).

Anyway, I've also renamed the script from e2check to lvcheck (since it
works for more than ext* now). Same changelog entry as before, though.

Signed-Off-By: Bryan Kadzban <bryan@xxxxxxxxxxxxxxxxxxxxx>
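On the open question of suffixes such as "250m": LVM's reporting commands accept --units and --nosuffix, so one option is to ask lvs for a fixed unit and a bare number. Where only a suffixed value is available, a small converter could be used instead. The sketch below is not part of the posted script. The helper name to_mib is hypothetical, and it assumes that lowercase suffixes are powers of 1024 and that a bare number is already in mebibytes.

```sh
#!/bin/sh
# Hypothetical helper (not in lvcheck): convert an LVM-style size such as
# "250m" or "1.50g" to whole mebibytes.  Assumes lowercase suffixes are
# powers of 1024.  Prints nothing and returns 1 on an unrecognised suffix.
to_mib() {
    echo "$1" | awk '
        {
            size = $1
            unit = substr(size, length(size), 1)
            if (unit ~ /[0-9]/) {
                unit = "m"          # bare number: treated as MiB (assumption)
            } else {
                size = substr(size, 1, length(size) - 1)
            }
            if      (unit == "k") factor = 1 / 1024
            else if (unit == "m") factor = 1
            else if (unit == "g") factor = 1024
            else if (unit == "t") factor = 1024 * 1024
            else exit 1
            printf "%d\n", size * factor
        }'
}

# Alternatively, ask lvs for plain mebibytes directly (needs a real LV):
#   lvs --noheadings --nosuffix --units m -o vg_free "$DEV"
```

With a value in plain mebibytes, the script could compare it against a minimum before calling lvcreate. What that minimum should be is left open here.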
#!/bin/bash
#
# lvcheck
# Released under the GNU General Public License, either version 2 or
# (at your option) any later version.
#
# Note: this needs bash rather than plain sh, since it uses "function",
# "local", and "trap ... RETURN".

# Overview:
#
# Run this from cron each night.  If the machine is on AC power, it
# will run the checks; otherwise they will all be skipped.  (If the
# script can't tell whether the machine is on AC power, a setting in
# the configuration file (/etc/lvcheck.conf) decides whether it will
# continue with the checks, or abort.)
#
# The script will then decide which logical volumes are active, and
# can therefore be checked via an LVM snapshot.  Each of these LVs
# will be queried to find its last-check day, and if that was more
# than $INTERVAL days ago (where INTERVAL is set in the configuration
# file as well), then the script will take an LVM snapshot of that
# LV and run fsck on the snapshot.  The snapshot will be set to use
# all the remaining space in its volume group.  After fsck finishes,
# the snapshot is destroyed.  (Snapshots are checked serially.)
#
# Any LV that passes fsck will have its last-check time updated (in
# the real superblock, not the snapshot's superblock); any LV whose
# fsck fails will send an email notification to a configurable user
# ($EMAIL).  This $EMAIL setting is optional, but its use is highly
# recommended, since if any LV fails, it will need to be checked
# manually, offline.

function on_ac_power() {
    local any_known=no
    local psu ac type online

    # try sysfs power class first
    if [ -d /sys/class/power_supply ] ; then
        for psu in /sys/class/power_supply/* ; do
            if [ -r "${psu}/type" ] ; then
                type="`cat "${psu}/type"`"
                # ignore batteries
                [ "${type}" = "Battery" ] && continue

                online="`cat "${psu}/online"`"

                [ "${online}" = 1 ] && return 0
                [ "${online}" = 0 ] && any_known=yes
            fi
        done

        [ "${any_known}" = "yes" ] && return 1
    fi

    # else fall back to AC adapters in /proc
    if [ -d /proc/acpi/ac_adapter ] ; then
        for ac in /proc/acpi/ac_adapter/* ; do
            if [ -r "${ac}/state" ] ; then
                grep -q on-line "${ac}/state" && return 0
                grep -q off-line "${ac}/state" && any_known=yes
            elif [ -r "${ac}/status" ] ; then
                grep -q on-line "${ac}/status" && return 0
                grep -q off-line "${ac}/status" && any_known=yes
            fi
        done

        [ "${any_known}" = "yes" ] && return 1
    fi

    # neither source gave an answer; let the config file decide
    if [ "$AC_UNKNOWN" = "CONTINUE" ] ; then
        return 0    # assume on AC power
    elif [ "$AC_UNKNOWN" = "ABORT" ] ; then
        return 1    # assume on battery
    else
        echo "Invalid value for AC_UNKNOWN in the config file" >&2
        exit 1
    fi
}

# attempt to force a check of $1 on the next reboot
function try_force_check() {
    local dev="$1"
    local fstype="$2"

    case "$fstype" in
        ext2|ext3)
            tune2fs -C 16000 -T "19000101" "$dev"
            ;;
        reiserfs)
            # ???
            echo "Don't know how to set the last-check time on reiserfs..." >&2
            ;;
        *)
            echo "Don't know how to set the last-check time on $fstype..." >&2
            ;;
    esac
}

# attempt to set the last-check time on $1 to now, and the mount count to 0.
function try_delay_checks() {
    local dev="$1"
    local fstype="$2"

    case "$fstype" in
        ext2|ext3)
            tune2fs -C 0 -T now "$dev"
            ;;
        reiserfs)
            # do nothing?  apparently so...
            ;;
        *)
            echo "Don't know how to reset the last-check time on $fstype..." >&2
            ;;
    esac
}

# print the date that $1 was last checked, in a format that date(1) will
# accept, or "Unknown" if we don't know how to find that date.
function try_get_check_date() {
    local dev="$1"
    local fstype="$2"

    case "$fstype" in
        ext2|ext3)
            dumpe2fs -h "$dev" 2>/dev/null | grep 'Last checked:' | \
                sed -e 's/Last checked:[[:space:]]*//'
            ;;
        *)
            # TODO: add support for various FSes here
            echo "Unknown"
            ;;
    esac
}

# check the FS on $1 passively, appending output to the file named by $3.
function perform_check() {
    local dev="$1"
    local fstype="$2"
    local tmpfile="$3"

    case "$fstype" in
        ext2|ext3)
            # the only point in fixing anything is just to see if fsck can.
            nice logsave -as "${tmpfile}" fsck.${fstype} -p -C 0 "$dev" &&
                nice logsave -as "${tmpfile}" fsck.${fstype} -fy -C 0 "$dev"
            return $?
            ;;
        reiserfs)
            echo Yes | nice logsave -as "${tmpfile}" fsck.reiserfs --check "$dev"
            # apparently can't fail?  let's hope not...
            return 0
            ;;
        *)
            echo "Don't know how to check $fstype filesystems passively..." >&2
            ;;
    esac
}

# do everything needed to check and reset dates and counters on /dev/$1/$2.
function check_fs() {
    local vg="$1"
    local lv="$2"
    local fstype="$3"
    local tmpfile=`mktemp -t e2fsck.log.XXXXXXXXXX`

    # remove the log file when this function returns
    trap "rm -f '$tmpfile' ; trap - RETURN" RETURN

    # only one check happens at a time; using all the free space in the VG
    # at least won't prevent other checks from happening...
    lvcreate -s -l "100%FREE" -n "${lv}-snap" "${vg}/${lv}"

    if perform_check "/dev/${vg}/${lv}-snap" "${fstype}" "${tmpfile}" ; then
        echo 'Background scrubbing succeeded!'
        try_delay_checks "/dev/${vg}/${lv}" "$fstype"
    else
        echo 'Background scrubbing failed! Reboot to fsck soon!'
        try_force_check "/dev/${vg}/${lv}" "$fstype"

        if test -n "$EMAIL"; then
            mail -s "Fsck of /dev/${vg}/${lv} failed!" "$EMAIL" < "$tmpfile"
        fi
    fi

    lvremove -f "${vg}/${lv}-snap"
}

set -e

# pull in configuration -- don't bother with a parser, just use the shell's
. /etc/lvcheck.conf

# check whether the machine is on AC power: if not, skip fsck
on_ac_power || exit 0

# parse up lvscan output
lvscan 2>&1 | grep ACTIVE | awk '{print $2;}' | \
while read DEV ; do
    # remove the single quotes around the device name
    DEV="`echo "$DEV" | tr -d \'`"

    # get the FS type
    FSTYPE="`/lib/udev/vol_id -t "$DEV"`"

    # get the last-check time
    check_date=`try_get_check_date "$DEV" "$FSTYPE"`

    # if the date is unknown, run fsck every day.  sigh.

    if [ "$check_date" != "Unknown" ] ; then
        # add $INTERVAL days, and throw away the time portion
        check_day=`date --date="$check_date $INTERVAL days" +'%Y%m%d'`

        # get today's date, and skip the check if it's not yet due
        today=`date +'%Y%m%d'`
        [ "$check_day" -gt "$today" ] && continue
    fi

    # get the free space (not used yet, see below)
    SPACE="`lvs --noheadings -o vg_free "$DEV"`"

    # ensure that some free space exists, at least
    # ??? -- can lvs print vg_free in plain numbers, or do I have to
    # figure out what a suffix of "m" means?  skip the check for now.

    # get the volume group and logical volume names; lvs pads its
    # output with spaces, so strip them
    VG="`lvs --noheadings -o vg_name "$DEV" | tr -d ' '`"
    LV="`lvs --noheadings -o lv_name "$DEV" | tr -d ' '`"

    # check it
    check_fs "$VG" "$LV" "$FSTYPE"
done
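
The "is a check due yet?" test in the main loop can be isolated into a function, which makes it easier to try by hand. The following is only a sketch, not part of the posted script: the name check_is_due and its third argument are hypothetical, and it relies on GNU date's --date option just as the script does.

```sh
#!/bin/sh
# Hypothetical helper (not in lvcheck): succeed if a check is due.
#   $1 = last-check date, in any format GNU date accepts
#   $2 = interval in days
#   $3 = today's date as YYYYMMDD (defaults to the current date)
check_is_due() {
    last="$1"
    interval="$2"
    today="${3:-$(date +%Y%m%d)}"

    # first day on which the next check should run
    due_day=$(date --date="$last $interval days" +%Y%m%d) || return 2

    # due once today has reached (or passed) that day
    [ "$today" -ge "$due_day" ]
}
```

For example, `check_is_due "2008-01-01" 30 20080131` succeeds, while the same call with 20080130 as the third argument does not.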
#!/bin/sh

# lvcheck configuration variables:
#
# EMAIL
#   Address to send failure notifications to.  If empty,
#   failure notifications will not be sent.
#
# INTERVAL
#   Days to wait between checks.  All LVs use the same
#   INTERVAL, but the "days since last check" value can
#   be different per LV, since that value is stored in
#   the ext2/ext3 superblock.
#
# AC_UNKNOWN
#   Whether to run the fsck checks if the script can't
#   determine whether the machine is on AC power.  Laptop
#   users will want to set this to ABORT, while server and
#   desktop users will probably want to set this to
#   CONTINUE.  Those are the only two valid values.

EMAIL='root'
INTERVAL=30
AC_UNKNOWN="ABORT"
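
Because the configuration file is simply sourced, a typo in it only shows up later (or, for INTERVAL, as a date(1) error). A sanity check right after sourcing could catch that earlier. This is a sketch under that assumption. The function validate_config is hypothetical and does not exist in the posted script.

```sh
#!/bin/sh
# Hypothetical helper (not in lvcheck): sanity-check the settings after
# /etc/lvcheck.conf has been sourced.  Prints a message to stderr and
# returns 1 on the first problem found.
validate_config() {
    case "$INTERVAL" in
        ''|*[!0-9]*)
            echo "INTERVAL must be a whole number of days" >&2
            return 1
            ;;
    esac

    case "$AC_UNKNOWN" in
        CONTINUE|ABORT)
            ;;
        *)
            echo "AC_UNKNOWN must be CONTINUE or ABORT" >&2
            return 1
            ;;
    esac

    # EMAIL may legitimately be empty, so it is not checked here
    return 0
}
```

It would be called as `validate_config || exit 1` immediately after the `. /etc/lvcheck.conf` line.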
_______________________________________________
Ext3-users mailing list
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users