Re: [PATCH 3/7] e2scrub: add service (cron, systemd) support

"Darrick J. Wong" <darrick.wong@xxxxxxxxxx> · Fri, 2 Mar 2018 12:16:47 -0800

On Thu, Mar 01, 2018 at 04:04:46PM -0700, Andreas Dilger wrote:
> On Mar 1, 2018, at 11:23 AM, Darrick J. Wong <darrick.wong@xxxxxxxxxx> wrote:
> > 
> > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > 
> > Add the ability to run the e2scrub utilities as a periodically scheduled
> > system service.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > ---
> > diff --git a/configure.ac b/configure.ac
> > index 6ffff5d..34935b7 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -1391,7 +1391,8 @@ else
> > 
> > dnl
> > +dnl Where do cron jobs go?
> > +dnl
> > +AC_ARG_WITH([crond_dir],
> > +  [AS_HELP_STRING([--with-crond-dir@<:@=DIR@:>@],
> > +	[Install system crontabs into DIR.])],
> > +  [],
> > +  [with_crond_dir=yes])
> > +AS_IF([test "x${with_crond_dir}" != "xno"],
> > +  [
> > +	AS_IF([test "x${with_crond_dir}" = "xyes"],
> > +	  [
> > +		AS_IF([test -d "/etc/cron.d"],
> > +		  [with_crond_dir="/etc/cron.d"])
> > +	  ])
> > +	AC_MSG_CHECKING([for system crontab dir])
> > +	crond_dir="${with_crond_dir}"
> > +	AS_IF([test -n "${crond_dir}"],
> > +	  [
> > +		AC_MSG_RESULT(${crond_dir})
> > +		have_crond="yes"
> > +	  ],
> > +	  [
> > +		AC_MSG_RESULT(no)
> > +		have_crond="no"
> > +	  ])
> > +  ],
> > +  [
> > +	have_crond="disabled"
> > +  ])
> > +AC_SUBST(have_crond)
> > +AC_SUBST(crond_dir)
> > 
> > 
> > diff --git a/scrub/e2scrub_all.cron.in b/scrub/e2scrub_all.cron.in
> > new file mode 100644
> > index 0000000..0c133bd
> > --- /dev/null
> > +++ b/scrub/e2scrub_all.cron.in
> > @@ -0,0 +1,2 @@
> > +30 3 * * 0 root test -e /run/systemd/system || @root_sbindir@/e2scrub_all
> > +10 3 * * * root test -e /run/systemd/system || @libdir@/e2scrub_reap
> > diff --git a/scrub/e2scrub_all.in b/scrub/e2scrub_all.in
> > index ff9eb8f..5da9a42 100644
> > --- a/scrub/e2scrub_all.in
> > +++ b/scrub/e2scrub_all.in
> > @@ -24,6 +24,22 @@ types="ext2,ext3,ext4"
> > exitcode() {
> > 	ret="$1"
> > 
> > +	# If we're being run as a service, the return code must fit the LSB
> > +	# init script action error guidelines, which is to say that we
> > +	# compress all errors to 1 ("generic or unspecified error", LSB 5.0
> > +	# section 22.2) and hope the admin will scan the log for what
> > +	# actually happened.
> > +
> > +	# We have to sleep 2 seconds here because journald uses the pid to
> > +	# connect our log messages to the systemd service.  This is critical
> > +	# for capturing all the log messages if the scrub fails, because the
> > +	# fail service uses the service name to gather log messages for the
> > +	# error report.
> > +	if [ -n "${SERVICE_MODE}" ]; then
> > +		test "${ret}" -ne 0 && ret=1
> 
> It isn't clear that this is doing what you expect?  "test N" returns 0 for
> all values of N, so this is just making ret=1 always.  It would be more clear
> to use:
> 
> 		[ $ret -ne 0 ] && ret=1

I don't follow... the code that's there does exactly the same thing as
this.  'test' and '[' are the same thing in bash.

> 
> 
> > +		test -x "${SLEEP_PROG}" && "${SLEEP_PROG}" 2
> 
> Why even depend on an external sleep?  You can just use the shell built-in,
> if it is available.

Right.

> > +	fi
> > +
> > 	exit "${ret}"
> > }
> > 
> > @@ -39,6 +55,8 @@ prog_path() {
> > 
> > LVS_PROG="$(prog_path "@root_sbindir@/lvs" "lvs")"
> > BLKID_PROG="$(prog_path "@root_sbindir@/blkid" "blkid")"
> > +SYSTEMCTL_PROG="$(prog_path "@root_bindir@/systemctl")"
> > +SLEEP_PROG="$(prog_path "@root_bindir@/sleep")"
> 
> More obfuscation that should be removed, IMHO.

Fixed.

> > # Scrub any fs on lvm by creating a snapshot and fscking that.
> > "${LVS_PROG}" -o vg_name,lv_name,lv_role --noheadings 2> /dev/null | while read vg lv role extra; do
> > @@ -51,7 +69,15 @@ BLKID_PROG="$(prog_path "@root_sbindir@/blkid" "blkid")"
> > 	# Skip non-ext[234]
> > 	"${BLKID_PROG}" -p -n "${types}" "${dev}" > /dev/null 2>&1 || continue
> > 
> > -	${DBG} "@root_sbindir@/e2scrub" "${dev}"
> > +	if [ ! -x "${SYSTEMCTL_PROG}" ]; then
> > +		${DBG} "@root_sbindir@/e2scrub" "${dev}"
> > +	else
> > +		${DBG} "${SYSTEMCTL_PROG}" start "e2scrub@${dev}" 2> /dev/null
> > +		res=$?
> > +		if [ "${res}" -ne 0 ] && [ "${res}" -ne 1 ]; then
> > +			${DBG} "@root_sbindir@/e2scrub" "${dev}"
> > +		fi
> > +	fi
> > done
> > 
> > exitcode 0
> > diff --git a/scrub/e2scrub_fail.in b/scrub/e2scrub_fail.in
> > new file mode 100644
> > index 0000000..c1696a0
> > --- /dev/null
> > +++ b/scrub/e2scrub_fail.in
> > @@ -0,0 +1,26 @@
> > +#!/bin/bash
> > +
> > +# Email logs of failed e2scrub unit runs
> > +
> > +mailer=/usr/sbin/sendmail
> 
> It would be better not to hard-code the mailer path.

Ok.

> > +(cat << ENDL
> > +To: $1
> > +From: <e2scrub@${hostname}>
> > +Subject: e2scrub failure on ${device}
> > +
> > +So sorry, the automatic e2scrub of ${device} on ${hostname} failed.
> > +
> > +A log of what happened follows:
> > +ENDL
> > +systemctl status --full --lines 4294967295 "e2scrub@${device}") | "${mailer}" -t -i
> 
> This shouldn't depend on systemd to generate the error log.  Why not just
> save the log in /tmp and then send it?  That would also work for cron on
> non-systemd systems.  Or does the e2scrub command just print to stdout/stderr
> and that is captured and mailed by cron directly?

e2scrub_fail is called by systemd when the e2scrub@ systemd service
fails.  cron captures and mails e2scrub error output directly, so the
cron job does not call e2scrub_fail.

I'll amend the comment in e2scrub_fail to note that its purpose is to
mail things out when the systemd service fails.

--D

> 
> 
> Cheers, Andreas
> 
> 
> 
> 
>