On Thu, Jan 25, 2018 at 03:21:25PM -0600, Eric Sandeen wrote: > On 1/17/18 4:04 PM, Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > > > Create a systemd service unit so that we can run the online scrubber > > under systemd with (somewhat) appropriate containment. > > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > <continue random review method> > > ... > > > +/* > > + * If we are running as a service, we need to be careful about what > > + * error codes we return to the calling process. > > + */ > > +static bool is_service; > > + > > static void __attribute__((noreturn)) > > usage(void) > > { > > @@ -617,6 +623,9 @@ _("Only one of the options -n or -y may be specified.\n")); > > if (stdout_isatty && !progress_fp) > > progress_fp = fdopen(1, "w+"); > > > > + if (getenv("SERVICE_MODE")) > > + is_service = true; > > + > > /* Find the mount record for the passed-in argument. */ > > if (stat(argv[optind], &ctx.mnt_sb) < 0) { > > fprintf(stderr, > > @@ -717,5 +726,21 @@ _("%s: %llu warnings found.\n"), > > free(ctx.blkdev); > > free(ctx.mntpoint); > > > > + /* > > + * If we're running as a service, bump return code up by 150 to > > + * avoid conflicting with (sysvinit) service return codes. > > + */ > > + if (is_service) { > > + /* > > + * journald queries /proc as part of taking in log > > + * messages; it uses this information to associate the > > + * message with systemd units, etc. This races with > > + * process exit, so delay that a couple of seconds so > > + * that we capture the summary outputs in the job log. > > + */ > > + sleep(2); > > + if (ret) > > + ret += 150; > > + } > > Ok this seems batshit crazy, no offense. I don't blame /you/ ;) You pointed me > at http://refspecs.linuxbase.org/LSB_2.0.1/LSB-PDA/LSB-PDA/iniscrptact.html > but it says that the /initscript/ should exit with specific codes, not that > the application it /calls/ should do so. Yes. At the moment, the systemd service calls xfs_scrub directly, hence it interprets the return code as an initscript error code. So in theory we could create a wrapper script that does all the is_service junk, but now that's another weird little script to break. On the plus side we'd contain the systemd workaround crap to some random wrapper in /usr/lib. Ok, I'll think about it. > > If the status command is given, /the init script/ will return the following exit status codes. > > So maybe this is getting philosophical, if xfs_scrub detects corruption, > does that mean it "failed?" Shouldn't we use your _fail script to interpret > the scrub errors, and then give an appropriate return value back to the thing that > called us? We want the xfs_scrub@ service to return a failure code so that systemd will notice the failure and start the xfs_scrub_fail@ fail service. This "fail service" doesn't receive the error code of the service that failed passed to it, so all it can do is rifle through the journal output for whatever xfs_scrub spat out and email it somewhere. Hm, yeah, all that crap doesn't need to be there for the non-systemd case. > Anyway, can't we do return code mangling as appropriate in the script(s) > that call xfs_scrub, vs. in the binary itself? The above just seems really > odd and non-obvious to me. > > I'll try to plow through more of the scripty stuff in a bit. Yeah. --D > > Thanks, > -Eric > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html