Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/9/2021 5:32 PM, Lénaïc Huard wrote:
> The existing mechanism for scheduling background maintenance is done
> through cron. On Linux systems managed by systemd, systemd provides an
> alternative to schedule recurring tasks: systemd timers.

Thank you for working so hard on this systemd integration. I see you
have already received significant feedback on that portion, so I
wanted to focus my review on the other piece that exists in this patch.

> In order to choose which scheduler to use between `cron` and user
> systemd timers, a new option
> `--scheduler=auto|cron|systemd|launchctl|schtasks` has been added to
> `git maintenance start`.
> When `git maintenance start --scheduler=XXX` is run, it not only
> registers `git maintenance run` tasks in the scheduler XXX, it also
> removes the `git maintenance run` tasks from all the other schedulers to
> ensure we cannot have two schedulers launching concurrent identical
> tasks.
> 
> The default value is `auto` which chooses a suitable scheduler for the
> system.

This addition of the --scheduler option should be split into a patch
on its own. It requires significant refactoring of the existing code
in a way that distracts from your systemd work.

I'll highlight the portions of the diff that you could include in
a preliminary patch and save the systemd stuff for an addition on top
of that.

> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> index 80ddd33ceb..f012923333 100644
> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -181,6 +181,20 @@ OPTIONS
>  	`maintenance.<task>.enabled` configured as `true` are considered.
>  	See the 'TASKS' section for the list of accepted `<task>` values.
>  
> +--scheduler=auto|crontab|systemd-timer|launchctl|schtasks::
> +	When combined with the `start` subcommand, specify the scheduler
> +	to use to run the hourly, daily and weekly executions of
> +	`git maintenance run`.
> +	The possible values for `<scheduler>` depend on the system: `crontab`
> +	is available on POSIX systems, `systemd-timer` is available on Linux
> +	systems, `launchctl` is available on MacOS and `schtasks` is available
> +	on Windows.
> +	By default or when `auto` is specified, the most appropriate scheduler
> +	for the system is used. On MacOS, `launchctl` is used. On Windows,
> +	`schtasks` is used. On Linux, `systemd-timers` is used if user systemd
> +	timers are available, otherwise, `crontab` is used. On all other systems,
> +	`crontab` is used.
> +

This portion of the docs can be updated on its own (minus the systemd bits).
> diff --git a/builtin/gc.c b/builtin/gc.c
> index ef7226d7bc..7c72aa3b99 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -1544,6 +1544,15 @@ static const char *get_frequency(enum schedule_priority schedule)
>  	}
>  }
>  
> +static int is_launchctl_available(const char *cmd)
> +{
> +#ifdef __APPLE__
> +	return 1;
> +#else
> +	return 0;
> +#endif
> +}
> +
>  static char *launchctl_service_name(const char *frequency)
>  {
>  	struct strbuf label = STRBUF_INIT;
> @@ -1710,6 +1719,15 @@ static int launchctl_update_schedule(int run_maintenance, int fd, const char *cm
>  		return launchctl_remove_plists(cmd);
>  }
>  
> +static int is_schtasks_available(const char *cmd)
> +{
> +#ifdef GIT_WINDOWS_NATIVE
> +	return 1;
> +#else
> +	return 0;
> +#endif
> +}
> +
>  static char *schtasks_task_name(const char *frequency)
>  {
>  	struct strbuf label = STRBUF_INIT;
> @@ -1872,6 +1890,28 @@ static int schtasks_update_schedule(int run_maintenance, int fd, const char *cmd
>  		return schtasks_remove_tasks(cmd);
>  }
>  
> +static int is_crontab_available(const char *cmd)
> +{
> +	static int cached_result = -1;
> +	struct child_process child = CHILD_PROCESS_INIT;
> +
> +	if (cached_result != -1)
> +		return cached_result;
> +
> +	strvec_split(&child.args, cmd);
> +	strvec_push(&child.args, "-l");
> +	child.no_stdin = 1;
> +	child.no_stdout = 1;
> +	child.no_stderr = 1;
> +	child.silent_exec_failure = 1;
> +
> +	if (start_command(&child))
> +		return cached_result = 0;
> +	/* Ignore exit code, as an empty crontab will return error. */
> +	finish_command(&child);
> +	return cached_result = 1;
> +}
> +

These is_X_available() methods are valuable helpers. Adding
is_systemd_timer_available() in the second patch will be
simpler with this framework in place.

> +enum scheduler {
> +	SCHEDULER_INVALID = -1,
> +	SCHEDULER_AUTO = 0,
> +	SCHEDULER_CRON = 1,
> +	SCHEDULER_SYSTEMD = 2,
> +	SCHEDULER_LAUNCHCTL = 3,
> +	SCHEDULER_SCHTASKS = 4,
> +};
> +
> +static const struct {
> +	int (*is_available)(const char *cmd);
> +	int (*update_schedule)(int run_maintenance, int fd, const char *cmd);
> +	const char *cmd;
> +} scheduler_fn[] = {
> +	[SCHEDULER_CRON] = { is_crontab_available, crontab_update_schedule,
> +			     "crontab" },
> +	[SCHEDULER_SYSTEMD] = { is_systemd_timer_available,
> +				systemd_timer_update_schedule, "systemctl" },
> +	[SCHEDULER_LAUNCHCTL] = { is_launchctl_available,
> +				  launchctl_update_schedule, "launchctl" },
> +	[SCHEDULER_SCHTASKS] = { is_schtasks_available,
> +				 schtasks_update_schedule, "schtasks" },
> +};

This is also good to include, minus the systemd lines.

I would also like to see this declaration reformatted.
Something like this would be good:

static const struct {
	int (*is_available)(const char *cmd);
	int (*update_schedule)(int run_maintenance, int fd, const char *cmd);
	const char *cmd;
} scheduler_fn[] = {
	[SCHEDULER_CRON] = {
		.is_available = is_crontab_available,
		.update_schedule = crontab_update_schedule,
		.cmd = "crontab",
	},
	[SCHEDULER_LAUNCHCTL] = {
		.is_available = is_launchctl_available,
		.update_schedule = launchctl_update_schedule,
		.cmd = "launchctl",
	},
	[SCHEDULER_SCHTASKS] = {
		.is_available = is_schtasks_available,
		.update_schedule = schtasks_update_schedule,
		.cmd = "schtasks",
	},
};

The use of member names can help if we need to augment this
struct later, and the use of commas after the final terms of
each block helps the future diff if we add items to the end.

> +
> +static enum scheduler parse_scheduler(const char *value)
> +{
> +	if (!value)
> +		return SCHEDULER_INVALID;
> +	else if (!strcasecmp(value, "auto"))
> +		return SCHEDULER_AUTO;
> +	else if (!strcasecmp(value, "cron") || !strcasecmp(value, "crontab"))
> +		return SCHEDULER_CRON;
> +	else if (!strcasecmp(value, "systemd") ||
> +		 !strcasecmp(value, "systemd-timer"))
> +		return SCHEDULER_SYSTEMD;
> +	else if (!strcasecmp(value, "launchctl"))
> +		return SCHEDULER_LAUNCHCTL;
> +	else if (!strcasecmp(value, "schtasks"))
> +		return SCHEDULER_SCHTASKS;
> +	else
> +		return SCHEDULER_INVALID;
> +}

Good. The systemd stuff can be added in the second patch,
making a clear integration point.

> +static int maintenance_opt_scheduler(const struct option *opt, const char *arg,
> +				     int unset)
> +{
> +	enum scheduler *scheduler = opt->value;
> +
> +	if (unset)
> +		die(_("--no-scheduler is not allowed"));
> +
> +	*scheduler = parse_scheduler(arg);
> +
> +	if (*scheduler == SCHEDULER_INVALID)
> +		die(_("unrecognized --scheduler argument '%s'"), arg);
> +
> +	return 0;
> +}
> +
> +struct maintenance_start_opts {
> +	enum scheduler scheduler;
> +};

This struct that contains only the enum seems confusing to me.
Maybe it will make sense later.

> +static void resolve_auto_scheduler(enum scheduler *scheduler)
> +{
> +	if (*scheduler != SCHEDULER_AUTO)
> +		return;
> +
>  #if defined(__APPLE__)
> -static const char platform_scheduler[] = "launchctl";
> +	*scheduler = SCHEDULER_LAUNCHCTL;
> +	return;
> +
>  #elif defined(GIT_WINDOWS_NATIVE)
> -static const char platform_scheduler[] = "schtasks";
> +	*scheduler = SCHEDULER_SCHTASKS;
> +	return;
> +
> +#elif defined(__linux__)
> +	if (is_systemd_timer_available("systemctl"))
> +		*scheduler = SCHEDULER_SYSTEMD;
> +	else if (is_crontab_available("crontab"))
> +		*scheduler = SCHEDULER_CRON;
> +	else
> +		die(_("neither systemd timers nor crontab are available"));
> +	return;
> +
>  #else
> -static const char platform_scheduler[] = "crontab";
> +	*scheduler = SCHEDULER_CRON;
> +	return;
>  #endif
> +}

This diff looks pretty rough. I see that you are making
systemctl the default for Linux. Ok. This also seems like
it will not be testable in the test suite.

>  
> -static int update_background_schedule(int enable)
> +static void validate_scheduler(enum scheduler scheduler)
>  {
> -	int result;
> -	const char *scheduler = platform_scheduler;
> -	const char *cmd = scheduler;
> +	const char *cmd;
> +
> +	if (scheduler == SCHEDULER_INVALID)
> +		BUG("invalid scheduler");
> +	if (scheduler == SCHEDULER_AUTO)
> +		BUG("resolve_auto_scheduler should have been called before");
> +
> +	cmd = scheduler_fn[scheduler].cmd;
> +	if (!scheduler_fn[scheduler].is_available(cmd))
> +		die(_("%s scheduler is not available"), cmd);
> +}
> +
> +static int update_background_schedule(const struct maintenance_start_opts *opts,
> +				      int enable)
> +{
> +	unsigned int i;
> +	int res, result = 0;
> +	enum scheduler scheduler;
> +	const char *cmd = NULL;
>  	char *testing;
>  	struct lock_file lk;
>  	char *lock_path = xstrfmt("%s/schedule", the_repository->objects->odb->path);
>  
> +	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
> +		return error(_("another process is scheduling background maintenance"));
> +
>  	testing = xstrdup_or_null(getenv("GIT_TEST_MAINT_SCHEDULER"));
>  	if (testing) {
>  		char *sep = strchr(testing, ':');
>  		if (!sep)
>  			die("GIT_TEST_MAINT_SCHEDULER unparseable: %s", testing);
>  		*sep = '\0';
> -		scheduler = testing;
> +		scheduler = parse_scheduler(testing);
>  		cmd = sep + 1;
> +		result = scheduler_fn[scheduler].update_schedule(
> +			enable, get_lock_file_fd(&lk), cmd);
> +		goto done;

I see this 'goto done' is the reason we need to take the lock earlier. The
other option would be to put the 'goto done' after the rollback_lock_file(),
but this is fine, too.

>  	}
>  
> -	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
> -		return error(_("another process is scheduling background maintenance"));
> -
> -	if (!strcmp(scheduler, "launchctl"))
> -		result = launchctl_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else if (!strcmp(scheduler, "schtasks"))
> -		result = schtasks_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else if (!strcmp(scheduler, "crontab"))
> -		result = crontab_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else
> -		die("unknown background scheduler: %s", scheduler);
> +	for (i = 1; i < ARRAY_SIZE(scheduler_fn); i++) {
> +		int enable_scheduler = enable && (opts->scheduler == i);
> +		cmd = scheduler_fn[i].cmd;
> +		if (!scheduler_fn[i].is_available(cmd))
> +			continue;
> +		res = scheduler_fn[i].update_schedule(
> +			enable_scheduler, get_lock_file_fd(&lk), cmd);
> +		if (enable_scheduler)
> +			result = res;
> +	}

This loop is cleaner than the list of else-ifs.

>  
> +done:
>  	rollback_lock_file(&lk);
>  	free(testing);
>  	return result;
>  }
>  
> -static int maintenance_start(void)
> +static const char *const builtin_maintenance_start_usage[] = {
> +	N_("git maintenance start [--scheduler=<scheduler>]"), NULL
> +};
> +
> +static int maintenance_start(int argc, const char **argv, const char *prefix)
>  {
> +	struct maintenance_start_opts opts;

I see you using the struct now and in the helper methods above.
While the creation of the struct looks a little strange now, this
use matches other patterns in this file and is more flexible to
additional options in the future. Thanks.

> +	struct option builtin_maintenance_start_options[] = {
> +		OPT_CALLBACK(
> +			0, "scheduler", &opts.scheduler, N_("scheduler"),
> +			N_("scheduler to use to trigger git maintenance run"),
> +			maintenance_opt_scheduler),

nit: these would typically be aligned with the end of the
OPT_CALLBACK( opener, with the first set of parameters being on
the same line:

		OPT_CALLBACK(0, "scheduler", &opts.scheduler, N_("scheduler"),
			     N_("scheduler to use to trigger 'git maintenance run'"),
			     maintenance_opt_scheduler),

These options frequently run a little long on the line width,
which might have been your motivation in adding an extra line.

> +		OPT_END()
> +	};
> +	memset(&opts, 0, sizeof(opts));
> +
> +	argc = parse_options(argc, argv, prefix,
> +			     builtin_maintenance_start_options,
> +			     builtin_maintenance_start_usage, 0);
> +
> +	resolve_auto_scheduler(&opts.scheduler);
> +	validate_scheduler(opts.scheduler);
> +
> +	if (argc > 0)
> +		usage_with_options(builtin_maintenance_start_usage,
> +				   builtin_maintenance_start_options);

nit: "if (argc)" is the more typical pattern in the Git codebase.

Also, this check should come right after parse_options().

>  	if (maintenance_register())
>  		warning(_("failed to add repo to global config"));
> -
> -	return update_background_schedule(1);
> +	return update_background_schedule(&opts, 1);
>  }
>  
>  static int maintenance_stop(void)
>  {
> -	return update_background_schedule(0);
> +	return update_background_schedule(NULL, 0);
>  }
>  
>  static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
> @@ -2027,7 +2354,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
>  	if (!strcmp(argv[1], "run"))
>  		return maintenance_run(argc - 1, argv + 1, prefix);
>  	if (!strcmp(argv[1], "start"))
> -		return maintenance_start();
> +		return maintenance_start(argc - 1, argv + 1, prefix);
>  	if (!strcmp(argv[1], "stop"))
>  		return maintenance_stop();
>  	if (!strcmp(argv[1], "register"))
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 2412d8c5c0..6e6316cd90 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh

I'm not sure how we could achieve it, but it might be good to demonstrate
a use of the --scheduler option here in the test script.

Thanks,
-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux