Re: [PATCH] maintenance: use systemd timers on Linux

On Sat, May 1, 2021 at 10:59 AM Lénaïc Huard <lenaic@xxxxxxxxx> wrote:
> The existing mechanism for scheduling background maintenance is done
> through cron. On Linux systems managed by systemd, systemd provides an
> alternative to schedule recurring tasks: systemd timers.

Thanks for working on this. While `cron` has been the go-to standard
for decades, `systemd` is certainly widespread enough that it makes
sense to support it, as well.

> The main motivations to implement systemd timers in addition to cron
> are:
> * cron is optional and Linux systems running systemd might not have it
>   installed.
> * The execution of `crontab -l` can tell us if cron is installed but not
>   if the daemon is actually running.
> * With systemd, each service is run in its own cgroup and its logs are
>   tagged by the service inside journald. With cron, all scheduled tasks
>   are running in the cron daemon cgroup and all the logs of the
>   user-scheduled tasks are pretended to belong to the system cron
>   service.
>   Concretely, a user that doesn’t have access to the system logs won’t
>   have access to the log of its own tasks scheduled by cron whereas he
>   will have access to the log of its own tasks scheduled by systemd
>   timer.

The last point is somewhat compelling. A potential counterargument is
that `cron` does send email to the user by default if any output is
generated by the cron job. However, it seems quite likely these days
that many systems either won't have local mail service enabled or the
user won't bother checking the local mailbox. It's a minor point, but
if you re-roll, it might make sense for the commit message to expand
the last point by saying that, although `cron` attempts to send email,
that email may go unseen by the user.

> In order to schedule git maintenance, we need two unit template files:
> * ~/.config/systemd/user/git-maintenance@.service
>   to define the command to be started by systemd and
> * ~/.config/systemd/user/git-maintenance@.timer
>   to define the schedule at which the command should be run.
> [...]
> The timer unit contains `Persistent=true` so that, if the computer is
> powered down when a maintenance task should run, the task will be run
> when the computer is back powered on.

It would be nice for the commit message to also give some high-level
information about how git-maintenance chooses between `cron` and
`systemd` and whether the user can influence that decision. (I know
the answer because I read the patch, but this is the sort of
information which is good to have in the commit message; readers want
to know why certain choices were made.)

Although I avoid Linux distros with `systemd`, my knee-jerk reaction,
like brian's upthread, is that there should be some escape hatch or
direct mechanism to allow the user to choose between `systemd` and
`cron`.

The patch itself is straightforward enough and nicely follows the
pattern established for already-implemented schedulers, so I don't
have a lot to say about it. I did leave a few comments below, most of
which are subjective nits and minor observations, though there are two
or three actionable items.

> Signed-off-by: Lénaïc Huard <lenaic@xxxxxxxxx>
> ---
> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> @@ -279,6 +279,55 @@ schedule to ensure you are executing the correct binaries in your
> +BACKGROUND MAINTENANCE ON LINUX SYSTEMD SYSTEMD
> +-----------------------------------------------

Is there a reason for the duplicated "SYSTEMD" that I'm missing? I
suppose you probably mean "SYSTEMD SYSTEMS".

> +In this case, `git maintenance start` will create user systemd timer units
> +and start the timers. The current list of user-scheduled tasks can be found
> +by running `systemctl --user list-timers`. The timers written by `git
> +maintenance start` are similar to this:
> +
> +-----------------------------------------------------------------------
> +$ systemctl --user list-timers
> +NEXT                         LEFT          LAST                         PASSED     UNIT                         ACTIVATES
> +Thu 2021-04-29 19:00:00 CEST 42min left    Thu 2021-04-29 18:00:11 CEST 17min ago  git-maintenance@hourly.timer git-maintenance@hourly.service
> +Fri 2021-04-30 00:00:00 CEST 5h 42min left Thu 2021-04-29 00:00:11 CEST 18h ago    git-maintenance@daily.timer  git-maintenance@daily.service
> +Mon 2021-05-03 00:00:00 CEST 3 days left   Mon 2021-04-26 00:00:11 CEST 3 days ago git-maintenance@weekly.timer git-maintenance@weekly.service
> +
> +3 timers listed.
> +Pass --all to see loaded but inactive timers, too.
> +-----------------------------------------------------------------------

I suspect that the "3 timers listed" and "Pass --all" lines don't add
value and can be dropped without hurting the example.

> +`git maintenance start` will overwrite these files and start the timer
> +again with `systemctl --user`, so any customization should be done by
> +creating a drop-in file
> +`~/.config/systemd/user/git-maintenance@.service.d/*.conf`.

Will `systemd` users generally understand what filename to create in
the "...@.service.d/" directory, and will they know what to populate
the file with? (Genuine question; I've never dealt with that.)

> diff --git a/builtin/gc.c b/builtin/gc.c
> @@ -1872,6 +1872,25 @@ static int schtasks_update_schedule(int run_maintenance, int fd, const char *cmd
> +static int is_crontab_available(const char *cmd)
> +{
> +       struct child_process child = CHILD_PROCESS_INIT;
> +
> +       strvec_split(&child.args, cmd);
> +       strvec_push(&child.args, "-l");
> +       child.no_stdin = 1;
> +       child.no_stdout = 1;
> +       child.no_stderr = 1;
> +       child.silent_exec_failure = 1;
> +
> +       if (start_command(&child))
> +               return 0;
> +       /* Ignore exit code, as an empty crontab will return error. */
> +       finish_command(&child);
> +
> +       return 1;
> +}

Ignoring the error from `crontab -l` is an already-established idiom
in this file. Okay.
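
For readers unfamiliar with the idiom, the distinction is between "the
command could not be started at all" and "the command ran but exited
non-zero". Below is a rough standalone POSIX sketch of that distinction.
It is not how start_command() is implemented in Git; the name
command_is_runnable() and the exit-code-127 marker are assumptions made
purely for illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 1 if argv[0] could be executed at all,
 * 0 otherwise. The command's own exit status is deliberately ignored,
 * since (for example) `crontab -l` fails when the crontab is empty.
 */
int command_is_runnable(char *const argv[])
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return 0;
	if (!pid) {
		/* Child: silence stdin, stdout and stderr, then exec. */
		int fd = open("/dev/null", O_RDWR);
		if (fd >= 0) {
			dup2(fd, STDIN_FILENO);
			dup2(fd, STDOUT_FILENO);
			dup2(fd, STDERR_FILENO);
			if (fd > STDERR_FILENO)
				close(fd);
		}
		execvp(argv[0], argv);
		_exit(127); /* assumed marker: exec itself failed */
	}
	if (waitpid(pid, &status, 0) < 0)
		return 0;
	/* Anything other than the exec-failure marker counts as runnable. */
	return !(WIFEXITED(status) && WEXITSTATUS(status) == 127);
}
```

One caveat of this simplified approach: a command that itself exits
with status 127 would be misreported as unavailable.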

Nit: There doesn't seem to be a need for the blank line before `return
1`, and other maintenance-related functions don't have such a blank
line. The same comment about blank lines before `return` applies to
other newly-added functions, as well. But it's subjective, and not
necessarily worth changing.

> +static char *systemd_timer_timer_filename()
> +{
> +       const char *filename = "~/.config/systemd/user/git-maintenance@.timer";
> +       char *expanded = expand_user_path(filename, 0);
> +       if (!expanded)
> +               die(_("failed to expand path '%s'"), filename);
> +
> +       return expanded;
> +}

I was curious whether this would fail if `.config/systemd/user/`
didn't already exist, but looking at the implementation of
expand_user_path(), I see that it doesn't require the path to already
exist if you pass 0 for the second argument, as you do here. Okay.

> +static char *systemd_timer_service_filename()
> +{
> +       const char *filename =
> +               "~/.config/systemd/user/git-maintenance@.service";
> +       char *expanded = expand_user_path(filename, 0);
> +       if (!expanded)
> +               die(_("failed to expand path '%s'"), filename);
> +
> +       return expanded;
> +}

The duplication of code between systemd_timer_timer_filename() and
systemd_timer_service_filename() is probably too minor to worry about.
Okay.

> +static int systemd_timer_enable_unit(int enable,
> +                                    enum schedule_priority schedule,
> +                                    const char *cmd)
> +{
> +       struct child_process child = CHILD_PROCESS_INIT;
> +       const char *frequency = get_frequency(schedule);
> +
> +       strvec_split(&child.args, cmd);
> +       strvec_push(&child.args, "--user");
> +       if (enable)
> +               strvec_push(&child.args, "enable");
> +       else
> +               strvec_push(&child.args, "disable");

It's subjective, but this might be more nicely expressed as:

    strvec_push(&child.args, enable ? "enable" : "disable");

> +       strvec_push(&child.args, "--now");
> +       strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
> +
> +       if (start_command(&child))
> +               die(_("failed to run systemctl"));
> +       return finish_command(&child);
> +}
> +static int systemd_timer_write_unit_templates(const char *exec_path)
> +{
> +       unit = "[Unit]\n"
> +              "Description=Optimize Git repositories data\n"
> +              "\n"
> +              "[Service]\n"
> +              "Type=oneshot\n"
> +              "ExecStart=\"%1$s/git\" --exec-path=\"%1$s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%i\n"

I see that it's in POSIX, but do we use this `%n$s` directive
elsewhere in the Git source code? If not, I'd be cautious of
introducing it here. Maybe it's better to just use plain `%s` twice...

> +              "LockPersonality=yes\n"
> +              "MemoryDenyWriteExecute=yes\n"
> +              "NoNewPrivileges=yes\n"
> +              "RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6\n"
> +              "RestrictNamespaces=yes\n"
> +              "RestrictRealtime=yes\n"
> +              "RestrictSUIDSGID=yes\n"
> +              "SystemCallArchitectures=native\n"
> +              "SystemCallFilter=@system-service\n";
> +       fprintf(file, unit, exec_path);

... and then:

    fprintf(file, unit, exec_path, exec_path);
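
To make the suggestion concrete, here is a minimal standalone sketch of
formatting just the ExecStart line with plain `%s` given twice. The
helper name format_exec_start() and the use of snprintf() are my own,
purely for illustration; the patch itself writes the whole unit with
fprintf().

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): format the ExecStart
 * line using two plain "%s" directives instead of the positional
 * "%1$s" form. Returns the number of characters written, or -1 if
 * the buffer is too small or formatting fails.
 */
int format_exec_start(char *buf, size_t size, const char *exec_path)
{
	int n = snprintf(buf, size,
			 "ExecStart=\"%s/git\" --exec-path=\"%s\" for-each-repo "
			 "--config=maintenance.repo maintenance run "
			 "--schedule=%%i\n",
			 exec_path, exec_path);

	if (n < 0 || (size_t)n >= size)
		return -1;
	return n;
}
```

Since the same path is simply passed twice, nothing here depends on
positional-argument support in the platform's printf() family.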

> +       fclose(file);
> +
> +       return 0;
> +}
> @@ -1986,6 +2159,15 @@ static int update_background_schedule(int enable)
> +       if (!strcmp(scheduler, "crontab_or_systemctl")) {
> +               if (is_systemd_timer_available("systemctl"))
> +                       scheduler = cmd = "systemctl";
> +               else if (is_crontab_available("crontab"))
> +                       scheduler = cmd = "crontab";
> +               else
> +                       die(_("Neither systemd timers nor crontab are available"));
> +       }

Other messages emitted by git-maintenance are entirely lowercase, so
downcasing "Neither" would be appropriate.

> @@ -1995,10 +2177,14 @@ static int update_background_schedule(int enable)
> -               die("unknown background scheduler: %s", scheduler);
> +               die(_("unknown background scheduler: %s"), scheduler);

This change is unrelated to the rest of the patch. Normally, such a
"fix" would be made as a separate patch. This one is somewhat minor,
so perhaps it doesn't matter whether it's in this patch...

>         rollback_lock_file(&lk);
> +       free(lock_path);
>         free(testing);
>         return result;

... however, this leak fix probably deserves its own patch. Or, at the
very least, mention these two fixes in this commit message.

> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> @@ -20,6 +20,20 @@ test_xmllint () {
> +test_lazy_prereq SYSTEMD_ANALYZE '
> +       systemd-analyze --help >out &&
> +       grep -w verify out
> +'

Unportable use of `grep -w`. It's neither in POSIX nor understood by
BSD-lineage `grep` (including macOS `grep`).


