Re: [PATCH 00/15] [RFC] Maintenance jobs and job runner

Hey,

On 2020-04-07 at 22:23:43, Johannes Schindelin wrote:
> > If there are periodic tasks that should be done, even if only on large
> > repos, then let's have a git gc --periodic that does them.  I'm not sure
> > that fetch should be in that set, but nothing prevents users from doing
> > "git fetch origin && git gc --periodic".
> 
> Hmm. Who says that maintenance tasks are essentially only `gc`? With
> _maaaaaybe_ a `fetch` thrown in?

What I'm saying is that we have a tool to run maintenance tasks on the
repository.  If we need to perform additional maintenance tasks, let's
put them in the same place as the ones we have now.  I realize "gc" may
become a less accurate name, but oh, well.

> > Let's make it as simple and straightforward as possible.
> 
> I get the impression, however, that many reviewers here seem to favor the
> goal of making the _patches_ as simple and straightforward as possible,
> however, at the expense of the original goal. Like, totally sacrificing
> the ease of use in return for "just use a shell script" advice.

I think we can have both.  They are not mutually exclusive, and I've
proposed an approach that provides both.

> > As for handling multiple repositories, the tool to do that could be as
> > simple as a shell script which reads from ~/.config/git/repo-maintenance
> > (or whatever) and runs the same command on all of the repos it finds
> > there, possibly with a subcommand to add and remove repos.
> 
> Sure, that is flexible.
> 
> And it requires a ton of Git expertise to know what to put into those
> scripts. And Git updates cannot deliver more value to those scripts.

Perhaps I was unclear about the design I have in mind.  My proposal is
something like the following:

  git schedule-gc add [--period=TIME] [--fetch=REMOTE | --fetch-all] REPO
  git schedule-gc remove REPO

The actual command invoked by the system scheduler would be something
like the following:

  git schedule-gc run

It would work as I proposed under the hood, but it would be relatively
straightforward to use.
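
To make "under the hood" concrete, here is a rough sketch of what the
"run" step might look like as a shell function.  It is only an
illustration: "git schedule-gc" does not exist, the function name
run_maintenance is hypothetical, and the list file format (one
repository path per line, "#" for comments) is an assumption, not
something settled in this thread.  The only real Git features it relies
on are "git -C <path>" and "git gc --auto".

```shell
#!/bin/sh
# Hypothetical sketch of what "git schedule-gc run" might do.
# Assumption: the list file holds one repository path per line;
# blank lines and lines starting with "#" are ignored.

# run_maintenance LIST_FILE [GIT_ARGS...]
# Runs "git GIT_ARGS" (default: "gc --auto") in every listed repository.
# Returns non-zero if any invocation failed.
run_maintenance() {
    list=$1
    shift
    [ "$#" -gt 0 ] || set -- gc --auto
    [ -r "$list" ] || return 0      # no repositories registered yet
    status=0
    while IFS= read -r repo || [ -n "$repo" ]; do
        case $repo in
            ''|'#'*) continue ;;    # skip blank lines and comments
        esac
        if [ ! -d "$repo" ]; then
            echo "skipping missing repo: $repo" >&2
            continue
        fi
        # Redirect stdin so git cannot consume the rest of the list.
        git -C "$repo" "$@" </dev/null || status=1
    done <"$list"
    return "$status"
}
```

The system scheduler would then only need a single entry that calls
something like

  run_maintenance "${XDG_CONFIG_HOME:-$HOME/.config}/git/repo-maintenance"

and the "add" and "remove" subcommands would amount to editing that
file.  Keeping the schedule itself in cron or anacron means this code
never has to decide when to run, only what to run.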

> > I'm not opposed to seeing a tool that can schedule periodic maintenance
> > jobs, perhaps in contrib, depending on whether other people think it
> > should go.  However, I think running periodic jobs is best handled on
> > Unix with cron or anacron and not a custom tool or a command in Git.
> 
> Okay, here is a challenge for you: design this such that the Windows
> experience does _not_ feel like a 3rd-class citizen. Go ahead. Yes, there
> is a scheduler. Yep, it does not do cron-like things. Precisely: you have
> to feed it an XML to make use of the "advanced" features. Yeah, I also
> cannot remember what the semantics are regarding missed jobs due to
> shutdown cycles. Nope, you cannot rely on the XML being an option, that
> would require Windows 10. The list goes on.

I will freely admit that I know next to nothing about Windows.  I have
used it only incidentally, if at all, for at least two decades.  It is
not a platform I generally have an interest in developing for, although
I try to make it work as well as possible when I am working on a project
which supports it.

It is, in general, my assumption, based on its wide usage, that it is a
powerful and robust operating system with many features, but I have
little actual knowledge about how it functions or the exact features it
provides.

I want a solution that builds on the existing Unix tools for Unix,
because that is least surprising to users and it is how Unix tools are
supposed to work.  I think we can agree that Git was designed with the
Unix philosophy in mind.

I also want a solution that works on Windows.  Ideally that solution
would build on existing components that are part of Windows, because it
reduces the maintenance burden on all of us.  But unfortunately, I know
next to nothing about how to build such a solution.

> > I've dealt with systems that implemented periodic tasks without using
> > the existing tools for doing that, and I've found that usually that's a
> > mistake.  Despite seeming straightforward, there are a lot of tricky
> > edge cases to deal with and it's easy to get wrong.
> 
> But maybe you found one of those issues in Stolee's patches? If so, please
> do contribute your experience there to point out those issues, so that
> they can be addressed.

One of the benefits of using anacron on Unix is that it can skip running
tasks when the user is on battery.  This is not something we can do
portably across systems, nor is it something that Git should need to
know about.

> > We also don't have to reimplement all the features in the system
> > scheduler and can let expert users use a different tool of their choice
> > instead if cron (or the Windows equivalent) is not to their liking.
> 
> Do we really want to start relying on `cron`, when the major platform used
> by the target audience (enterprise software engineers who deal with rather
> larger repositories than git.git or linux.git) quite obviously _lacks_
> support for that?

Unix users will be unhappy with us if we use our own scheduling system
when cron is available.  They will expect us to reimplement those
features and they will complain if we do not.  While I cannot name
names, there are a nontrivial number of large, enterprise monorepos that
run only on macOS and Linux.

That doesn't prevent us from building tooling that does the scheduling
on Windows if we can't use the system scheduler, but it would be nice to
try to present a relatively unified interface across the two platforms.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
