Re: Synchronous replication on push

Taylor R Campbell <git@xxxxxxxxxxxxxxxxxxx> · Tue, 5 Nov 2024 01:34:32 +0000

> Date: Mon, 4 Nov 2024 18:47:05 -0500
> From: Jeff King <peff@xxxxxxxx>
> 
> On Sat, Nov 02, 2024 at 02:06:53AM +0000, Taylor R Campbell wrote:
> 
> > Whenever I push anything to it, I want the push -- that is, all the
> > objects, and all the ref updates -- to be synchronously replicated to
> > another remote repository, the back end:
> 
> This isn't quite how replication works at, say, GitHub. But let me first
> explain some of what you're seeing, and then I'll give some higher level
> comments at the end.

Great, thanks!  I understand GitHub works differently, and I'm not
trying to replicate everything about GitHub's architecture, which I
expect to take substantial novel software engineering effort.  But I
am trying to make sure I understand how the parts fit together well
enough to provide qualitatively similar types of guarantees about
durability when the user's `git push' exits zero.

I really have two different goals here, which have similar needs for
relaying pushes but which I'm sure will diverge at some point:

1. provide a synchronous push/pull git frontend to an hg backend with
   git-cinnabar (so to ordinary git clients it looks just like an
   ordinary git remote, without needing git-cinnabar), and

2. provide a git frontend that replicates to one or many git backends
   for better resilience to server loss.

>                           Instead, you should disable push's attempt to
> update the local tracking refs. There isn't an option to do that, but
> if you don't have a "fetch" config line, then there are no tracking
> refs. I.e., rather than using "clone --mirror", create your frontend
> repo like this:
> 
>   git init --bare
>   git config remote.backend.url git@xxxxxxxxxxxxxxxxxxx:/repo.git
>   git fetch backend refs/*:refs/*
> 
> And then push won't try to update anything in the frontend repo.

Thanks, that hadn't occurred to me as an option.

>   Side note: there's a small maybe-bug here that I noticed if the
>   backend is on the same local filesystem. In that case
>   GIT_QUARANTINE_PATH remains set for the receive-pack process running
>   on the backend repo, and will refuse to update refs (where it should
>   be safe to do so!). In your example that doesn't happen because
>   GIT_QUARANTINE_PATH does not make it across the ssh connection. But
>   arguably we should be clearing GIT_QUARANTINE_PATH in local_repo_env
>   like we do for GIT_DIR, etc. I don't think you ran into this, but just
>   another hiccup I found while trying to reproduce your situation.

(I did actually run into this, so in my test scripts I have been using

  git {clone,config,...} ext::"env -i PATH=$PATH git %s /path/to/backend.git" ...

instead of just

  git {clone,config,...} /path/to/backend.git ...

in order to nix GIT_QUARANTINE_PATH from the environment -- and
anything else I might not have thought of -- while running
git-receive-pack on the backend.  But it didn't seem germane to the
problem at hand so I didn't want to clutter up my already somewhat
long question with such details unless someone asked me to share my
reproducer!)

> > 3. Same as (1), but the pre-receive hook assembles a command line of
> > 
> > 	exec git push backend ${new0}:${ref0} ${new1}:${ref1} ...,
> > 
> >    with all the ref updates passed on stdin (ignoring the old values).
> 
> ...yes, this is the correct approach. You're not _quite_ passing all of
> the relevant info, though, because you're ignoring the old value of each
> ref. And ideally you'd make sure you were moving backend's ref0 from
> "old0" to "new0"; otherwise you risk overwriting something that happened
> independently on the backend. Of course that creates new questions,
> like what happens when the frontend and backend get out of sync.

Right -- there will be some combination of --force-with-lease or
pre-receive tests at the other end to handle this.  But for now my
focus is on making git push work in pre-receive at all.
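
For concreteness, here is a minimal sketch of how I imagine the hook
assembling those arguments once the old values are taken into account.
The helper name build_push_args is made up for illustration, and I'm
assuming --force-with-lease=<ref>:<expect> (with an empty <expect>
meaning the ref must not exist yet) is the right tool on the backend
side; I have not tested this against a real backend:

```shell
#!/bin/sh
# Hypothetical helper for a pre-receive hook: read "<old> <new> <ref>"
# lines on stdin and print `git push` arguments, one per line, with all
# the lease options first and the refspecs after them.
nl='
'
build_push_args() {
    leases='' specs=''
    while read -r old new ref; do
        case $old in
            *[!0]*) expect=$old ;;  # backend ref should still be at <old>
            *)      expect='' ;;    # all zeros: ref must not exist yet
        esac
        leases="${leases}--force-with-lease=${ref}:${expect}${nl}"
        case $new in
            *[!0]*) specs="${specs}${new}:${ref}${nl}" ;;  # create or update
            *)      specs="${specs}:${ref}${nl}" ;;        # all zeros: delete
        esac
    done
    printf '%s%s' "$leases" "$specs"
}

# In the hook itself (not run here; "backend" is the remote name used
# earlier in this thread):
#   exec git push backend $(build_push_args)
```

The unquoted $(...) relies on ref names never containing whitespace or
glob characters, which git's ref name rules already forbid.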

As long as anything out-of-sync leads to noisy failure, possibly
requiring manual intervention, that's good enough for now (and I'm not
(yet) concerned with .

> > 	remote: error: update_ref failed for ref 'refs/heads/main': ref updates forbidden inside quarantine environment
> > 
> >    but somehow the push succeeds in spite of this message, and the
> >    primary and replica both get updated.
> 
> This is again the quarantine issue updating local tracking branches.
> However, we don't consider that a hard error, as updating them is
> opportunistic (we'd get the new values on the next fetch anyway).
> 
> If you drop the refspec as above, you shouldn't see that any more.

Yes, thanks!

> Now back to the main point: is this a good way to do replication? I
> don't think it's _terrible_, but there are two flaws I can see:

These are all good points, which I will consider when I get to them,
now that I can make progress past the obstacle of local tracking ref
updates in pre-receive git push.  Thanks!

>   1. You're not kicking off the backend push until the frontend has
>      received and processed the whole pack. So you're doubling the
>      end-to-end latency of the push. In an ideal world you'd actually
>      stream the incoming packfile to the backend, which would be doing its
>      own quarantined index-pack[*] on it in real-time. And then when you
>      get to the pre-receive hook, all that's left is for all of the
>      replicas to agree to commit to the ref update.

Git doesn't currently have any hooks for doing this, right?  So
presumably this will require a custom git-receive-pack replacement
that understands the git wire protocol to stream the packfile to
backends (which is what I assume GitHub's Spokes proxies do).

>   2. Using "push" isn't a very atomic way of updating refs. The backends
>      will either accept the push or not, and then the frontend will try
>      to update its refs. What if it fails? What if another push comes in
>      simultaneously? Can they overwrite each other or lose pushed data?
>      Or get the frontend and backends out of sync?

Right -- there's a lot to work out for the three-phase commit part.
One simplification for now is to reject non-fast-forward pushes (and
ref deletion), and to not worry too much about ordering of independent
ref updates or whether I even want serializable isolation or just
repeatable-read or read-committed for that.
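
For the fast-forward-only simplification, the stock receive-pack
settings may already be enough.  A sketch, assuming they go in each
backend repository's config, so that it is the relayed push that gets
refused and the pre-receive hook on the frontend fails with it (I have
not verified at which point receive-pack applies these checks relative
to pre-receive, which is why I would not rely on setting them on the
frontend alone):

```ini
[receive]
	denyNonFastForwards = true
	denyDeletes = true
```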

That said, regarding push atomicity: Suppose users concurrently do

  alice$ git push frontend X Y
  bob$ git push frontend Y X

That is, there are overlapping ref updates, and suppose Alice and Bob
have incompatible referents for X and Y (non-fast-forward, or they're
using --force-with-lease but not --atomic, or whatever).

When are the locks on X and Y taken relative to pre-receive in the
frontend?  Can the pre-receive hooks for Alice's push and Bob's push
run concurrently or are they serialized by locks on the common refs X
and Y?  This can't deadlock, can it?  (I assume the locks on refs are
taken in a consistent order.)

It's unclear to me from the githooks(5), git-push(1), and
git-receive-pack(1) man pages what the ordering of hooks and ref
locking is, or what serialization guarantees hooks have -- if any.