Re: Git for games working group

John Austin <john@xxxxxxxxxxxxxxxxxxxx> · Sun, 23 Sep 2018 10:28:36 -0700

I've been putting together a prototype file-locking implementation for
a system that plays better with git. What are everyone's thoughts on
something like the following? I'm tentatively labeling this system
git-sync or sync-server. There are two pieces:

1. A centralized repository called the Global Graph that contains the
union of the git commit graphs of all local developer repos. When
Developer A makes a local commit on branch 'feature', git-sync will
automatically push that new commit up to the global server, under a
namespaced branch: 'developera_repoabcdef/feature'. This can be done
silently as a force push, and shouldn't ever interrupt the developer's
workflow. Simple HTTP queries can be made to the Global Graph, such as
"Which commits descend from commit abcdefgh?"

2. A client-side tool that queries the Global Graph to determine when
your current changes are in conflict with another developer's. It might
ask "Are there any commits I don't have locally that modify
lockable_file.bin?". This could either run on pre-commit or, for
stronger protection, be part of a read-only marking system à la Git
LFS. There wouldn't be any "lock" per se; rather, the client could
refuse to modify a file if it found other commits for that file in the
Global Graph.
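
To make the first piece a bit more concrete, here is a rough Python
sketch of how git-sync might build the namespaced branch name and the
force-push command. Nothing here exists yet: the function names, the
remote name "global-graph", and the rule for normalizing the developer
and repo identifiers are all assumptions for illustration. The only
real git behavior relied on is that a leading '+' in a refspec forces
the update and that --quiet suppresses output.

```python
import re
from typing import List


def namespaced_ref(developer: str, repo_id: str, branch: str) -> str:
    """Build a per-developer branch name such as
    'developera_repoabcdef/feature' (naming rule is an assumption)."""
    # Lowercase and drop anything that is not a letter or digit so the
    # prefix is always a safe ref component.
    dev = re.sub(r"[^a-z0-9]", "", developer.lower())
    rid = re.sub(r"[^a-z0-9]", "", repo_id.lower())
    return f"{dev}_{rid}/{branch}"


def push_command(remote: str, developer: str, repo_id: str,
                 branch: str) -> List[str]:
    """Return the argv for a silent force push of `branch` to the
    Global Graph remote (hypothetical helper, not an existing tool)."""
    target = namespaced_ref(developer, repo_id, branch)
    # '+' forces the update; --quiet keeps it out of the developer's way.
    refspec = f"+refs/heads/{branch}:refs/heads/{target}"
    return ["git", "push", "--quiet", remote, refspec]


if __name__ == "__main__":
    # "global-graph" is a made-up remote name.
    print(" ".join(push_command("global-graph", "Developer A",
                                "repo-abcdef", "feature")))
```

Something like this could be run from a post-commit hook, so the push
happens right after each local commit.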

The key here is the separation of concerns. The Global Graph is fairly
dimwitted -- it doesn't know anything about file locking. But it
provides a layer of information from which we can implement file
locking on the client side (or perhaps other interesting systems).
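
As a rough illustration of that split, the sketch below models the
Global Graph as a plain in-memory mapping and keeps the "locking"
decision entirely on the client. It is only a toy under assumptions:
the Commit type, the function names, and the idea that the server can
report which files each commit touches are mine, not an existing API.
A real client would get this data over HTTP.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Commit:
    parents: FrozenSet[str]   # parent commit ids
    files: FrozenSet[str]     # paths modified by this commit


def descendants(graph: Dict[str, Commit], ancestor: str) -> Set[str]:
    """Graph-side query: which commits descend from `ancestor`?"""
    # Invert the parent links so we can walk from parent to child.
    children: Dict[str, Set[str]] = {}
    for cid, commit in graph.items():
        for parent in commit.parents:
            children.setdefault(parent, set()).add(cid)
    found: Set[str] = set()
    stack = [ancestor]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def unseen_commits_touching(graph: Dict[str, Commit],
                            local_commits: Set[str],
                            path: str) -> List[str]:
    """Client-side query: commits I don't have locally that modify `path`."""
    return sorted(cid for cid, commit in graph.items()
                  if cid not in local_commits and path in commit.files)


def may_modify(graph: Dict[str, Commit], local_commits: Set[str],
               path: str) -> bool:
    """Client-side policy: refuse the edit if anyone else has unseen
    commits for the file. The graph itself knows nothing about locks."""
    return not unseen_commits_touching(graph, local_commits, path)


if __name__ == "__main__":
    graph = {
        "a": Commit(frozenset(), frozenset({"main.c"})),
        "b": Commit(frozenset({"a"}), frozenset({"lockable_file.bin"})),
    }
    print(may_modify(graph, {"a"}, "lockable_file.bin"))
```

The policy in may_modify could be swapped for something else without
touching the server, which is the point of keeping the graph dumb.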

Thoughts?

On Mon, Sep 17, 2018 at 10:23 AM Ævar Arnfjörð Bjarmason
<avarab@xxxxxxxxx> wrote:
>
>
> On Mon, Sep 17 2018, Joey Hess wrote:
>
> > Ævar Arnfjörð Bjarmason wrote:
> >> There's surely other aspects of that square peg of large file tracking
> >> not fitting the round hole of file locking, the point of my write-up was
> >> not that *that* solution is perfect, but there's prior art here that's
> >> very easily adapted to distributed locking if someone wanted to scratch
> >> that itch, since the notion of keeping a log of who has/hasn't gotten a
> >> file is very similar to a log of who has/hasn't locked some file(s) in
> >> the tree.
> >
> > Actually they are fundamentally very different. git-annex's tracking of
> > locations of files is eventually consistent, which of course means that
> > at any given point in time it may be currently inconsistent. That is
> > fine for tracking locations of files, but not for locking.
> >
> > When git-annex needs to do an operation that relies on someone else's
> > copy of a file actually being present, it uses real locking. That
> > locking is not centralized, instead it relies on the connections between
> > git repositories. That turns out to be sufficient for git-annex's own
> > locking needs, but it would not be sufficient to avoid file edit
> > conflict problems in e.g. a split brain situation.
>
> Right, all of that's true. I forgot to explicitly say what I meant by
> "locking" in this context. Clearly it's not suitable for something like
> actual file locking (in the sense of flock() et al), but rather just
> advisory locking in the loosest sense of the word, i.e. some git-ish way
> of someone writing on the office whiteboard "unless you're Bob, don't
> touch main.c today Tuesday Sep 17th, he's hacking on it".
>
> So just a way to have some eventually consistent side channel to pass
> such a message through git. Something similar to what git-annex does
> with its "git-annex" branch would work for that, as long as everyone who
> wanted to get such messages ran some equivalent of "git annex sync" in a
> timely manner (or checked the office whiteboard every day...).
>
> Such a scheme is never going to be 100% reliable even in centralized
> source control systems, e.g. even with cvs/perforce you might pull the
> latest changes, then go on a plane and edit the locked main.c. Then the
> lock has "failed" in the sense of "the message didn't get there in time,
> and two people who could have just picked different areas to work on
> made conflicting edits".
>
> As noted upthread this isn't my use-case, I just wanted to point out the
> git-annex method of distributing metadata as a bolt-on to git as
> interesting prior art. If someone wants "truly distributed, but with
> file locking like cvs/perforce" something like what git-annex is doing
> would probably work for them.
>