Re: [PATCH v2 42/43] refs: add LMDB refs backend

David Turner <dturner@xxxxxxxxxxxxxxxx> · Wed, 07 Oct 2015 15:20:29 -0400

On Wed, 2015-10-07 at 20:31 +0200, Michael Haggerty wrote:
> On 10/07/2015 03:51 AM, David Turner wrote:
> > On Mon, 2015-10-05 at 17:47 +0200, Michael Haggerty wrote:
> >> On 09/29/2015 12:02 AM, David Turner wrote:
> >>> Add a database backend for refs using LMDB.  [...]
> >>
> >> I think you have said before that if one writer holds the write lock on
> >> the DB, then other writers fail immediately. Is that correct? If so, is
> >> that something that can be adjusted? I think it would be preferable for
> >> the second writer to retry acquiring the write lock for a little while
> >> with a timeout (as we now do when trying to acquire the packed-refs
> >> lock). Otherwise you could have the unhappy situation that somebody
> >> spends a long time pushing a packfile to a server, only to have the
> >> reference update be rejected at the last moment due to a lock conflict
> >> with another process that was touching completely different references.
> >> We already do before/after consistency checks when updating references,
> >> so you wouldn't even have to add such code in the backend yourself.
> > 
> > No, the second writer waits for the first writer to unlock (or for it to
> > crash).
> 
> Cool, that's better behavior.
> 
> > [...]
> >> Do you store "peeled" reference values for tags, as is done in
> >> packed-refs? I think that is an important optimization.
> > 
> > No.  Do you happen to know in what situations this is a performance
> > benefit, so that I can benchmark?  I suspect it would matter much less
> > for the LMDB backend, because lookups are pretty quick.
> 
> The reference lookup speed is not relevant here. "Peeling" is applied to
> references that point at tag objects (a.k.a. annotated tags). It means
> that the tag object is looked up to see what *it* points at (recursively
> if necessary) and the result is stored to the packed-refs file in a
> specially-formatted extra line that looks like
> 
>     17f9f635c101aef03874e1de1d8d0322187494b3 refs/tags/v2.6.0
>     ^be08dee9738eaaa0423885ed189c2b6ad8368cf0
> 
> I think a good command to benchmark would be `git show-refs -d` in a
> repository with a number of annotated tags. This command's output is
> similar to the output of `git ls-remote <remote>` and also comes up
> during reference negotiation when fetching (so its performance is
> definitely not moot).
> 
> > [...]
> >> Currently we discard the reflog for a reference when the reference is
> >> deleted. [...]
> >> Have you thought about removing this limitation in the lbdb backend?
> > 
> > I'm going for feature parity first.  We can always add new functionality
> > later.  This particular function would be pretty straightforward to add,
> > I think.
> 
> +1
> 
> > [...]
> >>> +The rsync and file:// transports don't work yet, because they
> >>> +don't use the refs API.
> >>
> >> Do they fail gracefully?
> > 
> > Not particularly gracefully.
> > 
> > rsync: link_stat "/home/dturner/git/t/trash
> > directory.t5510-fetch/.git/packed-refs" failed: No such file or
> > directory (2)
> > rsync error: some files/attrs were not transferred (see previous errors)
> > (code 23) at main.c(1183) [sender=3.1.1]
> > fatal: Could not run rsync to get refs
> > -------------
> > 
> > The problem is that rsync on the client assumes that packed-refs exists.
> > We could hack it to also check for refdb.
> 
> I guess this is something that will have to be improved sooner or later,
> though I guess not as a precondition for merging this patch series.
> 
> > [...]
> >> I'm somewhat surprised that you only register the lmdb backend if it is
> >> used in the main repo. I would expect the backend to be registered
> >> unconditionally on startup. The cost is trivial, isn't it?
> > 
> > Yeah, but this was the easiest place to do it.
> 
> OK.
> 
> > [...]
> 
> I'm really happy about your work.
> 
> Regarding strategy: I think a good approach would be to get as much of
> the preparatory work as possible (the abstraction and separation of
> refs-be-files) to the point where it can be merged before there is too
> much more code churn in the area. That work is not very controversial, I
> think, and letting it wait for a long time will increase the likelihood
> of conflicts with other people's changes. The refs-be-lmdb patches, on
> the other hand, (1) will take longer to get polished, (2) will take
> longer to review because other people are not familiar with LDMB, and
> (3) won't bitrot very fast anyway because they don't overlap as much
> with areas that other people are likely to work on. So I would advocate
> working on those at a more deliberate pace and planning for them to be
> merged as a separate batch.

Works for me.  

Would you like me to start sending those as a separate series, or shall
I keep it as one series and let you split it as you choose?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html