Re: [PATCH 00/17] Remove assumptions about refname lifetimes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, May 20, 2013 at 6:37 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Johan Herland <johan@xxxxxxxxxxx> writes:
>> For server-class installations we need ref storage that can be read
>> (and updated?) atomically, and the current system of loose + packed
>> files won't work since reading (and updating) more than a single file
>> is not an atomic operation. Trivially, one could resolve this by
>> dropping loose refs, and always using a single packed-refs file, but
>> that would make it prohibitively expensive to update refs (the entire
>> packed-refs file must be rewritten for every update).
>>
>> Now, observe that we don't have these race conditions in the object
>> database, because it is an add-only immutable data store.
>>
>> What if we stored the refs as a tree object in the object database,
>> referenced by a single (loose) ref?
>
> What is the cost of updating a single branch with that scheme?
>
> Doesn't it end up recording roughly the same amount of information
> as updating a single packed-refs file that is flat, but with the
> need to open a few tree objects (top-level, refs/, and refs/heads/),
> writing out a blob that stores the object name at the tip, computing
> the updated trees (refs/heads/, refs/ and top-level), and then
> finally doing the compare-and-swap of that single loose ref?

Yes, except that when you update packed-refs, you have to write out
the _entire_ file, whereas with this scheme you only have to write out
the part of the refs hierarchy you actually touched (e.g. rewriting
refs/heads/foo would not have to write out anything inside
refs/tags/*). If you have small number of branches, and a huge number
of tags, this scheme might end up being cheaper than writing the
entire packed-refs. But in general, it's probably much more expensive
to go via the odb.

> You may guarantee atomicity but it is the same granularity of
> atomicity as a single packed-refs file.

Yes, as I argued elsewhere in this thread: It seems that _any_
filesystem-based solution must resort to having all updates depend on
a single file in order to guarantee atomicity.

> When you are updating a
> branch while somebody else is updating a tag, of course you do not
> have to look at refs/tags/ in your operation and you can write out
> your final packed-refs equivalent tree to the object database
> without racing with the other process, but the top-level you come up
> with and the top-level the other process comes up with (which has
> an up-to-date refs/tags/ part, but has a stale refs/heads/ part from
> your point of view) have to race to update that single loose ref,
> and one of you have to back out.

True.

> That "backing out" can be made more intelligently than just dying
> with "compare and swap failed--please retry" message, e.g. you at
> that point notice what you started with, what the other party did
> while you were working on (i.e. updating refs/tags/), and three-way
> merge the refs tree, and in cases where "all refs recorded as loose
> refs" scheme wouldn't have resulted in problematic conflict, such a
> three-way merge would resolve trivially (you updated refs/heads/ and
> the update by the other process to refs/tags/ would not conflict
> with what you did).  But the same three-way merge scheme can be
> employed with the current flat single packed-refs scheme, can't it?

Yes. (albeit without reusing the machinery we already have for doing
three-way merges)

> Even worse, what is the cost of looking up the value of a single
> branch?  You would need to open a few tree objects and the leaf blob
> that records the object name the ref points at, wouldn't you?

Yes. Probably a showstopper, although single branch/ref lookup might
not be so common on server side, as it is on the user side.

> Right now, such a look-up is either opening a single small file and
> reading the first 41 bytes off of it, and falling back (when the ref
> in question is packed) to read a single packed-refs file and finding
> the ref you want from it.
>
> So...

Yep...


...Johan

-- 
Johan Herland, <johan@xxxxxxxxxxx>
www.herland.net
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]