On Mon, May 20, 2013 at 6:37 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Johan Herland <johan@xxxxxxxxxxx> writes:
>> For server-class installations we need ref storage that can be read
>> (and updated?) atomically, and the current system of loose + packed
>> files won't work since reading (and updating) more than a single file
>> is not an atomic operation. Trivially, one could resolve this by
>> dropping loose refs, and always using a single packed-refs file, but
>> that would make it prohibitively expensive to update refs (the entire
>> packed-refs file must be rewritten for every update).
>>
>> Now, observe that we don't have these race conditions in the object
>> database, because it is an add-only immutable data store.
>>
>> What if we stored the refs as a tree object in the object database,
>> referenced by a single (loose) ref?
>
> What is the cost of updating a single branch with that scheme?
>
> Doesn't it end up recording roughly the same amount of information
> as updating a single packed-refs file that is flat, but with the
> need to open a few tree objects (top-level, refs/, and refs/heads/),
> writing out a blob that stores the object name at the tip, computing
> the updated trees (refs/heads/, refs/ and top-level), and then
> finally doing the compare-and-swap of that single loose ref?

Yes, except that when you update packed-refs, you have to write out
the _entire_ file, whereas with this scheme you only have to write out
the part of the refs hierarchy you actually touched (e.g. rewriting
refs/heads/foo would not have to write out anything inside
refs/tags/*). If you have a small number of branches, and a huge
number of tags, this scheme might end up being cheaper than writing
the entire packed-refs. But in general, it's probably much more
expensive to go via the odb.

> You may guarantee atomicity but it is the same granularity of
> atomicity as a single packed-refs file.

Yes, as I argued elsewhere in this thread: It seems that _any_
filesystem-based solution must resort to having all updates depend on
a single file in order to guarantee atomicity.

> When you are updating a
> branch while somebody else is updating a tag, of course you do not
> have to look at refs/tags/ in your operation and you can write out
> your final packed-refs equivalent tree to the object database
> without racing with the other process, but the top-level you come up
> with and the top-level the other process comes up with (which has
> an up-to-date refs/tags/ part, but has a stale refs/heads/ part from
> your point of view) have to race to update that single loose ref,
> and one of you have to back out.

True.

> That "backing out" can be made more intelligently than just dying
> with "compare and swap failed--please retry" message, e.g. you at
> that point notice what you started with, what the other party did
> while you were working on (i.e. updating refs/tags/), and three-way
> merge the refs tree, and in cases where "all refs recorded as loose
> refs" scheme wouldn't have resulted in problematic conflict, such a
> three-way merge would resolve trivially (you updated refs/heads/ and
> the update by the other process to refs/tags/ would not conflict
> with what you did). But the same three-way merge scheme can be
> employed with the current flat single packed-refs scheme, can't it?

Yes. (albeit without reusing the machinery we already have for doing
three-way merges)

> Even worse, what is the cost of looking up the value of a single
> branch? You would need to open a few tree objects and the leaf blob
> that records the object name the ref points at, wouldn't you?

Yes. Probably a showstopper, although single branch/ref lookup might
not be as common on the server side as it is on the user side.

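To make the "back out, three-way merge, retry" idea discussed above a
bit more concrete, here is a rough Python sketch. It is not git code:
RefStore, merge_refs and update_ref are made-up names, the refs are
modelled as a flat in-memory dict (closer to a flat packed-refs than
to a tree in the odb), and a lock stands in for whatever atomic rename
or compare-and-swap the real storage would provide.

```python
import threading


class RefStore:
    """Hypothetical stand-in for the single file/ref every update must swap."""

    def __init__(self, refs=None):
        self._lock = threading.Lock()
        self._snapshot = dict(refs or {})

    def read(self):
        with self._lock:
            return dict(self._snapshot)

    def compare_and_swap(self, expected, new):
        """Install `new` only if the store still equals `expected`."""
        with self._lock:
            if self._snapshot != expected:
                return False
            self._snapshot = dict(new)
            return True


def merge_refs(base, ours, theirs):
    """Three-way merge of ref maps (name -> object name).

    Raises ValueError when both sides changed the same ref differently.
    """
    merged = {}
    for name in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t or t == b:
            value = o  # both sides agree, or only we touched this ref
        elif o == b:
            value = t  # only the other party touched this ref
        else:
            raise ValueError(f"conflicting updates to {name}")
        if value is not None:  # None means the ref was deleted
            merged[name] = value
    return merged


def update_ref(store, name, new_value, max_retries=5):
    """Update one ref, merging instead of dying when the swap loses a race."""
    base = store.read()
    ours = {**base, name: new_value}
    expected, candidate = base, ours
    for _ in range(max_retries):
        if store.compare_and_swap(expected, candidate):
            return candidate
        # Lost the race: re-read and replay our change on top of theirs.
        expected = store.read()
        candidate = merge_refs(base, ours, expected)
    raise RuntimeError("compare and swap failed--please retry")
```

With this, a branch update that races with somebody else's tag update
merges trivially, while two different updates to the same branch make
merge_refs raise, which is the case where backing out is unavoidable.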
> Right now, such a look-up is either opening a single small file and
> reading the first 41 bytes off of it, and falling back (when the ref
> in question is packed) to read a single packed-refs file and finding
> the ref you want from it.
>
> So...

Yep...


...Johan

-- 
Johan Herland, <johan@xxxxxxxxxxx>
www.herland.net