Re: [PATCH 00/15] git-note: A mechanisim for providing free-form after-the-fact annotations on commits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, May 29, 2007 at 09:06:29AM +0200, Johan Herland wrote:

> 1. Keep a file, ".git/reverse_tagmap_sorted" with one entry of the form
> "pointee pointer" per line. The file is sorted on "pointee", so we can
> easily do the radix-256-fan-out-followed-by-binary-search trick that
> Linus mentioned in another thread. This should hopefully make lookup
> fairly cheap. BTW, if there is a similar "pointee pointer"-type format
> already being used in git, I'd be happy to use that instead. I looked

I did a similar thing (though rather than having "lines" at all, they
were fixed-length pairs of binary sha1 hashes) to implement a negative
delta cache (which turned out to be a stupid idea, but the
implementation worked):

http://www.gelato.unsw.edu.au/archives/git/0606/23229.html

> 2. Keep another file, ".git/reverse_tagmap_unsorted" in front of (1).
> This file has exactly the same format, minus the sorting. It exists just
> to make insertion cheap. Once this file reaches a certain size (i.e.
> when trawling it on lookup becomes slightly painful), we shuffle the
> entries into the sorted file (this would happen automatically on
> insertion of an entry, and should _not_ have to be triggered by 'git-gc'
> etc.).

The implementation I mentioned above collects several "to be inserted"
entries (in memory) and then merge sorts the two lists into a new file.
So it was fast in terms of comparisons, but it involved writing O(n)
entries, which is probably bad for creating a single note.

> Of course, if we think insertion directly into (1) will never be too
> expensive, we can drop (2) altogether.

You can always find the right spot, memmove everything down one slot,
and insert. But that means:
  - the cost of insertion will be proportional to the number of items,
    whereas using an unsorted journal you get to amortize that cost over
    many insertions
  - if you update in place, you have to lock the db for reading while
    you are moving around. If you are always either appending to the
    journal or merging into a new file, you can avoid this.

-Peff
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux