Re: [PATCH] Adding a cache of commit to patch-id pairs to speed up git-cherry

Hi,

On Sat, 7 Jun 2008, Geoffrey Irving wrote:

> On Mon, Jun 2, 2008 at 11:15 AM, Johannes Schindelin
> <Johannes.Schindelin@xxxxxx> wrote:
>
> > On Mon, 2 Jun 2008, Geoffrey Irving wrote:
> >
> >> On Mon, Jun 2, 2008 at 9:18 AM, Johannes Schindelin
> >> <Johannes.Schindelin@xxxxxx> wrote:
> >>
> >> > On Mon, 2 Jun 2008, Geoffrey Irving wrote:
> >> >
> >> >> On Mon, Jun 2, 2008 at 8:37 AM, Johannes Schindelin
> >> >> <Johannes.Schindelin@xxxxxx> wrote:
> >> >>
> >> >> > Another issue that just hit me: this cache is append-only, so if 
> >> >> > it grows too large, you have no other option than to scratch and 
> >> >> > recreate it. Maybe this needs porcelain support, too?  (git gc?)
> >> >>
> >> >> If so, the correct operation is to go through the hash and remove 
> >> >> entries that refer to commits that no longer exist.  I can add 
> >> >> this if you want.  Hopefully somewhere along the way git-gc 
> >> >> constructs an easy to traverse list of extant commits, and this 
> >> >> will be straightforward.
> >> >
> >> > I don't know... if you have created a cached patch-id for every 
> >> > commit (by mistake, for example) and do not need it anymore, it 
> >> > might make git-cherry substantially faster to just scrap the cache.
> >>
> >> Well, ideally hash maps are O(1), but it could be a difference 
> >> between a "compare 40 bytes" constant and a "read a 4k block into 
> >> memory" constant, so in practice yes.  Scrapping it entirely will 
> >> also make the implementation much simpler.
> >>
> >> It seems a little sad to wipe all that effort each time, but 
> >> regenerating the cache is likely to be less expensive than a git-gc, 
> >> so it shouldn't change any amortized complexities.
> >
> > Well, how about only scrapping the cache if it is older than, say, 2 
> > weeks, and is larger than, say, 200kB?  That should help.
> 
> That heuristic is insufficient, since it doesn't do anything in the 
> normal case where a new entry appears every few days (e.g., when syncing 
> between two branches with cherry-pick).

Right, it is insufficient in such a case, but then, it does not really 
matter, methinks.  The cache is small enough anyway, and I think that many 
people will not really use it as much as you do.
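
For concreteness, here is a minimal sketch of the heuristic under discussion. It is not code from the patch: the function name, the thresholds as macros, and the reading of "older" as "time since the cache was last written" are all assumptions made for illustration. Under that reading it shows why the check never triggers when a new entry lands every few days.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical thresholds, taken from the numbers floated in this thread. */
#define CACHE_MAX_AGE_SECONDS (14.0 * 24 * 60 * 60) /* 2 weeks */
#define CACHE_MAX_SIZE_BYTES  ((size_t)200 * 1024)  /* 200kB */

/*
 * Return 1 if the cache should be scrapped, 0 otherwise.
 *
 * Assumption: "age" is measured from the last write to the cache file,
 * so every appended entry resets the clock.  Both conditions must hold.
 */
int should_scrap_cache(time_t now, time_t last_written, size_t cache_size)
{
	double age = difftime(now, last_written);

	return age > CACHE_MAX_AGE_SECONDS &&
	       cache_size > CACHE_MAX_SIZE_BYTES;
}
```

With these assumptions, a 300kB cache last written three days ago is kept, while the same cache left untouched for fifteen days is scrapped.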

However, I realized one very real issue with your patch: you do not 
provide a way to _disable_ the caching.  I think at least a config 
variable is needed, and while at it, a fallback when you cannot write to 
the repository.
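
Roughly what I have in mind, as a sketch only: the enum, the function name and the "enabled" flag (standing in for whatever config variable gets chosen) are hypothetical, and plain stdio is used here instead of git's own helpers.

```c
#include <stdio.h>

/* Hypothetical modes for the patch-id cache. */
enum cache_mode {
	CACHE_OFF,        /* compute patch-ids every time, touch nothing */
	CACHE_READ_ONLY,  /* use existing entries, do not append new ones */
	CACHE_READ_WRITE  /* normal operation */
};

/*
 * Decide how to use the cache file at "path".
 * "enabled" stands in for a boolean config variable (name not decided).
 */
enum cache_mode choose_cache_mode(int enabled, const char *path)
{
	FILE *f;

	if (!enabled)
		return CACHE_OFF;

	/* Append mode creates the file if missing and never truncates it. */
	f = fopen(path, "ab");
	if (f) {
		fclose(f);
		return CACHE_READ_WRITE;
	}

	/* Cannot write: fall back to reading an existing cache, if any. */
	f = fopen(path, "rb");
	if (f) {
		fclose(f);
		return CACHE_READ_ONLY;
	}

	return CACHE_OFF;
}
```

The point is only that a read-only repository should degrade to the uncached behaviour instead of failing.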

Ciao,
Dscho

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
