Re: [PATCH] Adding a cache of commit to patch-id pairs to speed up git-cherry

On Mon, Jun 2, 2008 at 11:15 AM, Johannes Schindelin
<Johannes.Schindelin@xxxxxx> wrote:
> Hi,
>
> On Mon, 2 Jun 2008, Geoffrey Irving wrote:
>
>> On Mon, Jun 2, 2008 at 9:18 AM, Johannes Schindelin
>> <Johannes.Schindelin@xxxxxx> wrote:
>>
>> > On Mon, 2 Jun 2008, Geoffrey Irving wrote:
>> >
>> >> On Mon, Jun 2, 2008 at 8:37 AM, Johannes Schindelin
>> >> <Johannes.Schindelin@xxxxxx> wrote:
>> >>
>> >> > Another issue that just hit me: this cache is append-only, so if it
>> >> > grows too large, you have no other option than to scratch and
>> >> > recreate it. Maybe this needs porcelain support, too?  (git gc?)
>> >>
>> >> If so, the correct operation is to go through the hash and remove
>> >> entries that refer to commits that no longer exist.  I can add this
>> >> if you want.  Hopefully somewhere along the way git-gc constructs an
>> >> easy to traverse list of extant commits, and this will be
>> >> straightforward.
>> >
>> > I don't know... if you have created a cached patch-id for every commit
>> > (by mistake, for example) and do not need it anymore, it might make
>> > git-cherry substantially faster to just scrap the cache.
>>
>> Well, ideally hash maps are O(1), but it could be a difference between a
>> "compare 40 bytes" constant and a "read a 4k block into memory"
>> constant, so in practice yes.  Scrapping it entirely will also make the
>> implementation much simpler.
>>
>> It seems a little sad to wipe all that effort each time, but
>> regenerating the cache is likely to be less expensive than a git-gc, so
>> it shouldn't change any amortized complexities.
>
> Well, how about only scrapping the cache if it is older than, say, 2
> weeks, and is larger than, say, 200kB?  That should help.

That heuristic is insufficient: the age test never fires in the
normal case where a new entry appears every few days (e.g., when
syncing between two branches with cherry-pick), so the cache is
never scrapped no matter how large it gets.
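
To make the objection concrete, here is a rough sketch of the proposed
heuristic. It is not code from the patch: the function names are
hypothetical, "older than 2 weeks" is assumed to mean time since the
cache file was last modified, and "200kB" is read as 200,000 bytes.

```python
# Hypothetical sketch of the "older than 2 weeks and larger than 200kB"
# heuristic. Names and units are assumptions, not taken from the patch.
DAY = 24 * 60 * 60              # seconds
TWO_WEEKS = 14 * DAY
SIZE_LIMIT = 200 * 1000         # "200kB", read here as 200,000 bytes


def should_scrap(now, last_modified, size_bytes):
    """Scrap only if the cache is both stale and large."""
    return (now - last_modified) > TWO_WEEKS and size_bytes > SIZE_LIMIT


def simulate(days, append_every_days, entry_bytes, start_size=0):
    """Append one entry every few days; report whether a scrap ever fires."""
    size = start_size
    last_modified = 0
    for d in range(days + 1):
        now = d * DAY
        if should_scrap(now, last_modified, size):
            return True
        if d % append_every_days == 0:
            # Appending an entry refreshes the modification time.
            size += entry_bytes
            last_modified = now
    return False
```

With an append every 3 days, simulate(3650, 3, 40, start_size=10**6)
returns False even though the file starts well over the size limit,
because the age condition is reset by each append.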

I don't know what the best alternative is, so I left garbage
collection out of the patch I just submitted.  We can add it once we
decide what to do.  I'm not sure it's a serious problem: even if you
"accidentally" added entries for all commits in the git tree, the file
would still be under 1 MB.
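
For a sense of scale, a back-of-the-envelope sketch.  It assumes each
entry is just a 20-byte commit SHA-1 followed by a 20-byte patch-id
and ignores any empty slots or header in the hash file, so the real
on-disk layout in the patch may differ.

```python
# Rough capacity estimate; the 40-byte entry layout is an assumption.
SHA1_BYTES = 20                      # raw SHA-1 size
ENTRY_BYTES = 2 * SHA1_BYTES         # commit id + patch-id
ONE_MIB = 1024 * 1024


def entries_that_fit(file_bytes, entry_bytes=ENTRY_BYTES):
    """Number of whole entries that fit in a file of the given size."""
    return file_bytes // entry_bytes


print(entries_that_fit(ONE_MIB))     # 26214
```

So under these assumptions a file under 1 MB holds at most roughly
26,000 commit/patch-id pairs.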

Geoffrey