Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking

Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:

> My idea with that was that you already have a SHA-1 map in the pack index,
> and if all you want is to be able to accelerate the revision walker, you'd
> probably need something that adds yet another mapping, from commit to
> parents and tree, and from tree to sub-tree and blob (so you can avoid
> unpacking commit and tree objects).
>
> I just thought that it could be more efficient to do it at the time the 
> pack index is written _anyway_, as nothing will change in the pack after 
> that anyway.

After reading version 2 of the "documentation" patch and commenting
heavily on it, I partly share your feeling.  The codepath to pack
objects is _one of the places_ where you can generate rev-cache and slice
information without redoing a lot of work that has already been done
anyway.
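
To make the quoted idea concrete, here is a minimal sketch of a
commit-to-(tree, parents) side table and a walker that consults only that
table, so no commit object has to be unpacked.  Everything in it is
hypothetical: the names COMMIT_MAP and walk, the short object names, and
the in-memory dict are illustration only, not the rev-cache format from
the patch series.

```python
from collections import deque

# Hypothetical side table: commit id -> (tree id, list of parent ids).
# Real object names are 40-hex SHA-1s; short strings keep the sketch readable.
COMMIT_MAP = {
    "c3": ("t3", ["c2"]),
    "c2": ("t2", ["c1", "c1b"]),   # a merge with two parents
    "c1": ("t1", ["c0"]),
    "c1b": ("t1b", ["c0"]),
    "c0": ("t0", []),              # root commit
}


def walk(tips, commit_map):
    """Return commits reachable from tips, breadth-first, using only the table."""
    seen = set()
    order = []
    queue = deque(tips)
    while queue:
        commit = queue.popleft()
        if commit in seen:
            continue
        seen.add(commit)
        order.append(commit)
        # The lookup replaces "unpack the commit object and parse its header".
        _tree, parents = commit_map[commit]
        queue.extend(parents)
    return order


if __name__ == "__main__":
    print(walk(["c3"], COMMIT_MAP))
```

A real walker would also order by commit date and honor pathspecs; the
sketch only shows that reachability can be answered from the mapping alone.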

But

 - You can write that information separately out to a different file.
   Logically it does not have to be _in_ the same pack idx file; and

 - You may want to generate rev-cache information even if you do not pack
   the repository.  They may practically go hand-in-hand, but logically
   they are orthogonal.

And I am not sure if it is easy to retrofit the "rev-list | pack-objects"
code to additionally produce this information, while keeping the
standalone version of rev-cache generation.

Having said all that.

I haven't read the side of the patch that _uses_ the information stored in
the rev-cache to figure out what it optimizes and what its limitations are
(e.g. how it interacts with pathspecs).  Perhaps the rev-cache may turn
out to be useful _only_ for pack-objects and nothing else, in which case
we may not care about a standalone rev-cache generator after all.

If that is the case, I think it is also a reasonable implementation if the
rev-cache is generated only by the "rev-list | pack-objects" codepath as a
side effect of the traversal it already does, and it might even make sense
to introduce version 3 of the pack idx format that lets you record
additional information, like you suggest.  I am not ready to make that
judgement as I haven't read the rest, but my gut feeling tells me that you
might be right.