Re: Why are ref_lists sorted?

Julian Phillips <julian@xxxxxxxxxxxxxxxxx> wrote:
> A bit of investigation showed this to be due to the first attempt to read 
> a ref causing the packed refs to be loaded.  In my test repo the 
> packed-refs file has over 9000 entries, but I still thought that it would 
> load faster than that.  It turns out that the overhead is from sorting the 
> refs when building the ref_list.  If I remove the code for sorting the 
> entries I lose that initial 1s delay, without appearing to break anything 
> in the fetch.  However I assume that it's there for a reason ...
> 
> So my questions are:
> 
> 1) what have I broken by removing the sort?
> 2) is it worth trying to optimise the sort?

Oh gawd, that thing is an O(n**2) insertion.  Ouch.

The entire reason it's sorted is just to allow us to find the
existing ref entry, if one exists, and replace it.  This way
a loose (unpacked) ref will shadow/override its packed version.
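
To make the cost concrete, here is a minimal sketch of sorted insertion
with replace-on-match. It is not git's actual C code: it assumes a singly
linked list walked from the head, and RefEntry, add_ref, load and to_list
are hypothetical names chosen for illustration. When the input is already
sorted, each insert walks the whole list, which is where the quadratic
behaviour comes from.

```python
class RefEntry:
    """One node of a hypothetical sorted, singly linked ref list."""

    def __init__(self, name, sha1, next_entry=None):
        self.name = name
        self.sha1 = sha1
        self.next = next_entry


def add_ref(head, name, sha1):
    """Insert (name, sha1) in name order, replacing an entry with the same name.

    Returns (new_head, steps), where steps counts the nodes walked past.
    """
    steps = 0
    prev = None
    node = head
    while node is not None and node.name < name:
        prev, node = node, node.next
        steps += 1
    if node is not None and node.name == name:
        node.sha1 = sha1  # existing entry found: the later value shadows it
        return head, steps
    entry = RefEntry(name, sha1, node)
    if prev is None:
        return entry, steps  # new entry becomes the head
    prev.next = entry
    return head, steps


def load(refs):
    """Build a list from (name, sha1) pairs and total the nodes walked."""
    head, total = None, 0
    for name, sha1 in refs:
        head, steps = add_ref(head, name, sha1)
        total += steps
    return head, total


def to_list(head):
    """Flatten the linked list into [(name, sha1), ...] for inspection."""
    out = []
    while head is not None:
        out.append((head.name, head.sha1))
        head = head.next
    return out
```

Loading n names that are already in order walks 0 + 1 + ... + (n - 1)
nodes, i.e. n(n - 1)/2 steps in this sketch.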

I think the other reason is to provide predictable behavior.
By making the list sorted, refs always appear at the same relative
position to other refs within that repository.  This makes it easier
to write the packed-refs file and yet keep the ordering within the
file predictable.

I think this could probably be better done as a hash, not as a
linked list, and that only the packed-refs writer needs to pay
any sort of sorting costs.
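
A rough sketch of that idea, with a Python dict standing in for the hash
table. The names merge_refs and format_packed_refs are hypothetical, and
the output is a simplified "<sha1> <refname>" line per ref (no header or
peeled-tag lines), so treat it as an illustration under those assumptions
rather than a description of git's code.

```python
def merge_refs(packed, loose):
    """Merge (name, sha1) pairs into a dict keyed by ref name.

    Loose refs are applied last, so a loose ref shadows the packed
    entry with the same name. Each insert or lookup is O(1) on average,
    and no ordering is maintained while loading.
    """
    refs = {}
    for name, sha1 in packed:
        refs[name] = sha1
    for name, sha1 in loose:
        refs[name] = sha1  # loose overrides packed
    return refs


def format_packed_refs(refs):
    """Render refs in name order; only the writer pays the O(n log n) sort."""
    return "".join(
        "%s %s\n" % (sha1, name) for name, sha1 in sorted(refs.items())
    )
```

Readers that only need to resolve a single ref never trigger the sort;
the ordering is produced once, at the point where the file is written.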

-- 
Shawn.