Re: two questions about the format of loose object

Shawn O. Pearce wrote:
> Liu Yubao <yubao.liu@xxxxxxxxx> wrote:
>> In current implementation the loose objects are compressed:
>>
>>      loose object = deflate(typename + <space> + size + '\0' + data)
> ...
>> * Question 1:
>>
>> Why not use the format below for loose object?
>>     loose object = typename + <space> + size + '\0' + deflate(data)
> 
> Historical accident.  We really should have used a format more
> like what you are asking here, because it makes inflation easier.
> The pack file format uses a header structure sort of like this,
> for exactly that reason.  IOW we did learn our mistakes and fix them.
> 
> If you look up the new style loose object code you'll see that it
> has a format like this (sort of), the header is actually the same
> format that is used in the pack files, making it smaller than what
> you propose but also easier to unpack as the code can be reused
> with the pack reading code.
> 
> Unfortunately the new style loose object was phased out; it never
> really took off and it made the code much more complex.  So it was
> pulled in commit 726f852b0ed7e03e88c419a9996c3815911c9db1:
> 

In fact, the format I proposed in my patches is an uncompressed loose
object, not just an uncompressed loose object header; that is to say,
I proposed format 2 from my question 2. In question 1 I was only
curious why the loose object header is compressed.
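
To make the two layouts from question 1 concrete, here is a minimal
Python sketch. It is only an illustration, not git's actual code:
encode_legacy follows the current format quoted above, while
encode_question1 is the hypothetical layout from question 1, which git
does not use. All function names are my own invention.

```python
import hashlib
import zlib


def object_header(obj_type: str, data: bytes) -> bytes:
    """Build 'typename + <space> + size + NUL' for a loose object."""
    return f"{obj_type} {len(data)}".encode("ascii") + b"\0"


def object_id(obj_type: str, data: bytes) -> str:
    """SHA-1 over the uncompressed header plus data."""
    return hashlib.sha1(object_header(obj_type, data) + data).hexdigest()


def encode_legacy(obj_type: str, data: bytes) -> bytes:
    """Current format: deflate(header + data)."""
    return zlib.compress(object_header(obj_type, data) + data)


def decode_legacy(raw: bytes) -> tuple[str, bytes]:
    """The header is only visible after inflating the stream."""
    header, _, data = zlib.decompress(raw).partition(b"\0")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(data):
        raise ValueError("size in header does not match data length")
    return obj_type, data


def encode_question1(obj_type: str, data: bytes) -> bytes:
    """Hypothetical format from question 1: header + deflate(data)."""
    return object_header(obj_type, data) + zlib.compress(data)


def decode_question1(raw: bytes) -> tuple[str, bytes]:
    """Type and size can be read before any inflation happens."""
    header, _, body = raw.partition(b"\0")
    obj_type, size = header.decode("ascii").split(" ")
    data = zlib.decompress(body)
    if int(size) != len(data):
        raise ValueError("size in header does not match data length")
    return obj_type, data
```

With the hypothetical layout a reader learns the type and size without
touching zlib; with the current one it has to inflate at least the
start of the stream first. The object name is the same either way,
because it is computed over the uncompressed header and data.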

I ran a test that adds all the files of git-1.6.1-rc1 with git-add,
and the time spent dropped by half. Other commands such as git diff,
git diff --cached and git diff HEAD~ HEAD should be faster now too,
although the change may not be noticeable for small and medium projects.


>  Author: Nicolas Pitre <nico@xxxxxxx>:
>  >  deprecate the new loose object header format
>  >
>  >  Now that we encourage and actively preserve objects in a packed form
>  >  more agressively than we did at the time the new loose object format and
>  >  core.legacyheaders were introduced, that extra loose object format
>  >  doesn't appear to be worth it anymore.
>  >
>  >  Because the packing of loose objects has to go through the delta match
>  >  loop anyway, and since most of them should end up being deltified in
>  >  most cases, there is really little advantage to have this parallel loose
>  >  object format as the CPU savings it might provide is rather lost in the
>  >  noise in the end.
>  >
>  >  This patch gets rid of core.legacyheaders, preserve the legacy format as
>  >  the only writable loose object format and deprecate the other one to
>  >  keep things simpler.
> 

Thank you for digging it out for me!


Best regards,

Liu Yubao