Re: Git and GCC

On 12/6/07, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>
> On Thu, 6 Dec 2007, Daniel Berlin wrote:
> >
> > Actually, it turns out that git-gc --aggressive does this dumb thing
> > to pack files sometimes regardless of whether you converted from an
> > SVN repo or not.
>
> Absolutely. git gc --aggressive is mostly dumb. It's really only useful for
> the case of "I know I have a *really* bad pack, and I want to throw away
> all the bad packing decisions I have done".
>
> To explain this, it's worth explaining (you are probably aware of it, but
> let me go through the basics anyway) how git delta-chains work, and how
> they are so different from most other systems.
>
I worked on Monotone and other systems that use object stores for a
little while :)
In particular, I believe git's original object store was based on
Monotone's, IIRC.

> In other SCMs, a delta-chain is generally fixed. It might be "forwards"
> or "backwards", and it might evolve a bit as you work with the repository,
> but generally it's a chain of changes to a single file represented as some
> kind of single SCM entity. In CVS, it's obviously the *,v file, and a lot
> of other systems do rather similar things.

>
> Git also does delta-chains, but it does them a lot more "loosely". There
> is no fixed entity. Deltas are generated against any random other version
> that git deems to be a good delta candidate (with various fairly
> successful heuristics), and there are absolutely no hard grouping rules.

Sure. SVN actually supports this (surprisingly); it just never happens
to choose delta bases that aren't related by ancestry. (I.e., it would
have absolutely no problem with you using random other parts of the
repository as delta bases, and I've played with that before.)

I actually advocated that we move towards an object store model, as
ancestry can be a crappy way of approximating similarity when you
have a lot of branches.
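
To make the ancestry-versus-similarity point concrete, here is a minimal
Python sketch. It is not git's or SVN's actual heuristic: the object names,
the contents, and the use of difflib as a stand-in for "how small would the
delta be" are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: bytes, b: bytes) -> float:
    """Rough content similarity in [0, 1]; a proxy for delta size."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def pick_delta_base(target: bytes, candidates, window: int = 10):
    """Return the name of the most similar candidate, or None.

    candidates is a list of (name, content) pairs. Only the last
    `window` entries are examined, loosely mirroring the idea of a
    bounded search window.
    """
    best_name, best_score = None, 0.0
    for name, content in candidates[-window:]:
        score = similarity(target, content)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical objects: the ancestor by history is an old stub, while an
# unrelated-by-ancestry copy on a branch is nearly identical to the target.
objects = [
    ("trunk/lexer.c@r1", b"/* stub */\n"),
    ("branches/dev/lexer.c@r8",
     b"static int next_token(char *s) { return *s != 0; }\n"),
    ("trunk/Makefile@r3", b"all:\n\tcc -o demo main.c\n"),
]
target = b"static int next_token(char *s) { return *s != '\\0'; }\n"

best = pick_delta_base(target, objects)
```

Here the selection lands on the branch copy rather than the historical
ancestor, which is the case where picking bases purely by ancestry would
lose out.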

> So the equivalent of "git gc --aggressive" - but done *properly* - is to
> do (overnight) something like
>
>         git repack -a -d --depth=250 --window=250
>
I gave this a try overnight, and it definitely helps a lot.
Thanks!
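
For anyone who wants to measure the effect rather than eyeball it, one
option is to compare the output of `git count-objects -v` before and after
the repack. The sketch below is only one way to do that; the helper names
are made up for this example, and it assumes git is on the PATH.

```python
import subprocess


def parse_count_objects(text: str) -> dict:
    """Parse 'key: value' lines from `git count-objects -v` into ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value.strip())
    return stats


def pack_stats(repo: str = ".") -> dict:
    """Run `git count-objects -v` in repo and return the parsed numbers."""
    out = subprocess.run(
        ["git", "-C", repo, "count-objects", "-v"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_count_objects(out)


def shrink_ratio(before: dict, after: dict) -> float:
    """Packed size after the repack as a fraction of the size before."""
    return after["size-pack"] / before["size-pack"]


# Intended use (not executed here, since it needs a real repository):
#   before = pack_stats("/path/to/repo")
#   ... run the repack overnight ...
#   after = pack_stats("/path/to/repo")
#   print(shrink_ratio(before, after))
```

The `size-pack` field is reported in KiB, so the ratio is unit-free.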

> And then it's going to take forever and a day (ie a "do it overnight"
> thing). But the end result is that everybody downstream from that
> repository will get much better packs, without having to spend any effort
> on it themselves.
>

If your forever and a day is spent figuring out which deltas to use,
you can reduce this significantly.
If it is spent writing out the data, it's much harder. :)