Re: Benchmarks regarding git's gc

On Tue, Nov 08, 2011 at 10:40:15AM -0600, Brandon Casey wrote:

> On Tue, Nov 8, 2011 at 5:34 AM, Felipe Contreras
> <felipe.contreras@xxxxxxxxx> wrote:
> > Has anybody seen these?
> > http://draketo.de/proj/hg-vs-git-server/test-results.html#results
> >
> > Seems like a potential area of improvement.
> 
> I think this is a case of designing the problem space so that your
> intended winner wins and your intended loser loses.

Sort of. It is a real problem space, and mercurial does have some
advantage in that area.

His problem definition is that of a git-backed server database that is
under constant load creating new commits. So imagine wikipedia backed by
git.

Mercurial's strategy (as I understand it) is to always calculate and
store deltas as new commits are created. Git's strategy is to store full
objects, and then worry about deltification later. So of course git is
going to do more work, and especially more I/O.
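
To put a rough shape on that, here is a toy sketch in Python. The
loose-object encoding (a "blob <size>\0" header, SHA-1 for the id, zlib
for the bytes written) is how git stores a blob; everything else is my
own assumption for illustration: the made-up 200-line page, one edited
line per revision, and using the size of the edited line as a crude
lower bound on what a delta would need to carry. It does not model
mercurial's actual storage format.

```python
import hashlib
import zlib


def loose_object(content):
    """Return (object id, compressed bytes) for a blob stored as a loose object."""
    store = b"blob %d\x00" % len(content) + content
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)


def make_page(lines=200):
    """Build a deterministic, poorly compressible stand-in for a wiki page."""
    return [hashlib.sha256(str(i).encode()).hexdigest().encode()
            for i in range(lines)]


def simulate(revisions=50):
    """Edit one line per revision; compare full-object bytes with edited bytes."""
    page = make_page()
    full_bytes = 0    # what gets written if every revision is a full object
    edited_bytes = 0  # hypothetical lower bound for a delta-based store
    for rev in range(revisions):
        line = rev % len(page)
        page[line] = hashlib.sha256(b"edit %d" % rev).hexdigest().encode()
        _, compressed = loose_object(b"\n".join(page) + b"\n")
        full_bytes += len(compressed)
        edited_bytes += len(page[line])
    return full_bytes, edited_bytes


if __name__ == "__main__":
    full, edited = simulate()
    print("bytes written as full objects:", full)
    print("bytes that actually changed:  ", edited)
```

The absolute numbers mean nothing, but the gap between the two totals
is the extra I/O that gets deferred until the objects are repacked.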

Git's strategy is fine for the workload for which it was designed:
people making commits in bursts, and occasionally doing book-keeping to
make things smaller.

But for a constant-commit workflow, the burstiness is annoying, and the
amount of I/O can be cumbersome.  We realized this long ago when
importing old histories into git. And that's why fast-import was born:
it does at least a minimal level of delta compression and puts
everything into a single packfile, instead of writing out loose objects.
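
For anyone who has not looked at it, a fast-import input stream is
plain text and easy to generate. Below is a minimal sketch in Python
that builds one for a series of revisions of a single file. The
commands it emits (commit, mark, committer, data, from, and an inline
"M" filemodify) are from the git-fast-import input format; the ref
name, file path, committer identity, timestamps and commit messages
are placeholders I made up.

```python
def data_block(payload):
    """Format a fast-import 'data' command with an exact byte count."""
    return b"data %d\n" % len(payload) + payload + b"\n"


def build_stream(revisions,
                 ref=b"refs/heads/master",                # placeholder ref
                 path=b"page.txt",                        # placeholder path
                 committer=b"Wiki Bot <bot@example.com>",  # placeholder identity
                 start_time=1320770415):                  # arbitrary timestamp
    """Return a fast-import stream with one commit per revision of `path`."""
    out = []
    for n, content in enumerate(revisions, start=1):
        out.append(b"commit " + ref + b"\n")
        out.append(b"mark :%d\n" % n)
        out.append(b"committer %s %d +0000\n" % (committer, start_time + n))
        out.append(data_block(b"revision %d" % n))
        if n > 1:
            # Chain onto the previous commit through its mark.
            out.append(b"from :%d\n" % (n - 1))
        out.append(b"M 100644 inline " + path + b"\n")
        out.append(data_block(content))
    return b"".join(out)


if __name__ == "__main__":
    import sys
    sys.stdout.buffer.write(build_stream([b"first draft\n", b"second draft\n"]))
```

Piping the output of that script into "git fast-import" inside a
freshly initialized repository should give you the commits in a pack
rather than as loose objects. I have not tuned or measured this; it is
only meant to show how little is involved.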

If you were writing commits at some fast constant rate into your
repository, then you'd probably want to do the same thing. And it would
be fairly easy to do on top of git's object model. At best, it's just a
specialized commit command (like fast-import), and at worst it's
probably a more incremental object store.

So he may have a point that mercurial might perform better than git on
some metrics in the current state. But I think a lot of that is because
nobody has bothered putting git into this situation and doing the
tweaks needed to make it fast. You can argue that git sucks because it
needs tweaking, of course, but if I were picking between the two systems
to implement something like this, I'd consider picking git and doing the
tweaks (of course, I'm far from unbiased).

-Peff