Re: [PATCH 2/2] Implement a simple delta_base cache

On Sat, 17 Mar 2007, Linus Torvalds wrote:

> 
> 
> On Sat, 17 Mar 2007, Nicolas Pitre wrote:
> > 
> > Well... in my opinion it is the _current_ tree walker that is quite ugly 
> > and complex.  It is always messier to parse strings than fixed width 
> > binary fields.
> 
> Sure. On the other hand, text is what made things easy to do initially,

Oh indeed.  No argument there.

> and you're missing one *BIG* clue: you cannot remove the support without 
> losing compatibility with all traditional object formats.
> 
> So you have no choice. You need to support the text representation. As a 
> result, *your* code will now be way more ugly and messy.

Depends. We currently have separate parsers for trees, commits, tags, 
etc.  It should be easy enough to add another (separate) parser for 
new tree objects while still having a common higher level accessor 
interface like tree_entry().

But right now we only regenerate the text representation whenever the 
binary representation is encountered just to make things easy to do, and 
yet we still have a performance gain already in _addition_ to a net 
saving in disk footprint.

> The thing is, parsing some little text may sound expensive, but if the 
> expense is in finding the end of the string, we're doing really well.

Of course the current tree parser will remain, probably forever.  And it 
is always a good thing to optimize it further whenever possible.

> But what you're ignoring here is that "16%" may sound like a huge deal, 
> but it's 16% of something that takes 1 second, and that other SCMs cannot 
> do AT ALL.

Sure.  But at this point the reference to compare GIT performance 
against might be GIT itself.  And while 1 second is really nice in this 
case, there are some repos where it could be (and has already been 
reported to be) much more.

I still have a feeling that we can do even better than we do now.  Much 
much better than 16% actually.  But that requires a new data format that 
is designed for speed.

We'll see.


Nicolas