Re: Why Git is so fast (was: Re: Eric Sink's blog - notes on git, dscms and a "whole product" approach)

On Thu, 30 Apr 2009, Jeff King wrote:
> 
> Like all generalizations, this is only mostly true. Fast network servers
> with big caches can outperform disks for some loads.

That's _very_ few loads.

It doesn't matter how good a server you have, network filesystems 
invariably suck.

Why? It's not that the network or the server sucks - you can easily find 
beefy NAS setups that have big raids etc and are much faster than most 
local disks.

And they _still_ suck.

Simple reason: caching. It's a lot easier to cache local filesystems. Even 
modern networked filesystems (eg NFSv4) that do a pretty good job on a 
file-per-file basis with delegations etc still tend to suck horribly at 
metadata.

In contrast, a workstation with local filesystems and enough memory to 
cache it well will just be a lot nicer.

> So I wouldn't rule out the possibility of a pleasant VCS experience on a
> network-optimized system backed by beefy servers on a local network.

Hey, you can always throw resources at it.

But no:

> I have never used perforce, but I get the impression that it is more 
> optimized for such a situation.

I doubt it. I suspect git will outperform pretty much anything else in 
that kind of situation too.

One thing that git does - and some other VCSs avoid - is to actually 
stat() the whole working tree, so that it doesn't need special per-file 
"I use this file" locking semantics. That can in theory make git slower 
over a network filesystem than such (very broken) alternatives.
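
As a rough illustration of what "stat() the whole working tree" costs, here 
is a minimal Python sketch. It is not git's implementation: the names 
snapshot() and changed_files() are made up for this example, and the cached 
snapshot only stands in for the stat data git keeps in its index. The point 
is that every file costs one metadata lookup, which is cheap from a warm 
local cache and a potential round trip on a network filesystem.

```python
import os

def snapshot(root):
    """Record (size, mtime_ns) for every non-directory entry under root.

    Hypothetical stand-in for cached stat data; one lstat() per file.
    """
    snap = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # the per-file metadata lookup
            snap[os.path.relpath(path, root)] = (st.st_size, st.st_mtime_ns)
    return snap

def changed_files(root, cached):
    """Re-stat the whole tree and compare it with the cached snapshot."""
    current = snapshot(root)
    modified = sorted(p for p in current if p in cached and current[p] != cached[p])
    added = sorted(p for p in current if p not in cached)
    deleted = sorted(p for p in cached if p not in current)
    return modified, added, deleted
```

Nothing here needs the user to announce an edit first: any editor or script 
can touch the files, and the next scan picks the change up.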

If your VCS requires that you mark all files for editing somehow (ie you 
can't just use your favourite editor or scripting to modify files, but 
have to use "p4 edit" to say that you're going to write to the file, and 
the file is otherwise read-only), then such a VCS can - by being annoying 
and in your way - do some things faster than git can.

And yes, perforce does that (the "p4 edit" command is real, and exists).

And yes, in theory that can probably mean that perforce doesn't care so 
much about the metadata caching problem on network filesystems - because 
p4 will maintain some file of its own that contains the metadata.
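
To make the trade-off concrete, here is a small Python sketch of a "mark 
before edit" scheme. It is only an assumption about the general idea and 
says nothing about how perforce actually stores its data: the OpenedList 
class and its methods are hypothetical. Files start out read-only, an 
explicit edit step makes one writable and records it, and "what might have 
changed" becomes a lookup in that record rather than a scan of the tree.

```python
import os
import stat

class OpenedList:
    """Hypothetical bookkeeping for a 'mark before edit' workflow."""

    def __init__(self):
        self.opened = set()  # paths the user has declared as being edited

    def add_readonly(self, path):
        """Put a file under control by making it read-only."""
        os.chmod(path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)

    def edit(self, path):
        """Explicit 'I am going to write to this file' step."""
        mode = stat.S_IMODE(os.stat(path).st_mode)
        os.chmod(path, mode | stat.S_IWUSR)  # owner may now write
        self.opened.add(path)

    def changed_candidates(self):
        """Only declared files can have changed: no tree-wide stat() needed."""
        return sorted(self.opened)
```

The cost moves from the tool to the user: a file modified without going 
through edit() is simply not seen by changed_candidates().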

But I suspect that the git "async stat" ("core.preloadindex") thing means 
that git will kick p4 *ss even on that benchmark, and be a whole lot more 
pleasant to use. Even on networked filesystems.
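
The setting itself can be turned on with "git config core.preloadindex 
true". The idea behind it, sketched below in Python, is to issue the stat 
calls from several threads so that the per-file latency overlaps instead of 
adding up. This is an illustration under assumptions, not git's code: 
stat_one(), preload_stats() and the default worker count are invented for 
the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def stat_one(path):
    """lstat() a single path; return None if the file has vanished."""
    try:
        st = os.lstat(path)
        return (st.st_size, st.st_mtime_ns)
    except FileNotFoundError:
        return None

def preload_stats(paths, workers=8):
    """Run lstat() on many paths from a thread pool.

    With a slow (eg networked) filesystem the calls spend most of their
    time waiting, so running them concurrently hides much of the latency.
    """
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs each path with its result
        return dict(zip(paths, pool.map(stat_one, paths)))
```

On a local filesystem with a warm cache the threads buy little, since each 
call returns almost immediately; the gain shows up when each lookup has to 
wait on a server.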

			Linus