Re: [PATCH 2/2] Implement a simple delta_base cache

On Sun, 18 Mar 2007, Julian Phillips wrote:
> 
> (This is a rather unrealistic repository consisting of a long series of
> commits of new binary files, but I don't have access to the repository that is
> being approximated until I get back to work on Monday ...)

This is a *horrible* test repo.

Is this actually really trying to approximate anything you work with? If 
so, please check whether you have cyanide or some other effective poison 
to kill all your cow-orkers - it's really doing them a favor - and then do 
the honorable thing yourself? Use something especially painful on whoever 
came up with the idea to track 25000 files in a single directory.

I'll see what the profile is, but even without the repo fully generated 
yet, I can already tell you that you should *not* put tens of thousands of 
files in a single directory like this.

It's usually horribly bad quite independently of any SCM issues 
(ie most filesystems will have some bad performance behaviour with things 
like this - if only because "readdir()" will inevitably be slow).

And for git it means that you lose all ability to efficiently prune away 
the parts of the tree that you don't care about. git will always end up 
working with a full linear file manifest instead of a nice collection of 
recursive trees, and a lot of the nice tree-walking optimizations that git 
has will just end up being no-ops: each tree is always one *huge* 
manifest.

So it's not that git cannot handle it, it's that a lot of the nice things 
that make git really efficient simply won't trigger for your repository.
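
To make the pruning point concrete, here is a rough sketch of the idea, not 
git's actual code. It assumes a toy model where every directory gets an id 
hashed from its entries, so an unchanged subtree can be skipped after one id 
comparison. The names build(), entries_examined(), flat() and nested() are 
made up for the illustration.

```python
import hashlib

def build(tree):
    """Turn a nested dict (dirs) / bytes (files) into (id, children).

    children is None for a file. Ids are computed bottom-up, loosely in the
    spirit of git tree objects (this is a toy model, not git's format).
    """
    if isinstance(tree, bytes):
        return (hashlib.sha1(b"blob\0" + tree).hexdigest(), None)
    kids = {name: build(sub) for name, sub in tree.items()}
    h = hashlib.sha1(b"tree\0")
    for name in sorted(kids):
        h.update(name.encode() + b"\0" + kids[name][0].encode())
    return (h.hexdigest(), kids)

def entries_examined(a, b):
    """Count directory entries a diff has to look at between two snapshots."""
    if a[0] == b[0] or a[1] is None or b[1] is None:
        return 0  # same id (prune the whole subtree) or a plain file
    names = set(a[1]) | set(b[1])
    count = len(names)
    for name in names:
        if name in a[1] and name in b[1]:
            count += entries_examined(a[1][name], b[1][name])
    return count

def flat(n, changed=None):
    """n files in one directory; file number `changed` has new content."""
    return {f"f{i}": (b"new" if i == changed else b"old") for i in range(n)}

def nested(n, per_dir, changed=None):
    """The same n files spread over subdirectories of per_dir files each."""
    tree = {}
    for i in range(n):
        content = b"new" if i == changed else b"old"
        tree.setdefault(f"d{i // per_dir}", {})[f"f{i}"] = content
    return tree

if __name__ == "__main__":
    n = 25000
    print(entries_examined(build(flat(n)), build(flat(n, changed=7))))
    print(entries_examined(build(nested(n, 100)), build(nested(n, 100, changed=7))))
```

With one changed file out of 25000, the flat layout has to look at all 25000 
entries, while the layout with 100 files per directory looks at 350 (250 
top-level entries plus the 100 in the one directory that changed).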

In short: avoiding tens of thousands of files in a single directory is 
*always* a good idea. With or without git.
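
One common way to avoid the huge directory is to fan files out into 
subdirectories keyed on a hash prefix, much like git does for its own loose 
objects (two hex digits of the SHA-1 as the directory name). A minimal 
sketch, assuming the file names themselves carry no useful grouping; 
sharded_path() and its parameters are hypothetical, not an existing tool:

```python
import hashlib
from pathlib import PurePosixPath

def sharded_path(name, levels=1, width=2):
    """Map a file name to <hash prefix>/.../<name>.

    levels and width are illustrative knobs: one level of two hex digits
    gives at most 256 subdirectories.
    """
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*parts, name)

if __name__ == "__main__":
    print(sharded_path("image-00042.bin"))
```

With 25000 files and the defaults this leaves roughly a hundred files per 
directory on average. If the files have a natural grouping (by date, by 
project, whatever), using that instead of a hash is usually nicer for humans.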

(Again, SCMs that are really just "one file at a time", like CVS, 
won't care as much. They never really track all files anyway, so while 
they are limited by potential filesystem performance bottlenecks, they 
won't have the fundamental issue of tracking 25,000 files.)

		Linus