Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx>
---
I wanted to write a more detailed, per-command description. It would
serve as guidance for people with special repos, and also as a TODO
list for Git developers. But analyzing each command seems like a lot
of work. Instead I wrote this text to warn users where performance
may decrease, and to point them to features that might help.

Did I miss anything? There were discussions in the past about keeping
large files out-of-repo, with symlinks to them in-repo. That sounds
like good advice, doesn't it?

 Documentation/git.txt |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index dd57bdc..8408923 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -729,6 +729,52 @@ The index is also capable of storing multiple entries (called "stages") for
 a given pathname. These stages are used to hold the various unmerged version
 of a file when a merge is in progress.
 
+Performance concerns
+--------------------
+
+Git is written with performance in mind and it works extremely well
+with its typical repositories (i.e. source code repositories, with
+a moderate number of small text files, possibly with long history).
+Non-typical repositories (a huge number of files, or very large
+files...) may experience performance degradation. This section describes
+how Git behaves in such repositories and how to reduce the impact.
+
+For repositories with really long history, you may want to work on
+a shallow clone (see linkgit:git-clone[1], option '--depth').
+A shallow repository does not contain full history, so it may consume
+less disk space and network bandwidth. On the other hand, you cannot
+fetch from it. And obviously you cannot look further back than what
+it has in history (you can deepen history though).
+
+For repositories with a large number of files, where you only need
+a few of them present in the working tree, you can use sparse checkout
+(see linkgit:git-read-tree[1], section 'Sparse checkout'). Sparse
+checkout can be used with either a normal repository or a shallow
+one.
+
+Git uses lstat(3) to detect changes in the working tree. A huge number
+of lstat(3) calls may impact performance, especially on systems with
+slow lstat(3). In some cases you can reduce the number of lstat(3)
+calls by specifying which directories you are interested in, so no
+lstat(3) calls outside them are needed.
+
+For repositories with a large number of files, all of which must be
+present in the working tree, but where you know in advance that only
+a few may be modified, consider using the assume-unchanged bit (see
+linkgit:git-update-index[1]). This helps reduce the number of lstat(3)
+calls.
+
+Some Git commands need the entire file content in memory to process.
+You may want to avoid using them on large files if possible. Those
+commands include:
+
+* All checkout commands (linkgit:git-checkout[1],
+  linkgit:git-checkout-index[1], linkgit:git-read-tree[1],
+  linkgit:git-clone[1]...)
+* All diff-related commands (linkgit:git-diff[1],
+  linkgit:git-log[1] with diff, linkgit:git-show[1] on commits...)
+* All commands that need file conversion processing
+
 Authors
 -------
 * git's founding father is Linus Torvalds <torvalds@xxxxxxxx>.
-- 
1.7.0.2.445.gcbdb3
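
To make the lstat(3) paragraph in the patch above a bit more concrete,
here is a rough Python sketch. It is not how Git is implemented: it only
models the cost as one lstat call per file under the scanned directory.
The helper names count_lstat_calls and make_tree, the limit_to parameter,
and the directory layout are all made up for illustration. Git itself
works from its index rather than from a plain directory walk.

```python
import os
import tempfile


def count_lstat_calls(root, limit_to=None):
    """Walk the tree under root, lstat every file, and return the call count.

    limit_to is a hypothetical stand-in for naming a directory on the
    command line: when given, only that subdirectory of root is scanned.
    """
    start = os.path.join(root, limit_to) if limit_to else root
    calls = 0
    for dirpath, _dirnames, filenames in os.walk(start):
        for name in filenames:
            os.lstat(os.path.join(dirpath, name))  # one lstat call per file
            calls += 1
    return calls


def make_tree(root, dirs=10, files_per_dir=50):
    """Create dirs subdirectories (dir0, dir1, ...) with small files in each."""
    for d in range(dirs):
        path = os.path.join(root, f"dir{d}")
        os.makedirs(path)
        for f in range(files_per_dir):
            with open(os.path.join(path, f"file{f}.txt"), "w") as fh:
                fh.write("x")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_tree(root)
        print("whole tree:", count_lstat_calls(root))
        print("dir0 only: ", count_lstat_calls(root, "dir0"))
```

With the default layout this scans 500 files for the whole tree and 50
when limited to dir0, which is the kind of saving the paragraph has in
mind when it suggests naming the directories you care about.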