Re: many files, simple history

On Fri, May 14, 2010 at 00:05, Jeff King <peff@xxxxxxxx> wrote:
> On Thu, May 13, 2010 at 10:57:22PM -0400, Ali Tofigh wrote:
>
>> short version: will git handle large number of files efficiently if
>> the history is simple and linear, i.e., without merges?
>
> Short answer: a large number of files, yes; large files, not really.
> The shape of history is largely irrelevant.

Thank you for the explanation. I will start using git to manage my
installed programs and will try to report back to this list on my
experience.

/ali

>
> Longer answer:
>
> Git separates the conceptual structure of history (the digraph of
> commits, and the pointers of commits to trees to blobs) from the actual
> storage of objects representing that history. Problems with large files
> are usually storage issues. Copying them around in packfiles is
> expensive, storing an extra copy in the repo is expensive, trying deltas
> and diffs is expensive. None of those things has to do with the shape of
> your history. So I would expect git to handle such a load with a linear
> history about as well as a complex history with merges.
>
> For large numbers of files, git generally does a good job, especially if
> those files are distributed throughout a directory hierarchy. But keep
> in mind that the git repo will store another copy of every file. They
> will be delta-compressed between versions, and zlib compressed overall,
> but you may end up doubling the amount of disk space required if
> you have a lot of incompressible binary files.
>
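
A rough way to see the compression point above. This is only a sketch:
gzip stands in for git's zlib step (both use DEFLATE), git itself is not
involved, and the 1 MB sample size is arbitrary.

```shell
# gzip stands in for git's zlib step here; git itself is not involved.
size=1000000  # arbitrary sample size in bytes

# Random bytes behave like already-compressed binaries: almost no savings.
random_out=$(head -c "$size" /dev/urandom | gzip -c | wc -c | tr -d ' ')

# Highly repetitive data shrinks to a tiny fraction of its size.
zeros_out=$(head -c "$size" /dev/zero | gzip -c | wc -c | tr -d ' ')

echo "random: $size -> $random_out bytes"
echo "zeros:  $size -> $zeros_out bytes"
```

If most of a tree looks like the first case, the stored copy will be
about as large as the original files.
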
> For large files, git expects to be able to pull each file into memory.
> Sometimes two versions if you are doing a diff. And it will copy those
> files around when repacking (which you will want to do for the sake of
> the smaller files). So files on the order of a few megabytes are not a
> problem. If you have files in the hundreds of megabytes or gigabytes,
> expect some operations to be slow (like repacking).
>
> Really, I would start by just "git add"-ing your whole filesystem, doing
> a "git repack -ad", and seeing how long it takes, and what the resulting
> size is.
>
> -Peff
>
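
A minimal sketch of the experiment suggested above, run against a small
throwaway tree rather than a real filesystem. The temporary directory,
the sample files and the demo identity are all made up for illustration.
The commit is my addition, so that every object is reachable from a ref
before repacking.

```shell
set -eu

# Hypothetical stand-in for the real tree; point src at the directory
# you actually want to track instead.
src=$(mktemp -d)
mkdir -p "$src/bin" "$src/share/doc"
printf 'hello\n' > "$src/share/doc/README"
head -c 200000 /dev/urandom > "$src/bin/tool"

cd "$src"
git init -q
git add .
# Not part of the quoted suggestion: commit so the objects hang off a ref.
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "initial import"

# Time the repack with whole-second resolution.
start=$(date +%s)
git repack -a -d -q
end=$(date +%s)
echo "repack took $((end - start)) s"

# Compare the size of .git with the size of the tracked files.
repo_kb=$(du -sk .git | cut -f1)
total_kb=$(du -sk . | cut -f1)
echo "working tree: $((total_kb - repo_kb)) KB, .git: $repo_kb KB"

# Git's own view of the object store (size-pack is reported in KiB).
git count-objects -v
```

On a real tree the interesting numbers are the repack time and how
close the .git size comes to the working tree size.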