Re: git performance

If you are really trying to back up a filesystem, you may want to look
at a filesystem that can do snapshots, such as NILFS or ZFS. It would be
a lot more efficient than a version control system.

http://en.wikipedia.org/wiki/NILFS
http://en.wikipedia.org/wiki/ZFS

Both of these will let you look at how files changed over time. NILFS is
slightly different in that you don't take snapshots explicitly: it never
overwrites data in place and checkpoints continuously, so you can roll
back changes to a file for as long as the old checkpoints are kept. Both
also allow each user to roll back their own files if they want to. So if
this is your goal, source code version control is not the right tool,
and a good filesystem is.

-G

On Fri, Oct 24, 2008 at 10:29 AM, Jeff King <peff@xxxxxxxx> wrote:
> On Fri, Oct 24, 2008 at 12:15:19AM -0400, Edward Ned Harvey wrote:
>
>> Feel free to forward to the list, if anyone's still talking about it.
>> I already un-subscribed.
>
> Posting is not limited to subscribers, so you can happily continue the
> conversation there by cc'ing the list (and I am cc'ing the list here).
>
>> I did my benchmarking at least two months ago and have forgotten the
>> exact results, so I ran the benchmark once more just now.  I also
>> downloaded git, and did "git status" for comparison.  I rebooted the
>> system between trial runs, to clear the cache.  Here are the results:
>
> Side note: on Linux, it is much easier to clear the cache via
>
>  echo 1 >/proc/sys/vm/drop_caches
>
> than to reboot for each benchmark.
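
A rough sketch of how the cold/warm runs could be scripted around that
tip. The helper names drop_page_cache and bench_cold_warm are made up
for this example, and the timing assumes GNU date (for %N). Writing 1
drops only the page cache; writing 3 would also drop dentries and
inodes, which may matter for a stat-heavy command like "git status".
Dropping caches needs root, so the helper reports when it could not.

```shell
#!/bin/sh
# Hypothetical helpers: time a command once "cold" and twice "warm".

drop_page_cache() {
    # Flush dirty pages first; drop_caches only discards clean entries.
    sync
    # 1 = page cache only (as in the tip above); needs root on Linux.
    if ! { echo 1 > /proc/sys/vm/drop_caches; } 2>/dev/null; then
        echo "could not drop caches (not root, or not Linux)" >&2
        return 1
    fi
}

bench_cold_warm() {
    # Usage: bench_cold_warm <directory> <command> [args...]
    dir=$1
    shift
    (
        cd "$dir" || exit 1
        drop_page_cache || echo "first run below is not truly cold" >&2
        for label in cold warm1 warm2; do
            start=$(date +%s%N)          # nanoseconds; assumes GNU date
            "$@" >/dev/null 2>&1
            end=$(date +%s%N)
            echo "$label: $(( ($end - $start) / 1000000 )) ms"
        done
    )
}
```

For example: bench_cold_warm /path/to/tree git status
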
>
>> Local disk mirror "time git status" on the same tree. 17,468 versioned files, so the whole tree is 30,647 including .git files
>>       0m 25s  cold cache
>>       0m 0.2s warm cache trial 1
>>       0m 0.2s warm cache trial 2
>
> Hmm. That's a lot of increase in files for .git. Did you try repacking
> and then running your test?
>
>> I questioned whether svn and git were causing unnecessary overhead.
>
> Sure, they are doing more than just walking. So there is overhead, but
> it's hard to say how much is unnecessary. However, if you were working
> with an unpacked git repository, then it may have had to open() a lot
> of files in the object db. Keep in mind that status doesn't just show
> the difference between the working tree and the index; it also shows
> the difference between the index and the last commit. So maybe
> "git diff" would be a more accurate comparison.
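
To make the two comparisons concrete, here is a small demo in a scratch
repository. The file names and the throwaway identity are invented for
the example. "git diff" covers working tree vs. index, and
"git diff --cached" covers index vs. last commit, which is the extra
part that status also has to do.

```shell
#!/bin/sh
# Scratch repository; all names below are made up for this demo.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.invalid"
git config user.name "Demo"

echo one > staged.txt
echo one > unstaged.txt
git add .
git commit -q -m "initial"

echo two > staged.txt
git add staged.txt            # change is now recorded in the index
echo two > unstaged.txt       # change exists only in the working tree

# Working tree vs. index: what plain "git diff" looks at.
worktree_changes=$(git diff --name-only)
# Index vs. last commit: the comparison that needs the object db.
index_changes=$(git diff --cached --name-only)

echo "working tree vs index: $worktree_changes"
echo "index vs last commit:  $index_changes"
```

Timing "git diff" and "git diff --cached" separately on the real tree
should give a rough idea of which half of status dominates.
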
>
>> Conclusions:
>> * For "status" operations on cold cache with a large file count,
>> neither git nor svn approaches the ideal.  Both are an order of
>> magnitude slower than ideal, which is still assuming "ideal" requires
>> walking the tree.  A better ideal avoids the need to walk the tree,
>> and has near-zero total cost.
>
> Try your git benchmark again with a packed repo, and I think you will
> find it approaches the time it takes to walk the tree.
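
A quick way to see what repacking does to the file count under .git,
again in a scratch repository with invented file names. This only
counts loose objects before and after "git repack -a -d"; it is not a
benchmark. On a real tree, "git gc" would do the repack along with
other housekeeping.

```shell
#!/bin/sh
# Scratch repository with a handful of small files (names are made up).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.invalid"
git config user.name "Demo"

i=1
while [ "$i" -le 20 ]; do
    echo "content $i" > "file$i.txt"
    i=$((i + 1))
done
git add .
git commit -q -m "twenty small files"

# "count:" in count-objects -v is the number of loose object files.
loose_objects() {
    git count-objects -v | sed -n 's/^count: //p'
}

before=$(loose_objects)
git repack -a -d -q     # pack everything, then prune the packed loose objects
after=$(loose_objects)

echo "loose objects before repack: $before"
echo "loose objects after repack:  $after"
```

Every loose object is a separate file that may need its own open() on a
cold cache, which is why the packed case should be closer to a plain
tree walk.
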
>
> That being said, if walking the tree is unacceptable to you, then no,
> current git won't work. You would need to patch it to use inotify (once
> upon a time there was some discussion of this, but it never went
> anywhere -- I guess most people work on machines where they can keep the
> cache relatively warm).
>
> -Peff
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
