Re: VERY slow git format-patch (tens of minutes) during rebase and rev-list during rebase -i

Marat Radchenko venit, vidit, dixit 13.07.2010 08:56:
> Hi.
> 
> My setup:
> 0. Quad-core machine with 8GB of RAM, 10K RPM HDD.
> 1. SVN repo that i periodically fetch into origin/trunk branch. Has ~200 
> commits/day.
> 2. My local branch with 1-5 commits which i often rebase against trunk.
> 3. I haven't rebased for 2 days, so i'm rebasing 3 (three) commits in my branch 
> over 453 commits in trunk using "git rebase trunk".
> 4. trunk does contain files that are "bad" from a diff point of view (big & binary).
> 5. Sadly, data in repo is confidential.
> 
> Expected: rebase takes some reasonable amount of time (< 1 min?).
> 
> Actual: rebase takes 20 mins.
> 
> Almost all of that time was spent in
> `git format-patch -k --stdout --full-index --ignore-if-in-upstream 80bb0dfe3d86f3cc9095ea616d9d1b1530fbe7b8..d3fde4ae7497981a6fe61b0366b105477896cf52`
> (that's the three commits from my branch), at 100% of one CPU core.
> 
> Additional info:
> 
> Another similar rebase but over 4.5k of commits took 2 hours.
> 
> Running without --ignore-if-in-upstream:
> $ time git format-patch -k --stdout --full-index 80bb0dfe3d86f3cc9095ea616d9d1b1530fbe7b8..d3fde4ae7497981a6fe61b0366b105477896cf5 | wc -l
> 25823
> Is it 
> real	0m0.163s
> user	0m0.140s
> sys	0m0.020s
> 
> Proof there are only three commits:
> 
> $ git rev-list 80bb0dfe3d86f3cc9095ea616d9d1b1530fbe7b8..d3fde4ae7497981a6fe61b0366b105477896cf52
> d3fde4ae7497981a6fe61b0366b105477896cf52
> e18069258806bda6a6165822003f5e9fd958f906
> c8c2f2e157e615b73d0baab1d793a22991c9ba71
> 
> Questions:
> 1. Is it expected behavior (branch you rebase onto has binary files -> no 
> performance for you)?

Well, with "--ignore-if-in-upstream" git has to compute a patch-id for
every upstream commit (merge-base..upstream) and compare it to the
patch-ids of the commits in merge-base..HEAD.
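
As a rough illustration of that comparison (a sketch only, not what
format-patch does internally), the same idea can be spelled out with
"git patch-id". The function name dup_patches and the temporary file are
made up for this example, and the ids come from "git log -p" output, so
binary changes are hashed differently than in the real code path:

```shell
#!/bin/sh
# Sketch: print the commits in <merge-base>..<head> whose patch-id also
# occurs in <merge-base>..<upstream>.
# Usage: dup_patches <upstream> <head>   (hypothetical helper name)
dup_patches() {
    upstream=$1
    head=$2
    base=$(git merge-base "$upstream" "$head") || return 1
    up_ids=$(mktemp)

    # One patch-id per upstream commit. This pass grows with the length
    # of the upstream range and the size of each diff.
    git log -p --no-merges "$base..$upstream" |
        git patch-id | cut -d' ' -f1 | sort -u > "$up_ids"

    # patch-id prints "<patch-id> <commit-id>" per commit; report the
    # local commits whose patch-id was also seen upstream.
    git log -p --no-merges "$base..$head" | git patch-id |
    while read -r pid commit; do
        if grep -qx "$pid" "$up_ids"; then
            echo "$commit"
        fi
    done

    rm -f "$up_ids"
}
```

With 3 local commits and 453 upstream commits, the first pipeline has to
diff all 453 commits even though only 3 ids are looked up afterwards.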

> 2. If [1] is yes, is it possible to prevent rebase from running
> --ignore-if-in-upstream?

Not currently, but with my upcoming patch ;)

This has the (side-) effect of not ignoring patches which have been
applied (with different sha1) upstream, of course.
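
To check beforehand which local commits that would affect, "git cherry"
already reports patch-equivalent commits: it prefixes them with "-" and
all others with "+". A small sketch (the wrapper name already_upstream is
invented here; note that git cherry itself computes patch-ids, so it is
subject to the same cost on a long upstream range):

```shell
#!/bin/sh
# Sketch: list commits on <head> that have a patch-equivalent in <upstream>.
# Usage: already_upstream <upstream> <head>   (hypothetical helper name)
already_upstream() {
    # git cherry prints "- <sha>" for commits with an equivalent upstream
    # and "+ <sha>" for the rest; keep only the "-" entries.
    git cherry "$1" "$2" | sed -n 's/^- //p'
}
```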

> 3. If [1] is no, should i run some kind of profiler (how?) to determine what 
> exactly causes such performance drop?

It is the calculation of the patch-ids. Git first creates a "binary
diff" and then computes the patch-id (sha1) of that diff. I am sure we
could optimize the calculation of patch-ids for binary diffs, which may
be useful in addition to shutting off "cherry" with rebase.
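
To see which files are likely responsible, one option (a sketch, with
the invented helper name binary_changes) is to count binary changes per
path in the upstream range. It relies on "--numstat" printing "-" for
both the added and deleted counts of a binary file; it does not measure
the patch-id time itself:

```shell
#!/bin/sh
# Sketch: count how often each binary file changes in a revision range.
# Usage: binary_changes <range>   (hypothetical helper name)
binary_changes() {
    # --numstat prints "<added>\t<deleted>\t<path>" per changed file and
    # uses "-" for both counts when the file is binary.
    git log --numstat --pretty=format: "$1" |
        awk -F'\t' '$1 == "-" && $2 == "-" { n[$3]++ }
                    END { for (f in n) print n[f], f }' |
        sort -rn
}
```

For example, "binary_changes $(git merge-base trunk HEAD)..trunk" would
list the binary paths touched most often by the commits being rebased
over.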

Michael