Hin-Tak Leung <htl10@xxxxxxxxxxxxxxxxxxxxx> wrote:
> The new clone has:
>
> <--
> $ ls -ltr .git/svn/.caches/
> total 144788
> -rw-rw-r--. 1 Hin-Tak Hin-Tak   1166138 Oct 7 13:44 lookup_svn_merge.yaml
> -rw-rw-r--. 1 Hin-Tak Hin-Tak  72849741 Oct 7 13:48 check_cherry_pick.yaml
> -rw-rw-r--. 1 Hin-Tak Hin-Tak   1133855 Oct 7 13:49 has_no_changes.yaml
> -rw-rw-r--. 1 Hin-Tak Hin-Tak  73109005 Oct 7 13:53 _rev_list.yaml
> -->
>
> The old clone has:

<snip>

> -rw-rw-r--. 1 Hin-Tak Hin-Tak  40241189 Oct 5 16:42 lookup_svn_merge.yaml
> -rw-rw-r--. 1 Hin-Tak Hin-Tak 225323456 Oct 5 16:49 check_cherry_pick.yaml
> -rw-rw-r--. 1 Hin-Tak Hin-Tak    242547 Oct 5 16:49 has_no_changes.yaml
> -rw-rw-r--. 1 Hin-Tak Hin-Tak  24120007 Oct 5 16:50 _rev_list.yaml
> -->
>
> I had to suspend somewhat around r59000 - but it is interesting to see
> that the max memory consumption of the later part is almost double?
> and it also runs at 100% rather than 60% overall; I don't know what
> to make of that - probably just smaller changes versus
> larger ones, or different time of day and network loads (yes,
> I guess it is just bandwidth-limited?, since the bulk of CPU time is in system
> rather than user).

git-svn memory usage is insane, and we need to reduce it.
(On Linux, fork() performance is reduced as the memory size of the parent
grows, and I don't think we can easily call vfork() from Perl.)

> I am somewhat worried about the dramatic difference between the two .svn/.caches -
> check_cherry_pick.yaml is 225MB in one and 73MB in the other, and also
> _rev_list.yaml is opposite - 24MB vs 73MB. How do I reconcile that?

Calling patterns changed, and it looks like Jakob's changes avoided
some calls. The main thing to care about: Does the repository history
look right?

The check_cherry_pick cache can be made smaller, too:

----------------------- 8< -----------------------------
From: Eric Wong <normalperson@xxxxxxxx>
Subject: [PATCH] git-svn: reduce check_cherry_pick cache overhead

We do not need to store entire lists of commits, only the
number of incomplete and the first commit for reference.
This reduces the amount of data we need to store in memory
and on disk stores.

Signed-off-by: Eric Wong <normalperson@xxxxxxxx>
---
 perl/Git/SVN.pm | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
index 25dbcd5..b2d37cb 100644
--- a/perl/Git/SVN.pm
+++ b/perl/Git/SVN.pm
@@ -1537,7 +1537,7 @@ sub _rev_list {
 	@rv;
 }
 
-sub check_cherry_pick {
+sub check_cherry_pick2 {
 	my $base = shift;
 	my $tip = shift;
 	my $parents = shift;
@@ -1552,7 +1552,8 @@ sub check_cherry_pick {
 			delete $commits{$commit};
 		}
 	}
-	return (keys %commits);
+	my @k = (keys %commits);
+	return (scalar @k, $k[0]);
 }
 
 sub has_no_changes {
@@ -1597,7 +1598,7 @@ sub tie_for_persistent_memoization {
 		mkpath([$cache_path]) unless -d $cache_path;
 
 		my %lookup_svn_merge_cache;
-		my %check_cherry_pick_cache;
+		my %check_cherry_pick2_cache;
 		my %has_no_changes_cache;
 		my %_rev_list_cache;
 
@@ -1608,11 +1609,11 @@ sub tie_for_persistent_memoization {
 			LIST_CACHE => ['HASH' => \%lookup_svn_merge_cache],
 		;
 
-		tie_for_persistent_memoization(\%check_cherry_pick_cache,
-		    "$cache_path/check_cherry_pick");
-		memoize 'check_cherry_pick',
+		tie_for_persistent_memoization(\%check_cherry_pick2_cache,
+		    "$cache_path/check_cherry_pick2");
+		memoize 'check_cherry_pick2',
 			SCALAR_CACHE => 'FAULT',
-			LIST_CACHE => ['HASH' => \%check_cherry_pick_cache],
+			LIST_CACHE => ['HASH' => \%check_cherry_pick2_cache],
 		;
 
 		tie_for_persistent_memoization(\%has_no_changes_cache,
@@ -1636,7 +1637,7 @@ sub tie_for_persistent_memoization {
 	$memoized = 0;
 
 	Memoize::unmemoize 'lookup_svn_merge';
-	Memoize::unmemoize 'check_cherry_pick';
+	Memoize::unmemoize 'check_cherry_pick2';
 	Memoize::unmemoize 'has_no_changes';
 	Memoize::unmemoize '_rev_list';
 }
@@ -1648,7 +1649,8 @@ sub tie_for_persistent_memoization {
 	return unless -d $cache_path;
 
 	for my $cache_file (("$cache_path/lookup_svn_merge",
-			     "$cache_path/check_cherry_pick",
+			     "$cache_path/check_cherry_pick", # old
+			     "$cache_path/check_cherry_pick2",
 			     "$cache_path/has_no_changes")) {
 		for my $suffix (qw(yaml db)) {
 			my $file = "$cache_file.$suffix";
@@ -1817,15 +1819,15 @@ sub find_extra_svn_parents {
 		}
 
 		# double check that there are no missing non-merge commits
-		my (@incomplete) = check_cherry_pick(
+		my ($ninc, $ifirst) = check_cherry_pick2(
 			$merge_base, $merge_tip,
 			$parents,
 			@all_ranges,
 		       );
 
-		if ( @incomplete ) {
-			warn "W:svn cherry-pick ignored ($spec) - missing "
-				.@incomplete." commit(s) (eg $incomplete[0])\n";
+		if ($ninc) {
+			warn "W:svn cherry-pick ignored ($spec) - missing " .
+				"$ninc commit(s) (eg $ifirst)\n";
 		} else {
 			warn
 				"Found merge parent ($spec): ",
-- 
EW
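
For illustration only, here is a rough standalone sketch of why returning (count, first) instead of the whole list shrinks what Memoize keeps per cache entry. It is not part of the patch: fake_missing(), check_full() and check_summary() are hypothetical stand-ins, the commit ids are made up, and a plain in-memory hash replaces the tied on-disk cache that git-svn uses.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Memoize;

# Hypothetical stand-in for the expensive check: fake 40-hex commit ids.
sub fake_missing {
	my ($n) = @_;
	return map { sprintf('%040x', $_) } 1 .. $n;
}

# Old shape: the whole list is returned, so the whole list gets cached.
sub check_full {
	return fake_missing($_[0]);
}

# New shape: only the count and one example commit are returned.
sub check_summary {
	my @k = fake_missing($_[0]);
	return (scalar @k, $k[0]);
}

# Plain hashes here; git-svn ties its caches to files instead.
my (%full_cache, %summary_cache);
memoize('check_full',    LIST_CACHE => [HASH => \%full_cache]);
memoize('check_summary', LIST_CACHE => [HASH => \%summary_cache]);

my @all = check_full(1000);
my ($ninc, $ifirst) = check_summary(1000);

# Memoize keeps each list-context result as an array reference.
my ($full_entry)    = values %full_cache;
my ($summary_entry) = values %summary_cache;

printf "full cache entry:    %d items\n", scalar @$full_entry;
printf "summary cache entry: %d items\n", scalar @$summary_entry;
print "W: missing $ninc commit(s) (eg $ifirst)\n" if $ninc;
```

The warning only ever needs the count and one example id, so the two-element entry carries the same information for that purpose, while the per-call cache entry no longer grows with the number of missing commits.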