------------------------------
On Sun, Oct 19, 2014 05:12 BST Eric Wong wrote:

>Hin-Tak Leung <htl10@xxxxxxxxxxxxxxxxxxxxx> wrote:
> The new clone has:
>
> <--
> $ ls -ltr .git/svn/.caches/
> total 144788
> -rw-rw-r--. 1 Hin-Tak Hin-Tak 1166138 Oct 7 13:44 lookup_svn_merge.yaml
> -rw-rw-r--. 1 Hin-Tak Hin-Tak 72849741 Oct 7 13:48 check_cherry_pick.yaml
> -rw-rw-r--. 1 Hin-Tak Hin-Tak 1133855 Oct 7 13:49 has_no_changes.yaml
> -rw-rw-r--. 1 Hin-Tak Hin-Tak 73109005 Oct 7 13:53 _rev_list.yaml
> -->
>
> The old clone has:
>
><snip>
> -rw-rw-r--. 1 Hin-Tak Hin-Tak 40241189 Oct 5 16:42 lookup_svn_merge.yaml
> -rw-rw-r--. 1 Hin-Tak Hin-Tak 225323456 Oct 5 16:49 check_cherry_pick.yaml
> -rw-rw-r--. 1 Hin-Tak Hin-Tak 242547 Oct 5 16:49 has_no_changes.yaml
> -rw-rw-r--. 1 Hin-Tak Hin-Tak 24120007 Oct 5 16:50 _rev_list.yaml
> -->
>
> I had to suspend somewhere around r59000 - but it is interesting to see
> that the max memory consumption of the later part is almost double,
> and it also runs at 100% rather than 60% overall; I don't know what
> to make of that - probably just smaller changes versus
> larger ones, or different time of day and network loads (yes,
> I guess it is just bandwidth-limited, since the bulk of CPU time is in system
> rather than user).
>
>git-svn memory usage is insane, and we need to reduce it.
>(on Linux, fork() performance is reduced as memory size of the parent
> grows, and I don't think we can easily call vfork() from Perl)
>
> I am somewhat worried about the dramatic difference between the two
> .git/svn/.caches - check_cherry_pick.yaml is 225MB in one and 73MB in the
> other, and _rev_list.yaml is the opposite - 24MB vs 73MB. How do I
> reconcile that?
>
>Calling patterns changed, and it looks like Jakob's changes avoided some
>calls. The main thing to care about:
> Does the repository history look right?
>
>The check_cherry_pick cache can be made smaller, too:
>----------------------- 8< -----------------------------
>From: Eric Wong <normalperson@xxxxxxxx>
>Subject: [PATCH] git-svn: reduce check_cherry_pick cache overhead
>
>We do not need to store entire lists of commits, only the
>number of incomplete and the first commit for reference.
>This reduces the amount of data we need to store in memory
>and on disk stores.
>
>Signed-off-by: Eric Wong <normalperson@xxxxxxxx>
>---
> perl/Git/SVN.pm | 28 +++++++++++++++-------------
> 1 file changed, 15 insertions(+), 13 deletions(-)
>
>diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
>index 25dbcd5..b2d37cb 100644
>--- a/perl/Git/SVN.pm
>+++ b/perl/Git/SVN.pm
>@@ -1537,7 +1537,7 @@ sub _rev_list {
> 	@rv;
> }
>
>-sub check_cherry_pick {
>+sub check_cherry_pick2 {
> 	my $base = shift;
> 	my $tip = shift;
> 	my $parents = shift;
>@@ -1552,7 +1552,8 @@ sub check_cherry_pick {
> 			delete $commits{$commit};
> 		}
> 	}
>-	return (keys %commits);
>+	my @k = (keys %commits);
>+	return (scalar @k, $k[0]);
> }
>
> sub has_no_changes {
>@@ -1597,7 +1598,7 @@ sub tie_for_persistent_memoization {
> 		mkpath([$cache_path]) unless -d $cache_path;
>
> 		my %lookup_svn_merge_cache;
>-		my %check_cherry_pick_cache;
>+		my %check_cherry_pick2_cache;
> 		my %has_no_changes_cache;
> 		my %_rev_list_cache;
>
>@@ -1608,11 +1609,11 @@ sub tie_for_persistent_memoization {
> 			LIST_CACHE => ['HASH' => \%lookup_svn_merge_cache],
> 		;
>
>-		tie_for_persistent_memoization(\%check_cherry_pick_cache,
>-		    "$cache_path/check_cherry_pick");
>-		memoize 'check_cherry_pick',
>+		tie_for_persistent_memoization(\%check_cherry_pick2_cache,
>+		    "$cache_path/check_cherry_pick2");
>+		memoize 'check_cherry_pick2',
> 			SCALAR_CACHE => 'FAULT',
>-			LIST_CACHE => ['HASH' => \%check_cherry_pick_cache],
>+			LIST_CACHE => ['HASH' => \%check_cherry_pick2_cache],
> 		;
>
> 		tie_for_persistent_memoization(\%has_no_changes_cache,
>@@ -1636,7 +1637,7 @@ sub tie_for_persistent_memoization {
> 		$memoized = 0;
>
> 		Memoize::unmemoize 'lookup_svn_merge';
>-		Memoize::unmemoize 'check_cherry_pick';
>+		Memoize::unmemoize 'check_cherry_pick2';
> 		Memoize::unmemoize 'has_no_changes';
> 		Memoize::unmemoize '_rev_list';
> 	}
>@@ -1648,7 +1649,8 @@ sub tie_for_persistent_memoization {
> 		return unless -d $cache_path;
>
> 		for my $cache_file (("$cache_path/lookup_svn_merge",
>-				     "$cache_path/check_cherry_pick",
>+				     "$cache_path/check_cherry_pick", # old
>+				     "$cache_path/check_cherry_pick2",
> 				     "$cache_path/has_no_changes")) {
> 			for my $suffix (qw(yaml db)) {
> 				my $file = "$cache_file.$suffix";
>@@ -1817,15 +1819,15 @@ sub find_extra_svn_parents {
> 		}
>
> 		# double check that there are no missing non-merge commits
>-		my (@incomplete) = check_cherry_pick(
>+		my ($ninc, $ifirst) = check_cherry_pick2(
> 			$merge_base, $merge_tip,
> 			$parents,
> 			@all_ranges,
> 		);
>
>-		if ( @incomplete ) {
>-			warn "W:svn cherry-pick ignored ($spec) - missing "
>-				.@incomplete." commit(s) (eg $incomplete[0])\n";
>+		if ($ninc) {
>+			warn "W:svn cherry-pick ignored ($spec) - missing " .
>+				"$ninc commit(s) (eg $ifirst)\n";
> 		} else {
> 			warn
> 				"Found merge parent ($spec): ",
>--
>EW
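
For anyone following the patch, here is a minimal standalone sketch of why
returning only a count and one example commit shrinks the memoize cache.
It is not the git-svn code: missing_commits(), check_full() and
check_summary() are hypothetical stand-ins, and the fake commit ids are
generated rather than read from rev-list. It also assumes that Memoize
stores each list-context result as an array reference in the LIST_CACHE
hash, which is how the Memoize versions I know of behave.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Memoize;

# Hypothetical stand-in for the expensive rev-list walk: pretend every
# range is missing the same 1000 commits (fake 40-hex-digit ids).
sub missing_commits {
	my ($range) = @_;
	return map { sprintf("%040x", $_) } 1 .. 1000;
}

# Old shape: return the whole list, so the cache keeps every id.
sub check_full {
	return missing_commits(@_);
}

# New shape (as in the patch): return only the count and one example.
sub check_summary {
	my @k = missing_commits(@_);
	return (scalar @k, $k[0]);
}

# Plain in-memory hashes here; git-svn ties them to files on disk.
my (%full_cache, %summary_cache);
memoize 'check_full',
	SCALAR_CACHE => 'FAULT',
	LIST_CACHE => ['HASH' => \%full_cache];
memoize 'check_summary',
	SCALAR_CACHE => 'FAULT',
	LIST_CACHE => ['HASH' => \%summary_cache];

# Both calls must be in list context because SCALAR_CACHE is 'FAULT'.
my @all = check_full('base..tip');
my ($ninc, $ifirst) = check_summary('base..tip');

# Compare how many values each cache entry holds for the same call.
my ($full_entry) = values %full_cache;
my ($summary_entry) = values %summary_cache;
printf "full cache entry holds %d value(s)\n", scalar @$full_entry;
printf "summary cache entry holds %d value(s)\n", scalar @$summary_entry;

warn "missing $ninc commit(s) (eg $ifirst)\n" if $ninc;
```

The caller only ever used the list for its length and its first element,
so the summary form loses nothing the warning needs while keeping each
cache entry at two values regardless of how many commits are missing.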