RE: Best way to merge two repos with same content, different history

> -----Original Message-----
> From: git-owner@xxxxxxxxxxxxxxx [mailto:git-owner@xxxxxxxxxxxxxxx] On
> Behalf Of Robin H. Johnson
> Sent: Friday, June 05, 2009 3:02 PM
> To: Git Mailing List
> Subject: Re: Best way to merge two repos with same content,
> different history
> 
> On Fri, Jun 05, 2009 at 02:06:25PM -0500, Kelly F. Hickel wrote:
> > Robin,
> > 	That's all good news, I have an 8 way box with 32gb of ram running a
> > 64 bit Linux, a box with 4 gb of ram panics during the conversion.
> Thanks for your data.
> 
> For comparison, our conversion box is also 8-way, but only 16GiB RAM.
> 
> I'm surprised at how long pass1 is for you, especially since you've got
> a lot fewer CVS files and CVS revisions than the Gentoo repo (I do
> deduce that your individual revisions are larger, averaging 15KiB vs.
> our 711 bytes).
> 
> I think there's something odd in the total CVS branches/tags count,
> however, as the counts there imply an average of 67 branches and 173
> tags per CVS revision. You might want to dig into that part manually
> and see about it (not sure of your Python skills). That would probably
> cut down both your pass1 and pass4 times significantly.

Robin, I'm not much with Python, so I haven't dug into the code much at
all. The numbers are high, although we do create a lot of branches (had
to contribute a fix to CVS a year or two ago to get the branching time
down from the 2.5 hours it was taking).  At one point I carefully examined
the symbol file that cvs2git was outputting and convinced myself that it
was doing the right thing, but that was a while ago.

> 
> Hopefully mhagger will get the external blob stuff committed soon; I
> was working on validating its results.
> 
> In doing so I discovered a testcase where RCSRevisionReader and
> CVSRevisionReader gave different output themselves, the latter (which
> is documented as more accurate otherwise) missing the contents of an
> entire file. It's on the cvs2svn-dev mailing list now. Tracing that
> first, thereafter comparing it to the new Git side.
> 
> > git repack -a -d -f --depth=4000 --window=4000 && git pack-refs --all
> Did those extreme depth/window values actually help size much? The
> Gentoo ones actually didn't improve significantly over depth=window=50.

I know that they were still (apparently) improving after the 200 mark;
it took long enough at 200 that I just decided to crank the numbers way
up and let it run over the weekend.
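
To check whether larger depth/window values are still paying off, one
could repack at a few settings and compare the resulting pack sizes. A
minimal sketch, assuming it is run from inside the repository; the helper
name repack_and_measure is hypothetical (not from this thread), and the
settings in the usage comment are only examples:

```shell
#!/bin/sh
# Hypothetical helper: repack the repository in the current directory with
# depth == window == $1, then print "<setting> <pack size in KiB>".
repack_and_measure() {
    n="$1"
    # -a -d: collect everything into one pack and delete redundant packs.
    # -f:    recompute deltas instead of reusing existing ones.
    git repack -a -d -f -q --depth="$n" --window="$n" || return 1
    # "git count-objects -v" reports the packed size as "size-pack: <KiB>".
    git count-objects -v | awk -v n="$n" '$1 == "size-pack:" { print n, $2 }'
}

# Example usage (each run can take a long time on a large repository):
#   for n in 50 200 4000; do repack_and_measure "$n"; done
```

Each output line is the setting followed by the size-pack value in KiB.
Larger window values make the repack considerably slower, so it may be
worth stopping once the size no longer shrinks noticeably.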

> 
> --
> Robin Hugh Johnson
> Gentoo Linux Developer & Infra Guy
> E-Mail     : robbat2@xxxxxxxxxx
> GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85

I'll be looking forward to a newer, faster cvs2git, although I did just
get the graft idea working, so I'm not sure whether we'll wait that long
(it would be nice not to have to muck around with it, though).
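
This message does not show how the graft was set up. As one possible
illustration, current Git can splice one history onto another with
"git replace --graft" (at the time of this thread the usual mechanism was
a line in .git/info/grafts). A sketch under those assumptions; the helper
name graft_onto and its arguments are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: make commit $2 the (only) parent of commit $1, e.g. to
# attach the root of a newer history to the tip of an older, converted one.
graft_onto() {
    child="$1"
    parent="$2"
    # Writes a replacement commit for $child whose parent is $parent and
    # records it under refs/replace/; the original objects are untouched.
    git replace --graft "$child" "$parent"
}

# Example usage, with placeholder revisions:
#   graft_onto <root-of-new-history> <tip-of-old-history>
#   git log --oneline    # now walks through both histories
```

The graft is stored as a replacement ref, so it stays local to the
repository unless refs/replace/ is pushed or the history is rewritten to
make it permanent.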

Thanks,
Kelly