Re: Performance issue exposed by git-filter-branch

I had considered this approach (and the one Jonathan mentioned), but there are no git tools that actually perform the filter I want on the export in this form.  I could (and will) parse the fast-export stream and attempt to filter files/directories myself; my concern is that I won't do it right and will introduce subtle corruption.  But if no existing tool does this, I'll take a crack at it. :-)
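For what it's worth, here is a minimal sketch of such a stream filter (the path list mirrors the ones in my filter-branch command, but is otherwise hypothetical; C-quoted paths and 'C'/'R' copy/rename records are deliberately not handled, so a real tool would need to cover those):

```python
import io

# Hypothetical path list mirroring the filter-branch invocation;
# adjust to the real repository layout.
STRIP = (b"bigdirtree", b"stuff/a", b"stuff/b", b"stuff/c")

def stripped(path):
    """True if path is one of the unwanted files/dirs or lives under one."""
    return any(path == p or path.startswith(p + b"/") for p in STRIP)

def filter_stream(inp, out):
    """Copy a `git fast-export` stream, dropping M/D records for STRIP paths."""
    for line in inp:
        if line.startswith(b"data "):
            # Copy commit-message / blob payloads verbatim so that path
            # strings inside them are never mistaken for file records.
            out.write(line)
            out.write(inp.read(int(line.split()[1])))
        elif line.startswith(b"M ") and stripped(line.rstrip(b"\n").split(b" ", 3)[3]):
            pass  # drop filemodify for an unwanted path
        elif line.startswith(b"D ") and stripped(line.rstrip(b"\n").split(b" ", 1)[1]):
            pass  # drop filedelete for an unwanted path
        else:
            out.write(line)
```

Usage would be something like "git fast-export --all | python filter.py | git fast-import" run inside a fresh bare repository; --prune-empty has no direct analog here, so commits left empty by the filter would still need separate handling.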

Thanks for your suggestions so far,

Ken

PS: This was my exact first thought, since I used to perform "svnadmin dump | svndumpfilter | svnadmin load" on this repository back when it lived in SVN.

On Dec 16, 2010, at 5:54 PM, Thomas Rast wrote:

> Ken Brownfield wrote:
>> git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch -- bigdirtree stuff/a stuff/b stuff/c stuff/dir/{a,b,c}' --prune-empty --tag-name-filter cat -- --all
> [...]
>> Now that the same repository has grown, this same filter-branch
>> process now takes 6.5 *days* at 100% CPU on the same machine (2x4
>> Xeon, x86_64) on git-1.7.3.2.  There's no I/O, memory, or other
>> resource contention.
> 
> If all you do is an index-filter for deletion, I think it should be
> rather easy to achieve good results by filtering the fast-export
> stream to remove these files, and then piping that back to
> fast-import.
> 
> (It's just that AFAIK nobody has written that code yet.)
> 
> -- 
> Thomas Rast
> trast@{inf,student}.ethz.ch

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html