Re: Import/Export as a fast way to purge files from Git?

> On Sep 23, 2018, at 4:55 PM, Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote:
> 
> On Sun, Sep 23, 2018 at 9:05 AM Lars Schneider <larsxschneider@xxxxxxxxx> wrote:
>> I recently had to purge files from large Git repos (many files, many commits).
>> The usual recommendation is to use `git filter-branch --index-filter` to purge
>> files. However, this is *very* slow for large repos (e.g. it takes 45min to
>> remove the `builtin` directory from git core). I realized that I can remove
>> files *way* faster by exporting the repo, removing the file references,
>> and then importing the repo (see Perl script below, it takes ~30sec to remove
>> the `builtin` directory from git core). Do you see any problem with this
>> approach?
> 
> A couple comments:
> 
> For purging files from a history, take a look at BFG[1] which bills
> itself as "a simpler, faster alternative to git-filter-branch for
> cleansing bad data out of your Git repository history".

Yes, BFG is great. Unfortunately, it requires Java, which is not available
on every system I have to work with. I needed a solution that would work
in every Git environment. Hence the Perl script :-)


> The approach of exporting to a fast-import stream, modifying the
> stream, and re-importing is quite reasonable.

Thanks for the confirmation!
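
For anyone following along, the export/filter/import idea can be sketched
roughly as below. This is not the Perl script from the original mail, just a
minimal Python illustration of the same technique. The function name
`filter_stream` and its parameters are made up for this example. It assumes
the stream comes from `git fast-export`, which writes `data <count>` headers
with exact byte counts and refers to blobs by mark rather than inline; it
does not handle quoted paths, rename/copy lines, or commits that end up
empty after filtering.

```python
def filter_stream(src, dst, name):
    """Copy a fast-export stream from src to dst (binary file objects),
    dropping file changes that touch `name` or anything below it.

    `name` is a bytes path such as b"builtin" (hypothetical example).
    """
    name = name.rstrip(b"/")
    prefix = name + b"/"

    def doomed(path):
        # True for the path itself and for everything underneath it.
        return path == name or path.startswith(prefix)

    while True:
        line = src.readline()
        if not line:
            break
        if line.startswith(b"data "):
            # Assumption: exact byte count form, as git fast-export emits.
            # Copy the payload verbatim so file contents or commit messages
            # that merely look like commands are never filtered.
            size = int(line[5:])
            dst.write(line)
            dst.write(src.read(size))
            continue
        if line.startswith(b"M "):
            # filemodify: M <mode> <dataref> <path>
            parts = line.rstrip(b"\n").split(b" ", 3)
            if len(parts) == 4 and doomed(parts[3]):
                continue
        elif line.startswith(b"D "):
            # filedelete: D <path>
            if doomed(line[2:].rstrip(b"\n")):
                continue
        dst.write(line)
```

One way to use it would be to call
`filter_stream(sys.stdin.buffer, sys.stdout.buffer, b"builtin")` from a small
script and place that script between `git fast-export --all` in the old
repository and `git fast-import` in a freshly initialized one. The blobs of
the removed files still travel through the stream; they are simply left
unreferenced in the new repository.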


> However, rather than
> re-inventing, take a look at reposurgeon[2], which allows you to do
> major surgery on fast-import streams. Not only can it purge files from
> a repository, but it can slice, dice, puree, and saute pretty much any
> attribute of a repository.

Wow. Reposurgeon looks very interesting. Thanks a lot for the pointer!

Cheers,
Lars


> [1]: https://rtyley.github.io/bfg-repo-cleaner/
> [2]: http://www.catb.org/esr/reposurgeon/




