Re: Import/Export as a fast way to purge files from Git?

On Sun, Sep 23, 2018 at 9:05 AM Lars Schneider <larsxschneider@xxxxxxxxx> wrote:
> I recently had to purge files from large Git repos (many files, many commits).
> The usual recommendation is to use `git filter-branch --index-filter` to purge
> files. However, this is *very* slow for large repos (e.g. it takes 45min to
> remove the `builtin` directory from git core). I realized that I can remove
> files *way* faster by exporting the repo, removing the file references,
> and then importing the repo (see Perl script below, it takes ~30sec to remove
> the `builtin` directory from git core). Do you see any problem with this
> approach?

A couple of comments:

For purging files from a history, take a look at BFG[1] which bills
itself as "a simpler, faster alternative to git-filter-branch for
cleansing bad data out of your Git repository history".

The approach of exporting to a fast-import stream, modifying the
stream, and re-importing is quite reasonable. However, rather than
re-inventing, take a look at reposurgeon[2], which allows you to do
major surgery on fast-import streams. Not only can it purge files from
a repository, but it can slice, dice, puree, and saute pretty much any
attribute of a repository.
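
For illustration only, here is a minimal Python sketch of the
export/modify/import idea. It is not the Perl script from the original
message and not how BFG or reposurgeon work internally. The function name
purge_path and its parameters are made up for this example. It assumes the
stream comes from `git fast-export` with default options (blobs referenced
by marks, no inline data) and that the affected paths need no C-style
quoting. Blob payloads are copied by byte count so that file contents which
happen to look like commands are left alone.

```python
import re


def purge_path(src, dst, prefix):
    """Copy a fast-export stream from src to dst (binary file objects),
    dropping file changes whose path starts with prefix (bytes)."""
    # Matches "M <mode> <dataref> <path>" and "D <path>" lines under prefix.
    change = re.compile(rb"^(?:M \d+ \S+ |D )" + re.escape(prefix))
    while True:
        line = src.readline()
        if not line:
            break  # end of stream
        if line.startswith(b"data "):
            # "data <count>" is followed by exactly <count> raw bytes;
            # copy them untouched instead of parsing them as commands.
            size = int(line[5:])
            dst.write(line)
            dst.write(src.read(size))
            continue
        if change.match(line):
            continue  # drop this file reference from the commit
        dst.write(line)


# Hypothetical use as a filter between export and import:
#   purge_path(sys.stdin.buffer, sys.stdout.buffer, b"builtin/")
```

Saved as a script (say, purge.py) that makes the call shown in the trailing
comment, this could sit in a pipeline such as
`git fast-export --all | python purge.py | git -C ../new-repo fast-import`,
where ../new-repo is a freshly initialized repository. The sketch leaves
the now-unreferenced blobs in the stream and does not drop commits that
become empty or handle rename/copy lines, which is part of what the
dedicated tools take care of.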

[1]: https://rtyley.github.io/bfg-repo-cleaner/
[2]: http://www.catb.org/esr/reposurgeon/


