Re: git filter-branch --subdirectory-filter not working as expected, history of other folders is preserved

On Mon, Oct 10, 2016 at 7:19 PM, Jeff King <peff@xxxxxxxx> wrote:
> On Mon, Oct 10, 2016 at 05:12:25PM +0100, Seaders Oloinsigh wrote:
>
>> Due to the structure of this repo, it looks like there are some
>> branches that never had anything to do with the android/ subdirectory,
>> so they're not getting wiped out.  My branch is in a better state to
>> how I want it, but still, if I run your suggestion,
>> [...]
>
> Hmm. Yeah, I think this is an artifact of the way that filter-branch
> works with pathspec limiting. It keeps a mapping of commits that it has
> rewritten (including ones that were rewritten only because their
> ancestors were), and realizes that a branch ref needs updated when the
> commit it points to was rewritten.
>
> But if we don't touch _any_ commits in the history reachable from a
> branch (because they didn't even show up in our pathspec-limited
> rev-list), then it doesn't realize we touched the branch's history at
> all.
>
> I agree that the right outcome is for it to delete those branches
> entirely. I suspect the fix would be pretty tricky, though.
>
> In the meantime, I think you can work around it by either:
>
>   1. Make a pass beforehand for refs that do not touch your desired
>      paths at all, like:
>
>        path=android ;# or whatever
>        git for-each-ref --format='%(refname)' |
>        while read ref; do
>          if test "$(git rev-list --count "$ref" -- "$path")" = 0; then
>            echo "delete $ref"
>          fi
>        done |
>        git update-ref --stdin
>
>      and then filter what's left:
>
>        git filter-branch --subdirectory-filter $path -- --all

This is the perfect solution for me.  Running the branch-deletion pass
first also sped up the filter-branch command, and I'm left with a much
more complete version of where I want to be.

I would still contend that filter-branch either doesn't work as
expected, or the docs need updating to include extra steps like the
ones you've given.  With a large repo like ours, running multiple
filter-branch commands and trying different combinations is quite a
time sink when you're left with the same incorrect result again and
again.
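
Since each failed attempt is expensive on a large repo, it may help to
preview which refs the pre-pass in option 1 would delete before deleting
anything.  Below is a minimal sketch of that idea; the function name
refs_without_path is hypothetical (not part of git), and it assumes it is
run from inside the repository being filtered.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): print every ref whose history
# never touched the path given as $1. These are the refs that the
# pre-pass from option 1 would delete.
refs_without_path() {
  git for-each-ref --format='%(refname)' |
  while read -r ref; do
    # rev-list --count prints how many commits reachable from $ref
    # touched the path; 0 means the ref has nothing to do with it.
    if test "$(git rev-list --count "$ref" -- "$1")" = 0; then
      printf '%s\n' "$ref"
    fi
  done
}

# Dry run, review the list first:
#   refs_without_path android
# Then delete those refs for real:
#   refs_without_path android | sed 's/^/delete /' | git update-ref --stdin
```

The dry run only reads from the repository, so it is safe to repeat while
settling on the right path.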

>
> or
>
>   2. Do the filter-branch, and because you know you specified --all and
>      that your filters would touch all histories, any ref which _wasn't_
>      touched can be deleted. That list is anything which didn't get a
>      backup entry in refs/original. So something like:
>
>        git for-each-ref --format='%(refname)' |
>        perl -lne 'print $1 if m{^refs/original/(.*)}' >backups
>
>        git for-each-ref --format='%(refname)' |
>        grep -v ^refs/original >refs
>
>        comm -23 refs backups |
>        sed "s/^/delete /" |
>        git update-ref --stdin
>
> -Peff
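
Option 2 can be sketched the same way, with the set difference wrapped in
a function so the result can be inspected before deleting.  The name
untouched_refs is hypothetical (not part of git).  As Peff notes, this
assumes filter-branch was run with --all and that every history worth
keeping was rewritten.  Because comm expects both inputs in the same
sorted order, the sketch sorts both lists under LC_ALL=C.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): list refs that filter-branch
# left untouched, i.e. refs with no backup under refs/original/.
untouched_refs() {
  all=$(mktemp) && backups=$(mktemp) || return 1
  # Current refs, excluding the backup refs themselves.
  git for-each-ref --format='%(refname)' |
    grep -v '^refs/original/' | LC_ALL=C sort >"$all"
  # Refs that were rewritten: keep only the backup entries and strip
  # the refs/original/ prefix to recover the original ref name.
  git for-each-ref --format='%(refname)' |
    sed -n 's|^refs/original/||p' | LC_ALL=C sort >"$backups"
  # Lines unique to the first list: refs that never got a backup.
  LC_ALL=C comm -23 "$all" "$backups"
  rm -f "$all" "$backups"
}

# Review the list, then delete:
#   untouched_refs
#   untouched_refs | sed 's/^/delete /' | git update-ref --stdin
```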


