On Thu, Sep 14, 2017 at 07:32:11AM -0400, Robert P. J. Day wrote:

>   [is this the right place to ask questions about git usage? or is
> there a different forum where one can submit possibly embarrassingly
> silly questions?]

No, this is the right place for embarrassing questions. :)

>   say, early on, one commits a sizable directory of content, call it
> /mydir. that directory sits there for a while until it becomes obvious
> it's out of date and worthless and should never have been committed.
> the obvious solution would seem to be:
>
>   $ git filter-branch --tree-filter 'rm -rf /mydir' HEAD
>
> correct?

That would work, though note that using an --index-filter would be more
efficient (since it avoids checking out each tree as it walks the
history).

>   however, say one version of that directory was committed early on,
> then later tossed for being useless with "git rm", and subsequently
> replaced by newer content under exactly the same name. now i'd like to
> go back and delete the history related to that early version of
> /mydir, but not the second.

Makes sense as a goal.

>   obviously, i can't use the above command as it would delete both
> versions. so it appears the solution would be a trivial application of
> the "--commit-filter" option:
>
>   git filter-branch --commit-filter '
>     if [ "$GIT_COMMIT" = "<commit-id>" ] ; then
>       skip_commit "$@";
>     else
>       git commit-tree "$@";
>     fi' HEAD
>
> where <commit-id> is the commit that introduced the first version of
> /mydir. do i have that right? is there a simpler way to do this?

No, this won't work. Filter-branch is not walking the history and
applying the changes to each commit, like rebase does. It's literally
operating on each commit object, and recall that each commit object
points to a tree that is a snapshot of the repository contents.

So if you skip a commit, that commit itself goes away. But the commit
after it (which didn't touch the unwanted contents) will still mention
those contents in its tree.

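To make the snapshot point concrete, here is a rough sketch you could run in a throwaway directory. Everything in it is an assumption for illustration: the scratch repo, the file names (README, mydir/old.txt, other.txt), and the DROP_COMMIT variable are made up, and FILTER_BRANCH_SQUELCH_WARNING only matters on newer versions of git.

```shell
#!/bin/sh
# Scratch-repo demo: skipping the commit that added mydir/ does not
# remove mydir/ from the trees of later commits. All names are made up.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1	# newer gits warn and pause otherwise

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

echo base >README
git add README
git commit -q -m "base"

mkdir mydir
echo junk >mydir/old.txt
git add mydir
git commit -q -m "add mydir"

# Remember the commit we will try to skip (hypothetical variable name).
DROP_COMMIT=$(git rev-parse HEAD)
export DROP_COMMIT

echo more >other.txt
git add other.txt
git commit -q -m "unrelated change"

# Same shape as the --commit-filter from the question.
git filter-branch --commit-filter '
	if [ "$GIT_COMMIT" = "$DROP_COMMIT" ]
	then
		skip_commit "$@"
	else
		git commit-tree "$@"
	fi' HEAD >/dev/null 2>&1

# One commit is gone, but the tip's tree is the same snapshot as before.
commits_after=$(git rev-list --count HEAD)
files_after=$(git ls-tree -r --name-only HEAD)
echo "commits left: $commits_after"
echo "files at tip:"
echo "$files_after"
```

If this behaves as described above, the "add mydir" commit disappears from the log, but mydir/old.txt should still be listed at the tip, because the "unrelated change" commit carried it in its own tree all along.
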
I think you want to stick with a --tree-filter (or an --index-filter),
but just selectively decide when to do the deletion. For example, if you
can tell the difference between the two states based on the presence of
some file, then perhaps:

  git filter-branch --prune-empty --index-filter '
	if git rev-parse --verify :dir/sentinel >/dev/null 2>&1
	then
		git rm --cached -rf dir
	fi
  ' HEAD

The "--prune-empty" is optional, but will drop commits that become empty
because they _only_ touched that directory. We use ":dir/sentinel" to
see if the entry is in the index, because the index filter won't have
the tree checked out. Likewise, we need to use "rm --cached" to just
touch the index.

-Peff
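
For anyone who wants to try the selective --index-filter before pointing it at a real repository, the following is a minimal sketch on a scratch repo. The layout is assumed, not taken from the original question: the first generation of dir/ contains a marker file dir/sentinel, the second generation does not, and the file names and commit messages are invented for the demo.

```shell
#!/bin/sh
# Scratch-repo demo of the selective --index-filter. All names are made up;
# dir/sentinel stands in for whatever file distinguishes the old contents.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1	# newer gits warn and pause otherwise

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

echo base >README
git add README
git commit -q -m "base"

# First generation of dir/, identifiable by dir/sentinel.
mkdir dir
echo old >dir/sentinel
echo old >dir/data.txt
git add dir
git commit -q -m "add old dir"

echo more >other.txt
git add other.txt
git commit -q -m "unrelated change"

git rm -q -r dir
git commit -q -m "remove old dir"

# Second generation: same directory name, no sentinel file.
mkdir dir
echo new >dir/data.txt
git add dir
git commit -q -m "add new dir"

# Drop dir/ from the index only in commits where the sentinel is present.
git filter-branch --prune-empty --index-filter '
	if git rev-parse --verify :dir/sentinel >/dev/null 2>&1
	then
		git rm --cached -rf -q dir
	fi
' HEAD >/dev/null 2>&1

commits_left=$(git rev-list --count HEAD)
sentinel_history=$(git log --format=%s HEAD -- dir/sentinel)
new_content=$(git show HEAD:dir/data.txt)
echo "commits left: $commits_left"
echo "commits touching dir/sentinel: ${sentinel_history:-none}"
echo "dir/data.txt at tip: $new_content"
```

In this layout the commits that only added or removed the old directory become empty and should be dropped by --prune-empty, while the second generation of dir/ is left alone. The original refs are kept under refs/original/ by filter-branch, so the old history stays reachable until those are deleted.
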