Hi Dscho,

On Fri, Aug 30, 2019 at 1:40 PM Johannes Schindelin
<Johannes.Schindelin@xxxxxx> wrote:
>
> Hi Elijah,
>
>
> On Wed, 28 Aug 2019, Elijah Newren wrote:
>
> > Hi Sergey,
> >
> > On Wed, Aug 28, 2019 at 1:52 AM Sergey Organov <sorganov@xxxxxxxxx> wrote:
> > >
> > > Elijah Newren <newren@xxxxxxxxx> writes:
> > >
> > > > On Tue, Aug 27, 2019 at 1:43 AM Sergey Organov <sorganov@xxxxxxxxx> wrote:
> > > >>
> > > >> Eric Wong <e@xxxxxxxxx> writes:
> > > >>
> > > >>
> > > >> [...]
> > > >>
> > > >> > AFAIK, filter-branch is not causing support headaches for any
> > > >> > git developers today. With so many commands in git, it's
> > > >> > unlikely newbies will ever get around to discover it :)
> > > >> > So I think think we should be in any rush to remove it.
> > > >>
> > > >> Nah, discovering it is simple. Just Google for "git change author". That
> > > >> eventually leads to a script that uses "git filter-branch --env-filter"
> > > >> to get the job done, and I'm afraid it is spread all over the world.
> > > >>
> > > >> See, e.g.:
> > > >>
> > > >> https://help.github.com/en/articles/changing-author-info
> > > >
> > > > Side note: Is the goal to "fix names and email addresses in this
> > > > repository"? If so, this guide fails: it doesn't update tagger names
> > > > or email addresses. Indeed, filter-branch doesn't provide a way to do
> > > > that. (Not to mention other problems like not updating references to
> > > > commit hashes in commit messages when it busy rewriting everything.)
> > >
> > > No. Maybe the original goal was like that, by I, personally, use
> > > modified version of this to change my "Author" credentials from
> > > "internal" to "public" in branches that I'm going to send upstream, so
> > > the actual aim is to change e-mail of particular Author from a@b to c@d
> > > in all the commits in a (feature) branch.
> >
> > There's an interesting usecase I hadn't heard of or thought of before.
>
> I'll throw in another use case that's kinda related: extracting the
> history of one file (or subdirectory).

Thanks for sending these along! I do have some comments, and a bunch
of questions...

> In my most recent instance of this, I wanted to publish the script I
> used to use for submitting patch series to the Git mailing list,
> maintaining tags for iterations and generating cover letters from branch
> descriptions and interdiffs (this script eventually became GitGitGadget,
> https://github.com/gitgitgadget/gitgitgadget/commits?after=6fb0ede48f86e729292ee1542729bc0f5a30cfa6+0
> demonstrates this).
>
> To do that, I ran a `git filter-branch` in the repository where I track
> all the scripts I deem unsuitable for public consumption, to remove all
> files but `mail-patch-series.sh`, then pushed it to
> https://github.com/dscho/mail-patch-series
>
> Please note that most crucially, I wanted to rewrite a newly-created
> branch, and only that branch.
>
> Could I have done the same using `git fast-export`, filtering the output
> with a Perl script, then passing it to `git fast-import`? Sure, I was
> really tempted to do that. In the end, it took less of _my_ time to just
> let `git filter-branch` do its work with a not-too-complicated index
> filter.

Why a perl script? Shouldn't

    git fast-export [--no-data] HEAD -- $PATH | git fast-import --force --quiet

do the trick? And it's probably simpler and shorter than the index
filter you used.

That said, yeah it'd be nice to get automatic rewriting of commit
hashes in commit messages and other niceties from filter-repo (e.g.
future automatic reattaching of notes to the rewritten commits).

Some questions:

  * What's the backup strategy in case you specify the wrong filters
    (e.g. you have a typo in the pathnames)? filter-repo encourages
    folks to make a clone and then filter the fresh clone, because if
    anything goes awry, you can just delete and restart.
    (I am heavily opposed to the refs/original/ backup mechanism used
    by filter-branch, for multiple reasons.) Is your safety stance
    just "If I mess up it's my own fault; do the rewrite"? Or are you
    okay with cloning before filtering?

  * If you're okay with cloning before filtering...then is there an
    issue with rewriting all branches, and just pushing the one you
    need? (Is there an issue with "this branch is small, the others
    are huge, and filter-branch is slow -- so rewriting one branch
    saves me lots of time"? Or are there other issues at play too?)

  * What if the user has auxiliary information for the branch in other
    refs? For example, git-notes pointing at any of the commits, or
    tags in the history of the branch that might be relevant, or
    perhaps even replace refs in combination with
    GIT_NO_REPLACE_OBJECTS=1? Is this an "I don't care, toss that
    stuff and rewrite just this branch" situation?

  * filter-repo by default creates new replace references so that you
    can refer to new commit IDs using old (unabbreviated) commit IDs.
    Would that be considered helpful for this use case? Unhelpful?
    Irrelevant, since you'll just push the branch you want somewhere
    and nuke the temporary clone?

I'm not by any means ruling out the possibility of documenting --refs
and adjusting the defaults when it is used so the user can just run
something like

    git filter-repo --path $PATH --refs $MYBRANCH

but I feel like I need to understand the answers to questions like the
ones above so that I know how to phrase warnings, adjust defaults, and
update the documentation.

> In another instance, a long, long time ago, I needed to restart a
> repository which had included way too many files for its own good, then
> rename the old repository and start with a fresh `master` that contained
> but a single commit whose tree was identical to the previous `master`'s
> tip commit. I simply grafted that commit, ran `git filter-branch` and
> had precisely what I needed.

filter-repo supports grafts and replace objects, the same as
filter-branch. (Although, technically, I didn't have to do a thing to
support it; fast-export does the special handling of rewriting based
on grafts and replace objects.) So, I'd say this is fully supported.

Side question: the git-replace documentation suggests that the graft
file is deprecated. Are there any timeframes or plans for phasing it
out, beyond what the git-replace manpage says? Should I avoid
documenting the graft file support in filter-repo? Should I include
examples using not just git-replace but also the graft file?

> I would be _delighted_ if these kinds of use case (rewriting a branch,
> or even just a commit range) became more of a first-class citizen with
> `git filter-repo`.

I've got all the pieces for supporting a single branch or a commit
range (e.g. 'git filter-repo --path foo --refs ^master~4 ^stable~23
mybranch'), but the defaults (error out unless in a bare repo, move
refs/remotes/origin/* to refs/heads/*, disconnect origin remote,
expire reflogs & repack & prune, create new replace references so
folks can access new commits using old commit IDs) may be somewhat
friction-filled for this use case. Those defaults, other than the new
replace refs, happen to all be turned off by the combination of
--force and --target, so, assuming turning them off is what you need,
you could cheat and just specify

    git filter-repo --force --target . --refs $MYBRANCH

today and perhaps get what you want, but that's a really non-intuitive
command line that is way too ugly to recommend. And I don't want to
tie myself to '--target .' being the magic sauce in the future either.