Re: renormalize history with smudge/clean-filter

Hi Phillip,

On Sat, Feb 8, 2025 at 3:15 AM Phillip Wood <phillip.wood123@xxxxxxxxx> wrote:
>
> Hi Elijah and Josef
>
> On 08/02/2025 00:23, Elijah Newren wrote:
> > On Fri, Feb 7, 2025 at 12:34 PM Josef Wolf <jw@xxxxxxxxxxxxx> wrote:
> >> On Fri, Feb 07, 2025 at 06:01:43AM -0800, Elijah Newren wrote:
> >>> On Fri, Feb 7, 2025 at 3:13 AM Chris Torek <chris.torek@xxxxxxxxx> wrote:
> >>
> > I also see I didn't look closely enough at Phillip's
> > suggestion, which was:
> >
> >     git rebase --root -x 'git add --renormalize . && { git diff --quiet --cached || git commit --amend --no-edit; }'
> >
> > which will work if you do a lot of manual work to resolve line ending
> > difference conflicts.  Since the git add at each step will modify the
> > files on which the next commit is based, that causes the application
> > of the subsequent commit to conflict,
>
> Indeed, I'd missed that (like you I've not actually used any
> smudge/clean filters)
>
> > and you probably will have
> > difficulty seeing those conflicts since they tend to just be line
> > ending differences.  But, mixing that with Brian's suggestion, you
> > get:
> >
> >    git rebase --root -X renormalize -x 'git add --renormalize . && { git diff --quiet --cached || git commit --amend --no-edit; }'
> >
> > which should probably work if you have a linear history
>
> I've tried that out with a small modification in the script below which
> seems to work. The modification is to add "--attr-source=$(git rev-parse
> HEAD)" between "git" and "rebase" so that git always has a
> .gitattributes file to read when rebasing commits that were made before
> that file was added.

Ooh, nice catch.  If folks had an appropriate .gitattributes file in
place in older versions of history, they probably wouldn't have gotten
into the mess.

> I wonder if we should add something about
> renormalizing a repository to the FAQ based on your footnote.

and perhaps your helpful example?  (although it does assume linear history)  :-)

>  > [1] The renormalize option to the merge machinery ensures that new
>  > blobs produced by the merge have normalized content, and avoids
>  > conflicts when the only differences between files are normalization
>  > ones.  This option does not ensure that new trees only reference new
>  > content nor that they only reference normalized content; _any_
>  > pre-existing blobs in the repository are fair game for new trees to
>  > reference.  As per the manual: "renormalize...This runs a virtual
>  > check-out and check-in of all three stages of a file when resolving a
>  > three-way merge..."  So, the existing behavior of the renormalize
>  > option to rebase/cherry-pick/merge is correct.  It may not be what you
>  > want, but I don't think cherry-picking/rebasing/merging with the
>  > renormalize option is the right tool for this job.
>  >
>
> Best Wishes
>
> Phillip
>
> --- >8 ---
> #!/bin/sh
> set -e
> d="$(mktemp -d)"
> cd "$d"
> git init
> echo "The   quick  brown" >file
> git add file
> git commit -m line-1
> echo "fox  jumps    over" >>file
> git commit -a -m line-2
> echo "the      lazy   dog" >>file
> git commit -a -m line-3
> echo "file filter=space" >.gitattributes
> git config filter.space.clean "sed -e 's/  */ /g'"
> git config filter.space.smudge cat
> git add .gitattributes
> git commit -a -m 'add .gitattributes'
> git reset --hard HEAD
> git --attr-source=$(git rev-parse HEAD) rebase --root -X renormalize \
>      -x 'git add --renormalize . && { git diff --cached --quiet || git commit --amend --no-edit; }'

So, I'm slightly surprised here.  Does the --attr-source specified to
the outer git become an environment variable or something for the
inner git-add invocation?  How does the git add subprocess know about
it?

...<does some searches ending with>...

$ git grep -5 GIT_ATTR_SOURCE -- git.c
git.c-          } else if (!strcmp(cmd, "--attr-source")) {
git.c-                  if (*argc < 2) {
git.c-                          fprintf(stderr, _("no attribute source given for --attr-source\n" ));
git.c-                          usage(git_usage_string);
git.c-                  }
git.c:                  setenv(GIT_ATTR_SOURCE_ENVIRONMENT, (*argv)[1], 1);
git.c-                  if (envchanged)
git.c-                          *envchanged = 1;
git.c-                  (*argv)++;
git.c-                  (*argc)--;
git.c-          } else if (skip_prefix(cmd, "--attr-source=", &cmd)) {
git.c-                  set_git_attr_source(cmd);
git.c:                  setenv(GIT_ATTR_SOURCE_ENVIRONMENT, cmd, 1);
git.c-                  if (envchanged)
git.c-                          *envchanged = 1;
git.c-          } else if (!strcmp(cmd, "--no-advice")) {
git.c-                  setenv(GIT_ADVICE_ENVIRONMENT, "0", 1);
git.c-                  if (envchanged)

ahah, so it is passed via environment variable to the subprocess.

Anyway, nice catch.




