Re: What's cooking in git.git (Sep 2021, #08; Mon, 27)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Sep 27, 2021 at 11:46:40PM -0700, Elijah Newren wrote:

> > * en/remerge-diff (2021-08-31) 7 commits
> >  - doc/diff-options: explain the new --remerge-diff option
> >  - show, log: provide a --remerge-diff capability
> >  - tmp-objdir: new API for creating and removing primary object dirs
> >  - merge-ort: capture and print ll-merge warnings in our preferred fashion
> >  - ll-merge: add API for capturing warnings in a strbuf instead of stderr
> >  - merge-ort: add ability to record conflict messages in a file
> >  - merge-ort: mark a few more conflict messages as omittable
> >
> >  A new presentation for two-parent merge "--remerge-diff" can be
> >  used to show the difference between mechanical (and possibly
> >  conflicted) merge results and the recorded resolution.
> >
> >  Will merge to 'next'?
> 
> It has been a month that it's been cooking with no issues brought up,
> and it's been in production for nearly a year...
> 
> But just this morning I pinged peff and jrnieder if they might have
> time to respectively look at the tmp-objdir stuff (patch 5, plus its
> integration into log-tree.c in patch 7) and the ll-merge.[ch] changes
> (patch 3).  I don't know if either will have time to do it, but
> perhaps wait half a week or so to see if they'll mention they have
> time?  Otherwise, yeah, it's probably time to merge this down.

Sorry to take so long. I think this is a very exciting topic, and I
appreciate being called into to look at tmp-objdir stuff, because it can
be quite subtle.

I just left a rather long-ish mail in the thread, but the gist of it is
that I'm worried that there's some possibility of corruption there if
the diff code writes objects. I didn't do a proof-of-concept there, but
I worked one up just now. Try this:

  # make a repo
  git init repo
  cd repo
  echo base >file
  git add file
  git commit -m base

  # two diverging branches
  echo main >file
  git commit -am main
  git checkout -b side HEAD^
  echo side >file
  git commit -am side

  # we get a conflict, and we resolve
  git merge main
  echo resolved >file
  git commit -am merged

  # now imagine we had a file with a diff driver. I stuffed it
  # in here after the fact, but it could have been here all along,
  # or come as part of the merge, etc.
  echo whatever >unrelated
  echo "unrelated diff=foo" >.gitattributes
  git add .
  git commit --amend --no-edit

  # set up the diff driver not just to do a textconv, but to cache the
  # result. This will require writing out new objects for the cache
  # as part of the diff operation.
  git config diff.foo.textconv "$PWD/upcase"
  git config diff.foo.cachetextconv true
  cat >upcase <<\EOF &&
  #!/bin/sh
  tr a-z A-Z <$1
  EOF
  chmod +x upcase

  # now show the diff
  git log -1 --remerge-diff

  # and make sure the repo is still OK
  git fsck

The remerge diff will correctly show the textconv'd content (because
it's not in either parent, and hence an evil merge):

  diff --git a/unrelated b/unrelated
  new file mode 100644
  index 0000000..982793c
  --- /dev/null
  +++ b/unrelated
  @@ -0,0 +1 @@
  +WHATEVER

but then fsck shows that our cache is corrupted:

  Checking object directories: 100% (256/256), done.
  error: refs/notes/textconv/foo: invalid sha1 pointer 1e9b3ecffca73411001043fe914a7ec0e7291043
  error: refs/notes/textconv/foo: invalid reflog entry 1e9b3ecffca73411001043fe914a7ec0e7291043
  dangling tree 569b6e414203d2f50bdaafc1585eae11e28afc37

Now I'll admit the textconv-cache is a pretty seldom-used feature. And
that there _probably_ aren't a lot of other ways that the diff code
would try to write objects or references. But I think it speaks to the
kind of subtle interactions we should worry about (and certainly
corrupting the repository is a pretty bad outcome, though at least in
this case it's "just" a cache and we could blow away that ref manually).

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux