Re: git rebase chokes on directory -> symlink -> directory

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"H. Peter Anvin" <hpa@xxxxxxxxx> writes:

> Applying 'Revert "x86-64: Make arch/x86-64/boot a symlink to
> arch/i386/boot"'
>
> Adds trailing whitespace.
> .dotest/patch:117:FDARGS =
> Adds trailing whitespace.
> .dotest/patch:350: * Page 0 is deliberately kept safe, since System
> Management Mode code in
> Adds trailing whitespace.
> .dotest/patch:352: * useful for future device drivers that either access
> the BIOS via VM86
> Adds trailing whitespace.
> .dotest/patch:648: *
> Adds trailing whitespace.
> .dotest/patch:649: * This is a collection of several routines from
> gzip-1.0.3
> error: arch/x86_64/boot/.gitignore: already exists in working directory
> error: arch/x86_64/boot/Makefile: already exists in working directory
> error: arch/x86_64/boot/compressed/Makefile: already exists in working
> directory
> error: arch/x86_64/boot/compressed/head.S: already exists in working
> directory
> error: arch/x86_64/boot/compressed/misc.c: already exists in working
> directory
> error: arch/x86_64/boot/compressed/vmlinux.lds: already exists in
> working directory
> error: arch/x86_64/boot/compressed/vmlinux.scr: already exists in
> working directory
> error: arch/x86_64/boot/install.sh: already exists in working directory
> error: arch/x86_64/boot/mtools.conf.in: already exists in working directory
> error: arch/x86_64/boot/tools/.gitignore: already exists in working
> directory
> error: arch/x86_64/boot/tools/build.c: already exists in working directory

Ahh.

 * The tree state before this patch is applied has arch/x86_64/boot
   as a symlink pointing at ../i386/boot/

 * The patch tries to remove arch/x86_64/boot symlink, and
   create bunch of files there: .gitignore, Makefile, etc.

This is unfortunately a bit deeper than just git-rebase.  You
exposed a nasty corner case problem in the stock git-apply,
probably one of the most important tool the kernel project uses
every day.

git-apply tries to be careful while applying patches; it never
touches the working tree until it is convinced that the patch
would apply cleanly.  One of the check it does is that when it
knows a path is going to be created by the patch (in your
example, arch/x86_64/boot/.gitignore), it runs lstat() on the
path to make sure it does not exist.

This leads to a false alarm.  Because we do not touch the
working tree before all the check passes, when we try to make
sure that arch/x86_64/boot/.gitignore does not exist yet, we
haven't removed the arch/x86_64/boot symlink.  The lstat() check
ends up seeing arch/i386/boot/.gitignore through the
yet-to-be-removed symlink, and says "Hey, you already have a
file there, but what you fed me is a patch to create a new
file. I am not going to clobber what you have in the working
tree."

We have similar checks to see a file we are going to modify does
exist and match the preimage of the diff, which is done by
directly opening and reading the file.

For a file we are going to delete, we make sure that it does
exist and matches what is going to be removed (a removal patch
records the full preimage, so we check what you have in your
working tree matches it in full -- otherwise we would risk
losing your local changes), which again is done by directly
opening and reading the file.

These checks need to be adjusted so that they are not fooled by
symlinks in the middle.

 - To make sure something does not exist, first lstat().  If it
   does not exist, it does not, so be happy.  If it _does_, we
   might be getting fooled by a symlink in the middle, so break
   leading paths and see if there are symlinks involved.  When
   we are checking for a path a/b/c/d, if any of a, a/b, a/b/c
   is a symlink, then a/b/c/d does _NOT_ exist, for the purpose
   of our test.

   This would fix this particular case you saw, and would not
   add extra overhead in the usual case.

 - To make sure something already exists, first lstat().  If it
   does not exist, barf (up to this, we already do).  Even if it
   does seem to exist, we might be getting fooled by a symlink
   in the middle, so make sure leading paths are not symlinks.

   This would make the normal codepath much more expensive for
   deep trees, which is a bit worrisome.

When the code writes out the result after the checks pass, we
first delete all the paths that are going to be deleted, and
also delete all the paths that are being modified.  Then in the
second pass, we create all the paths that are being created and
also modified by writing their final contents out, while
creating leading directories as needed.  Because of this, a
patch that removes a symlink and then makes it a directory to
hold new files would first remove the symlink and then create
the directory at the right location and deposit newly created
files there.  So I do not expect any change is needed in the
writeout codepath.

By the way, I noticed a few things while diagnosing this.

 * It was very considerate of you to leave "rebase-1" branch for
   my postmortem in the repository.  You know what are needed
   for debugging very well.

 * git-rebase with -m is dog slow.  There were people who
   advocated to make it the default, but they probably are
   either working in a very small project, or working on a
   filesystem that even git-apply is slow that the speed
   difference does not matter to them.


-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux