On Thu, Mar 28, 2019 at 7:10 PM Thomas Gummerer <t.gummerer@xxxxxxxxx> wrote: > > On 03/22, Matheus Tavares wrote: > > There is currently an odd behaviour when locally clonning a repository > > with symlinks at .git/objects: using --no-hardlinks all symlinks are > > dereferenced but without it Git will try to hardlink the files with the > > link() function, which has an OS-specific behaviour on symlinks. On OSX > > and NetBSD, it creates a hardlink to the file pointed by the symlink > > whilst on GNU/Linux, it creates a hardlink to the symlink itself. > > > > On Manjaro GNU/Linux: > > $ touch a > > $ ln -s a b > > $ link b c > > $ ls -li a b c > > 155 [...] a > > 156 [...] b -> a > > 156 [...] c -> a > > > > But on NetBSD: > > $ ls -li a b c > > 2609160 [...] a > > 2609164 [...] b -> a > > 2609160 [...] c > > > > It's not good to have the result of a local clone to be OS-dependent and > > since the behaviour on GNU/Linux may result in broken symlinks, let's > > re-implement it with linkat() instead of link() using a flag to always > > follow symlinks and make the hardlink be to the pointed file. With this, > > besides standardizing the behaviour, no broken symlinks will be > > produced. Also, add tests for symlinked files at .git/objects/. > > > > Note: Git won't create symlinks at .git/objects itself, but it's better > > to handle this case and be friendly with users who manually create them. > > > > Signed-off-by: Matheus Tavares <matheus.bernardino@xxxxxx> > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> > > Co-authored-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> > > --- > > builtin/clone.c | 2 +- > > t/t5604-clone-reference.sh | 26 +++++++++++++++++++------- > > 2 files changed, 20 insertions(+), 8 deletions(-) > > > > diff --git a/builtin/clone.c b/builtin/clone.c > > index 50bde99618..b76f33c635 100644 > > --- a/builtin/clone.c > > +++ b/builtin/clone.c > > @@ -443,7 +443,7 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, > > if (unlink(dest->buf) && errno != ENOENT) > > die_errno(_("failed to unlink '%s'"), dest->buf); > > if (!option_no_hardlinks) { > > - if (!link(src->buf, dest->buf)) > > + if (!linkat(AT_FDCWD, src->buf, AT_FDCWD, dest->buf, AT_SYMLINK_FOLLOW)) > > This line is starting to get a bit long, might be worth breaking it up > to keep to 80 characters per line. > > I notice that we are currently not using 'linkat()' anywhere else in > our codebase. It looks like it has been introduced in POSIX.1-2008, > which sounds fairly recent by git's standards. So I wonder if this is > really supported on all platforms that git is being built on. > > I also wonder what would need to be done on Windows if we were to > introduce this. I see we define the 'link()' function in > 'compat/mingw.c' for that currently, so I guess something similar > would be needed for 'linkat()'. I added Dscho to Cc for Windows > expertise. Ok, what if instead of using linkat() we use 'realpath(const char *path, char *resolved_path)', which will resolve any symlinks at 'path' and store the canonical path at 'resolved_path'? Then, we can still keep using link() but now, with the certainty that all platforms will have a consistent behaviour? (also, realpath() is POSIX.1-2001) Would that be a better idea? > While I agree with the goal of consistency accross all platforms here, > I don't know if it's actually worth going through the pain of doing > that, especially for somewhat of an edge case in local clones. > > If the test in the previous patch passes on all platforms, I'd be okay > with just calling the behaviour here undefined, especially as git > would never actually create symlinks in the .git/objects directory.