On Sat, Feb 23 2019, Matheus Tavares wrote: > Replace usage of opendir/readdir/closedir API to traverse directories > recursively, at copy_or_link_directory function, by the dir-iterator > API. This simplifies the code and avoid recursive calls to > copy_or_link_directory. Sounds good in principle. > This process also brings some safe behaviour changes to > copy_or_link_directory: I ad-hoc tested some of these, and could spot behavior changes. We should have tests for these. > - It will no longer follows symbolic links. This is not a problem, > since the function is only used to copy .git/objects directory, and > symbolic links are not expected there. I don't think we should make that assumption, and I don't know of anything else in git that does. I've certainly symlinked individual objects or packs into a repo for debugging / recovery, and it would be unexpected to clone that and miss something. So in the general case we should be strict in what we generate, but permissive in what we accept. We don't want a "clone" of an existing repo to fail, or "fsck" to fail after clone... When trying to test this I made e.g. objects/c4 a symlink to /tmp/c4, and a specific object in objects/4d/ a symlink to /tmp too. Without this patch the individual object is still a symlink, but the object under the directory gets resolved, and "un-symlinked", also with --dissociate, which seems like an existing bug. With your patch that symlink structure is copied as-is. That's more faithful under --local, but a regression for --dissociate (which didn't work fully to begin with...). I was paranoid that "no longer follows symbolic links" could also mean "will ignore those objects", but it seems to more faithfully copy things as-is for *that* case. But then I try with --no-hardlinks and stock git dereferences my symlink structure, but with your patches fails completely: Cloning into bare repository 'repo2'... error: copy-fd: read returned: Is a directory fatal: failed to copy file to 'repo2/objects/c4': Is a directory fatal: the remote end hung up unexpectedly fatal: cannot change to 'repo2': No such file or directory So there's at least one case in a few minutes of prodding this where we can't clone a working repo now, however obscure the setup. > - Hidden directories won't be skipped anymore. In fact, it is odd that > the function currently skip hidden directories but not hidden files. > The reason for that could be unintentional: probably the intention > was to skip '.' and '..' only, but it ended up accidentally skipping > all directories starting with '.'. Again, it must not be a problem > not to skip hidden dirs since hidden dirs/files are not expected at > .git/objects. I reproduce this with --local. A ".foo" isn't copied before, now it is. Good, I guess. We'd have already copied a "foo". > - Now, copy_or_link_directory will call die() in case of an error on > openddir, readdir or lstat, inside dir_iterator_advance. That means > it will abort in case of an error trying to fetch any iteration > entry. Good, but really IMNSHO this series is tweaking some critical core code and desperately needs tests. Unfortunately, in this as in so many edge case we have no existing tests. This would be much easier to review and would give reviewers more confidence if the parts of this that changed behavior started with a patch or patches that just manually objects/ dirs with various combinations of symlinks, hardlinks etc., and asserted that the various options did exactly what they're doing now, and made sure the source/target repos were the same after/both passed "fsck". Then followed by some version of this patch which changes the behavior, and would be forced to tweak those tests. To make it clear e.g. that some cases where we have a working "clone" are now a hard error. Thanks for working on this!