> On 29 Aug 2016, at 20:05, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> larsxschneider@xxxxxxxxx writes:
>
>> From: Lars Schneider <larsxschneider@xxxxxxxxx>
>>
>> Consider the case of a file that requires filtering and is present in
>> branch A but not in branch B. If A is the current HEAD and we checkout B
>> then the following happens:
>>
>> 1. ce_compare_data() opens the file
>> 2. index_fd() detects that the file requires to run a clean filter and
>>    calls index_stream_convert_blob()
>> 4. index_stream_convert_blob() calls convert_to_git_filter_fd()
>> 5. convert_to_git_filter_fd() calls apply_filter() which creates a
>>    new long running filter process (in case it is the first file
>>    of this kind to be filtered)
>> 6. The new filter process inherits all file handles. This is the
>>    default on Linux/OSX and is explicitly defined in the
>>    `CreateProcessW` call in `mingw.c` on Windows.
>> 7. ce_compare_data() closes the file
>> 8. Git unlinks the file as it is not present in B
>>
>> The unlink operation does not work on Windows because the filter process
>> has still an open handle to the file. Apparently that is no problem on
>> Linux/OSX. Probably because "[...] the two file descriptors share open
>> file status flags" (see fork(2)).
>
> Wait, a, minute.  "that is no problem" may be true as long as "that"
> is "unlinking the now-gone file in the filesystem", but the reason
> does not have anything to do with the "open-file status flags";
> unlike Windows, you _can_ unlink file that has an open file
> descriptor on it.

I see. Thanks for the explanation.

>
> And even on POSIX systems, if you are doing a long-running helper
> any open file descriptor in the parent process when the long-running
> helper is spawned will become leaked fd.  CLOEXEC is a possible
> solution (but not necessarily the only or the best one) to the fd
> leak in this case.
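
To make sure I understand both points, here is a minimal sketch of them.
It is Python rather than Git's C, purely for brevity, and it is POSIX
only. The names unlink_while_open() and child_sees_fd() are made up for
this illustration and are not Git code. The first function shows that
unlinking a file with an open descriptor works; the second shows that a
spawned child sees the parent's descriptor unless close-on-exec is set.

```python
import os
import subprocess
import sys


def unlink_while_open(path):
    """Remove a file's name while a descriptor to it is still open (POSIX)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.unlink(path)                  # allowed on POSIX despite the open fd
        gone = not os.path.exists(path)  # the name is gone from the filesystem
        data = os.read(fd, 1024)         # the content is still readable via fd
    finally:
        os.close(fd)
    return gone, data


def child_sees_fd(fd, inheritable):
    """Spawn a child process and report whether `fd` is still open in it."""
    # inheritable=False sets the close-on-exec flag (FD_CLOEXEC) on the fd.
    os.set_inheritable(fd, inheritable)
    st = os.fstat(fd)
    # The child checks that the fd number refers to the very same file, so an
    # unrelated descriptor that happens to reuse the number does not count.
    probe = (
        "import os, sys\n"
        "st = os.fstat(int(sys.argv[1]))\n"
        "same = (st.st_dev, st.st_ino) == (int(sys.argv[2]), int(sys.argv[3]))\n"
        "sys.exit(0 if same else 1)\n"
    )
    # close_fds=False: subprocess does not close extra descriptors itself,
    # so only the close-on-exec flag decides what the child inherits.
    result = subprocess.run(
        [sys.executable, "-c", probe, str(fd), str(st.st_dev), str(st.st_ino)],
        close_fds=False,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

If I read this correctly, the long running filter process corresponds to
the inheritable=True case, which is the fd leak you describe.
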
>
> How much does the code that spawns these long-running helpers know
> about the file descriptors that happen to be open?

Nothing really.

> The parent is
> very likely to have pack windows open into .pack files and they need
> to be closed on the child side after fork(2) starts the child
> process but before execve(2) runs the helper, if we want to avoid
> file descriptor leaks.

I think I understand what you are saying. However, during my tests
.pack file fd's were never a problem. I use start_command() [1]
which wraps the fork() and exec calls [2].

How would I find the open .pack file fd's? Should I go through
/proc/PID/fd? Why is this no problem for other longer running
commands such as the git-credential-cache--daemon or git-daemon?

Thanks,
Lars

[1] https://github.com/larsxschneider/git/blob/protocol-filter/v6/convert.c#L566
[2] https://github.com/larsxschneider/git/blob/protocol-filter/v6/run-command.c#L345-L412
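
P.S. For reference, this is roughly what I mean by going through
/proc/PID/fd. It is only a sketch for debugging, written in Python, and
it is Linux only since it relies on procfs. The helper names
list_open_fds() and open_pack_fds() are hypothetical, and matching on
the ".pack" suffix is my assumption about how to spot pack files.

```python
import os


def list_open_fds(pid="self"):
    """Return {fd: target} for a process, read from /proc/<pid>/fd (Linux)."""
    fd_dir = "/proc/{}/fd".format(pid)
    result = {}
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink to whatever the descriptor refers to.
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The fd was closed between listdir() and readlink(), e.g. the
            # descriptor listdir() itself used to read the directory.
            continue
    return result


def open_pack_fds(pid="self"):
    """Keep only the descriptors whose target looks like a .pack file."""
    return {
        fd: target
        for fd, target in list_open_fds(pid).items()
        if target.endswith(".pack")
    }
```

Running open_pack_fds() with the PID of the filter process should show
whether any .pack descriptors leaked into it.
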