Re: [PATCH v6 13/13] read-cache: make sure file handles are not inherited by child processes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On 29 Aug 2016, at 20:05, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> 
> larsxschneider@xxxxxxxxx writes:
> 
>> From: Lars Schneider <larsxschneider@xxxxxxxxx>
>> 
>> Consider the case of a file that requires filtering and is present in
>> branch A but not in branch B. If A is the current HEAD and we checkout B
>> then the following happens:
>> 
>> 1. ce_compare_data() opens the file
>> 2.   index_fd() detects that the file requires to run a clean filter and
>>     calls index_stream_convert_blob()
>> 4.     index_stream_convert_blob() calls convert_to_git_filter_fd()
>> 5.       convert_to_git_filter_fd() calls apply_filter() which creates a
>>         new long running filter process (in case it is the first file
>>         of this kind to be filtered)
>> 6.       The new filter process inherits all file handles. This is the
>>         default on Linux/OSX and is explicitly defined in the
>>         `CreateProcessW` call in `mingw.c` on Windows.
>> 7. ce_compare_data() closes the file
>> 8. Git unlinks the file as it is not present in B
>> 
>> The unlink operation does not work on Windows because the filter process
>> has still an open handle to the file. Apparently that is no problem on
>> Linux/OSX. Probably because "[...] the two file descriptors share open
>> file status flags" (see fork(2)).
> 
> Wait, a, minute.  "that is no problem" may be true as long as "that"
> is "unlinking the now-gone file in the filesystem", but the reason
> does not have anything to do with the "open-file status flags";
> unlike Windows, you _can_ unlink file that has an open file
> descriptor on it.

I see. Thanks for the explanation.

> 
> And even on POSIX systems, if you are doing a long-running helper
> any open file descriptor in the parent process when the long-running
> helper is spawned will become leaked fd.  CLOEXEC is a possible
> solution (but not necessarily the only or the best one) to the fd
> leak in this case.
> 
> How much does the code that spawns these long-running helpers know
> about the file descriptors that happen to be open?

Nothing really.

>  The parent is
> very likely to have pack windows open into .pack files and they need
> to be closed on the child side after fork(2) starts the child
> process but before execve(2) runs the helper, if we want to avoid
> file descriptor leaks.

I think I understand what you are saying. However, during my tests
.pack file fd's were never a problem. I use start_command() [1]
which wraps the fork() and exec calls [2].

How would I find the open .pack file fd's? Should I go through 
/proc/PID/fd? Why is this no problem for other longer running 
commands such as the git-credential-cache--daemon or git-daemon?

Thanks,
Lars


[1] https://github.com/larsxschneider/git/blob/protocol-filter/v6/convert.c#L566
[2] https://github.com/larsxschneider/git/blob/protocol-filter/v6/run-command.c#L345-L412






[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]