Re: [PATCH v8 00/11] Git filter protocol

Lars Schneider <larsxschneider@xxxxxxxxx> · Mon, 3 Oct 2016 19:35:19 +0200

> On 03 Oct 2016, at 19:02, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> 
> Lars Schneider <larsxschneider@xxxxxxxxx> writes:
> 
>>> If the filter process refuses to die forever when Git told it to
>>> shutdown (by closing the pipe to it, for example), that filter
>>> process is simply buggy.  I think we want users to become aware of
>>> that, instead of Git leaving it behind, which essentially is to
>>> sweep the problem under the rug.
>>> 
>>> I agree with what Peff said elsewhere in the thread; if a filter
>>> process wants to take time to clean things up while letting Git
>>> proceed, it can do its own process management, but I think it is
>>> sensible for Git to wait the filter process it directly spawned.
>> 
>> To realize the approach above I prototyped the run-command patch below:
>> 
>> I added an "exit_timeout" variable to the "child_process" struct.
>> On exit, Git will close the pipe to the process and wait "exit_timeout" 
>> seconds until it kills the child process. If "exit_timeout" is negative
>> then Git will wait until the process is done.
> 
>> If we use that in the long running filter process, then we could make
>> the timeout even configurable. E.g. with "filter.<driver>.process-timeout".
>> 
>> What do you think about this solution? 
> 
> Is such a configuration (or timeout in general) necessary?  I
> suspect that a need for timeout, especially needing timeout and
> needing to get killed that happens so often to require a
> configuration variable, is a sign of something else seriously wrong.
> 
> What's the justification for a filter to _require_ getting killed
> all the time when it is spawned?  Otherwise you wouldn't configure
> "this driver does not die when told, so we need a timeout" variable.
> Is it a sign of the flaw in the protocol to talk to it?  e.g. Git
> has a way to tell it to die, but it somehow is very hard to hear
> from filter's end and honor that request?
> 
> I think that we would need some timeout in the mechanism, but not to
> be used for "killing".
> 
> You would decide to "kill" an filter process in two cases: the
> filter is buggy and refuses to die when Git tells it to exit, or the
> code in Git waiting for its death is somehow miscounting its
> children, and thought it told to die one process but in fact it
> didn't (perhaps it told somebody else to die), or it thought it
> hasn't seen the child die when in fact it already did.

Agreed.

> Calling kill(2) and exiting would hide these two kind of bugs from
> end users.  Not doing so would give the end users a hung Git, which
> is a VERY GOOD thing.  Otherwise you would not notice bugs and lose
> the opportunity to diagnose and fix it.

Aha. I assumed that a hung Git because of a buggy filter would be a no-no.
Thanks for this clarification.

> The timeout would be good for you to give a message "filter process
> running the script '%s' is not exiting; I am waiting for it".  The
> user is still left with a hung Git, and can then see if that process
> is hanging around.  If it is, then we found a buggy filter.  Or we
> found a buggy Git.  Either needs to be fixed.  I do not think it
> would help anybody by doing a kill(2) to sweep possible bugs under
> the rug.

I could achieve that with this run-command patch: 
http://public-inbox.org/git/E9946E9F-6EE5-492B-B122-9078CEB88044@xxxxxxxxx/
(I'll remove the "timeout after x seconds" parts and keep the "wait until 
done" part with stderr output)

Thanks,
Lars