Re: [PATCH 11/11] test-lib: clear watchman watches at test completion

On 12/9/2019 6:40 PM, SZEDER Gábor wrote:
> On Mon, Dec 09, 2019 at 09:12:37AM -0500, Derrick Stolee wrote:
>>>> +		watchman watch-list |
>>>
>>> Then with the above fixed, trying to run 'watchman' triggers another
>>> error if it's not installed:
>>>
>>>   $ GIT_TEST_FSMONITOR="$PWD"/t7519/fsmonitor-none ./t5570-git-daemon.sh 
>>>   [...]
>>>   ok 21 - hostname interpolation works after LF-stripping
>>>   ./t5570-git-daemon.sh: 1484: ./t5570-git-daemon.sh: watchman: not found
>>>   # failed 1 among 21 test(s)
>>>
>>> I think we need an additional condition to run this only if
>>> 't7519/fsmonitor-watchman' is used in the tests.
>>
>> The intention is to enable a test-suite-wide run using GIT_TEST_FSMONITOR,
>> and that can only use watchman (currently).
> 
> I've just run 'GIT_TEST_FSMONITOR=$(pwd)/t7519/fsmonitor-all make',
> and it only failed one test in 't0090-cache-tree.sh', but the fix is
> already in 'pu' in 61eea521fe (fsmonitor: do not compare bitmap size
> with size of split index, 2019-11-13).
> 
> 
>>>> diff --git a/t/test-lib.sh b/t/test-lib.sh
>>>> index 30b07e310f..067a432ea5 100644
>>>> --- a/t/test-lib.sh
>>>> +++ b/t/test-lib.sh
>>>> @@ -1072,6 +1072,8 @@ test_atexit_handler () {
>>>>  	# sure that the registered cleanup commands are run only once.
>>>>  	test : != "$test_atexit_cleanup" || return 0
>>>>  
>>>> +	test_clear_watchman
>>>
>>> I'm not sure where to put this call, but this is definitely not the
>>> right place for it.  See that 'return 0' above in the context?  That's
>>> where the test_atexit_handler function returns early when no atexit
>>> handler commands are set, i.e. in all test scripts that don't involve
>>> some kind of daemons, thus this call is not invoked in the majority of
>>> test scripts.
>>
>> Ah, I misunderstood the point of test_atexit_handler.
>>
>>> Simply moving this call before that early return is not good, because
>>> then it would be invoked twice.
>>>
>>> An option would be to register this call as an atexit command
>>> somewhere late in 'test-lib.sh' (around where GIT_TEST_GETTEXT_POISON
>>> is restored, perhaps).  That way it would be invoked most of the time,
>>> and it would be invoked only once, but I'm not sure how it would work
>>> out with test scripts that unset GIT_TEST_FSMONITOR somewhere in the
>>> middle for the remainder of the test script.  However, register the
>>> atexit command only if GIT_TEST_FSMONITOR is set (to something
>>> watchman-specific), so it won't be invoked at all if
>>> GIT_TEST_FSMONITOR is not set, and thus it won't generate additional
>>> test output and trace.
>>>
>>> I don't have a better idea.
>>
>> Shouldn't it be sufficient to add it into test_done? If the test fails,
>> then we could leave watches open, but that's no worse than we had without
>> this test_clear_watchman method.
> 
> I don't know enough about watchman to have an informed opinion.
> 
> I think the answer mainly depends on what we want to achieve and what
> happens when a test script that was run with GIT_TEST_FSMONITOR and
> exited without invoking 'test_done' is re-executed (e.g. after a test
> case fails with '--immediate' or when the user hits ctrl-c or closes
> the terminal window mid-test).
> 
> As far as I understand the commit message of v2 of this patch [1], we
> mainly want two things:
> 
>   - Avoid overloading watchman's watch queue.  For this it might
>     indeed be sufficient to clear watches in 'test_done', because most
>     test scripts tend to succeed most of the time.
> 
>   - Make GIT_TEST_FSMONITOR work reliably on Windows.  For this, I'm
>     afraid it's not enough in general, because after a failure with
>     '--immediate' or after a ctrl-c we won't run 'test_done', so we
>     won't clear the watches, and watchman will keep the fd to the
>     trash dir open, and, consequently, will interfere with subsequent
>     executions of the same test script as it can't delete the still
>     existing trash dir left over from the previous run.

You are right. Running an individual test and ending it early would
lead to these leaked handles. This assumes someone is aware of the
GIT_TEST_FSMONITOR environment variable, so they are at least
interacting with the feature directly to some extent.

>     It could still be sufficient for fsmonitor-enabled CI builds,
>     though, because there we don't re-run tests, don't hit ctrl-c, and
>     (at least on Azure Pipelines) don't use '--immediate', and the
>     whole VM/container/whatever is thrown away at the end anyway.

This is the hope. It would be nice to get to that point.

> 
>     On Linux/Unix-y systems it probably doesn't matter much, because
>     they can delete open directories, but I wonder what happens with a
>     watch when the directory it is supposed to observe gets deleted.  If
>     the watch is removed in this case, great; if it isn't, then...
>     well, then what happens with it?  Will it be overwritten with the
>     next test run, or will there be duplicate watches for the same
>     dir?

When a directory is deleted from under Watchman on Linux, the watch
is removed...eventually. I'm not sure at exactly what point that happens.
At the very least, Watchman will receive and process the signals for all
of the paths being removed inside the directory. Running 'watch-del'
removes that overhead.
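
For illustration, a guarded cleanup along the lines discussed above might
look roughly like the sketch below. This is not the code from the patch:
the function name 'clear_watchman_watches' and its argument are
hypothetical, and the parsing assumes that 'watchman watch-list' prints
each watched root as a quoted string on its own line, without escaped
characters in the path. It skips all work when GIT_TEST_FSMONITOR does
not name a watchman-specific hook or when 'watchman' is not installed.

```shell
# Hypothetical helper, not the code from the patch under review.
# Removes every watchman watch rooted under the directory given as $1.
clear_watchman_watches () {
	trash_dir=$1

	# Do nothing unless the fsmonitor hook in use is watchman-specific.
	case "${GIT_TEST_FSMONITOR-}" in
	*watchman*) ;;
	*) return 0 ;;
	esac

	# Do nothing if watchman is not installed.
	command -v watchman >/dev/null 2>&1 || return 0

	# Assumption: each root appears as a quoted string on its own line
	# in the JSON printed by 'watchman watch-list'.  Strip the quotes
	# and the optional trailing comma, then delete matching watches.
	watchman watch-list |
	sed -n 's/^[[:space:]]*"\(.*\)",\{0,1\}$/\1/p' |
	while IFS= read -r root
	do
		case "$root" in
		"$trash_dir"*) watchman watch-del "$root" >/dev/null ;;
		esac
	done
}
```

Calling it as 'clear_watchman_watches "$TRASH_DIRECTORY"' from whichever
place we settle on would then be a no-op for runs that do not use
watchman, so it should not add errors like the "watchman: not found"
failure quoted above.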

Thanks,
-Stolee


