Re: [PATCH] fsmonitor OSX: fix hangs for submodules

"Koji Nakamaru via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

> From: Koji Nakamaru <koji.nakamaru@xxxxxxxx>
>
> fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
> has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets
> the value with trailing "/." (as repo_get_git_dir(the_repository) on
> Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
> IS_OUTSIDE_CONE.
>
> In this case, fsevent_callback() doesn't update cookie_list so that
> fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
> not invoked.
>
> As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
> that with_lock__mark_cookies_seen() should unlock, the whole daemon
> hangs.

The above very nicely describes the cause, the mechanism that leads
to the end-user observable effect, and the (bad) effect the bug has.

I wish everybody wrote their proposed commit messages like this ;-)

> Remove trailing "/." from state->path_gitdir_watch.buf for submodules
> and add a corresponding test in t7527-builtin-fsmonitor.sh.
>
> Helped-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
> Signed-off-by: Koji Nakamaru <koji.nakamaru@xxxxxxxx>
> ---
>     fsmonitor/darwin: fix hangs for submodules

> diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
> index 730f3c7f810..7acd074a97f 100755
> --- a/t/t7527-builtin-fsmonitor.sh
> +++ b/t/t7527-builtin-fsmonitor.sh
> @@ -82,6 +82,28 @@ have_t2_data_event () {
>  	grep -e '"event":"data".*"category":"'"$c"'".*"key":"'"$k"'"'
>  }
>  
> +start_git_in_background () {
> +	git "$@" &
> +	git_pid=$!
> +	nr_tries_left=10
> +	while true
> +	do
> +		if test $nr_tries_left -eq 0
> +		then
> +			kill $git_pid
> +			exit 1
> +		fi
> +		sleep 1
> +		nr_tries_left=$(($nr_tries_left - 1))
> +	done > /dev/null 2>&1 &

So, the command is allowed to run for 10 seconds and then a signal
is sent to the process (by the way, we do not write the SP between
">" and "/dev/null").

> +	watchdog_pid=$!
> +	wait $git_pid

And the process to ensure the command gets killed in 10 seconds is
called the "watchdog".  We let the command run to completion (and
we'd be happy if it did so without the watchdog needing to forcibly
kill it).

Which means that even after the test finishes normally (e.g., the
command completes without getting killed by the watchdog, because it
is on a fast box and finishes in 0.5 second), we have a leftover
watchdog process hanging around for 10 seconds, which might interfere
with the removal of the $TRASH_DIRECTORY at the end of the test.
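
To illustrate the concern, here is a minimal sketch of a watchdog that
is torn down as soon as the command finishes.  The helper name
run_with_watchdog and its "timeout first" calling convention are
made up for this illustration and are not part of the patch; it also
assumes that both processes are children of the calling shell.

```shell
# Hypothetical helper (not from the patch): run "$@" for at most
# $1 seconds, and leave no watchdog behind once the command is done.
run_with_watchdog () {
	timeout=$1
	shift
	"$@" &
	cmd_pid=$!
	(
		# When told to stop, take the pending sleep down with us.
		trap 'kill "$sleep_pid" 2>/dev/null; exit 0' TERM
		sleep "$timeout" &
		sleep_pid=$!
		wait "$sleep_pid"
		# The time is up; forcibly stop the command.
		kill "$cmd_pid"
	) >/dev/null 2>&1 &
	watchdog_pid=$!
	status=0
	wait "$cmd_pid" || status=$?
	# Stop the watchdog and reap it before returning.
	kill "$watchdog_pid" 2>/dev/null || :
	wait "$watchdog_pid" 2>/dev/null || :
	return $status
}
```

With something like this, "run_with_watchdog 10 git -C cloned pull"
would return the exit status of the command, or a non-zero status if
the watchdog had to kill it, and in either case the watchdog is gone
by the time the function returns.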

There is a helper function to kill both (below), which probably is
used to avoid it.  Let's keep reading.

> +}
> +
> +stop_git_and_watchdog () {
> +	kill $git_pid $watchdog_pid
> +}

This sends a signal and lets the processes die, without waiting to
make sure they have indeed died.  Only after they are gone can we
safely remove the $TRASH_DIRECTORY on filesystems that refuse to
remove a directory when a process still has it as its current
working directory.

Shouldn't it loop, like

	for pid in $git_pid $watchdog_pid
	do
		while kill -0 $pid
		do
			kill $pid
		done
	done

or something?  Or is there a mechanism already to ensure that we
return after they get killed that I am failing to find?
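
A slightly more careful sketch of the same idea, which avoids
spinning while the process is still shutting down, might look like
the following.  The helper name stop_and_wait is hypothetical, and
the sketch assumes that a process which is not a child of the
calling shell gets reaped by its own parent; otherwise "kill -0"
keeps succeeding on the zombie and the loop never ends.

```shell
# Hypothetical helper: signal each process, then block until it is gone.
stop_and_wait () {
	for pid in "$@"
	do
		kill "$pid" 2>/dev/null || :
		# "wait" reaps a child of this shell; for anything else
		# it returns right away and we fall back to polling.
		wait "$pid" 2>/dev/null || :
		while kill -0 "$pid" 2>/dev/null
		do
			sleep 1
		done
	done
}
```

It would be called as "stop_and_wait $git_pid $watchdog_pid", and
returns only after neither process exists any more.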

>  test_expect_success 'explicit daemon start and stop' '
>  	test_when_finished "stop_daemon_delete_repo test_explicit" &&
>  
> @@ -907,6 +929,23 @@ test_expect_success "submodule absorbgitdirs implicitly starts daemon" '
>  	test_subcommand git fsmonitor--daemon start <super-sub.trace
>  '
>  
> +test_expect_success "submodule implicitly starts daemon by pull" '
> +	test_atexit "stop_git_and_watchdog" &&

Hmph, this is _atexit and not _when_finished because...?

> +	test_when_finished "rm -rf cloned; \
> +			    rm -rf super; \
> +			    rm -rf sub" &&

Makes me wonder why it is not written like so:

	test_when_finished "rm -rf cloned super sub" &&

which is short enough to still fit on a line.  Is there something I
am missing that these directories must be removed separately and in
this order?

> +	create_super super &&
> +	create_sub sub &&
> +
> +	git -C super submodule add ../sub ./dir_1/dir_2/sub &&
> +	git -C super commit -m "add sub" &&
> +	git clone --recurse-submodules super cloned &&
> +
> +	git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
> +	start_git_in_background -C cloned pull --recurse-submodules
> +'

Other than that, very nicely done.

Thanks.

>  # On a case-insensitive file system, confirm that the daemon
>  # notices when the .git directory is moved/renamed/deleted
>  # regardless of how it is spelled in the FS event.
>
> base-commit: 3857aae53f3633b7de63ad640737c657387ae0c6



