Re: [PATCH v2] fsmonitor OSX: fix hangs for submodules

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 1, 2024 at 8:57 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:

> > fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
> > has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets
> > the value with trailing "/." (as repo_get_git_dir(the_repository) on
> > Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
> > IS_OUTSIDE_CONE.
> >
> > In this case, fsevent_callback() doesn't update cookie_list so that
> > fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
> > not invoked.
> >
> > As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
> > that with_lock__mark_cookies_seen() should unlock, the whole daemon
> > hangs.
> >
> > Remove trailing "/." from state->path_gitdir_watch.buf for submodules
> > and add a corresponding test in t7527-builtin-fsmonitor.sh.
> >
> > Suggested-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
> > Suggested-by: Junio C Hamano <gitster@xxxxxxxxx>
>
> In none of the changes described above, I have any input to deserve
> such credit, though.

Your points are very helpful :)

> > +start_git_in_background () {
> > + git "$@" &
> > + git_pid=$!
> > + git_pgid=$(ps -o pgid= -p $git_pid)
> > + nr_tries_left=10
> > + while true
> > + do
> > + if test $nr_tries_left -eq 0
> > + then
> > + kill -- -$git_pgid
> > + exit 1
> > + fi
> > + sleep 1
> > + nr_tries_left=$(($nr_tries_left - 1))
> > + done >/dev/null 2>&1 &
> > + watchdog_pid=$!
> > + wait $git_pid
> > +}
> > +
> > +stop_git () {
> > + while kill -0 -- -$git_pgid
> > + do
> > + kill -- -$git_pgid
> > + sleep 1
> > + done
> > +}
>
> On the "git" side you use process group because you expect that
> "git" would spawn subprocesses and you want to catch all of them,
> ...
>
> > +stop_watchdog () {
> > + while kill -0 $watchdog_pid
> > + do
> > + kill $watchdog_pid
> > + sleep 1
> > + done
> > +}
>
> ... but "watchdog" you know is a single process, so you'd only need
> a single process id, is that the idea?

Yes, that is the idea.

> What is the motivation behind the change in this iteration to use
> process group?  Was it observed that leftover processes hang around
> if we killed only the $git_pid, or something?

Yes, if the issue occurs, three processes remains:

  git fetch --update-head-ok --recurse-submodules=on

  git fetch --no-prune --no-prune-tags --update-head-ok
    --recurse-submodules --recurse-submodules-default yes
    --submodule-prefix=dir_1/dir_2/sub/

  git fsmonitor--daemon run --detach --ipc-threads=8

If there is no issue, only the fsmonitor process remains.

> > +test_expect_success "submodule implicitly starts daemon by pull" '
> > + test_atexit "stop_watchdog" &&
> > + test_when_finished "stop_git && rm -rf cloned super sub" &&
>
> If stop_git ever returns with non-zero status, "rm -rf" will be
> skipped, which I am not sure is a good idea.
>
> The whole test_when_finished would fail in such a case, so you would
> notice the problem right away, which is a plus, though.

t/README discusses that test_when_finished and test_atexit differ about
the "--immediate" option. As git and its subprocesses are the test
target, I moved stop_git to the current place. This might be however
confusing when someone later reads this test. Should we simply put
stop_git and stop_watchdong in test_atexit?

> > + create_super super &&
> > + create_sub sub &&
> > +
> > + git -C super submodule add ../sub ./dir_1/dir_2/sub &&
> > + git -C super commit -m "add sub" &&
> > + git clone --recurse-submodules super cloned &&
> > +
> > + git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
> > + set -m &&
>
> I have to wonder how portable (and necessary) this is.
>
> POSIX says it shall be supported if the implementation supports the
> User Portability Utilities option.  It also says that it was added
> to apply only to the UPE because it applies primarily to interactive
> use, not shell script applications.  And our test scripts are of
> course not interactive.

How about the following modification? It still utilizes $git_pgid to
filter processes, but avoids "set -m".

  diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
  index 2dd1ca1a7b..23d9a7c953 100755
  --- a/t/t7527-builtin-fsmonitor.sh
  +++ b/t/t7527-builtin-fsmonitor.sh
  @@ -916,7 +916,7 @@ start_git_in_background () {
          do
                  if test $nr_tries_left -eq 0
                  then
  -                       kill -- -$git_pgid
  +                       kill $git_pid
                          exit 1
                  fi
                  sleep 1
  @@ -927,10 +927,13 @@ start_git_in_background () {
   }

   stop_git () {
  -       while kill -0 -- -$git_pgid
  +       for p in $(ps -o pgid=,pid=,comm= | grep "^$git_pgid .*git"
| sed 's/^[0-9][0-9]* \([0-9][0-9]*\) .*/\1/')
          do
  -               kill -- -$git_pgid
  -               sleep 1
  +               while kill -0 $p
  +               do
  +                       kill $p
  +                       sleep 1
  +               done
          done
   }

  @@ -954,7 +957,6 @@ test_expect_success "submodule implicitly starts
daemon by pull" '
          git clone --recurse-submodules super cloned &&

          git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
  -       set -m &&
          start_git_in_background -C cloned pull --recurse-submodules
   '


Koji Nakamaru





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux