Re: [PATCH v2 7/7] t/helper/simple-ipc: convert test-simple-ipc to use start_bg_command

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Adam,

On Thu, 4 Nov 2021, Adam Dinwoodie wrote:

> On Monday 20 September 2021 at 03:36 pm +0000, Jeff Hostetler via GitGitGadget wrote:
> > From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
> >
> > Convert test helper to use `start_bg_command()` when spawning a server
> > daemon in the background rather than blocks of platform-specific code.
> >
> > Also, while here, remove _() translation around error messages since
> > this is a test helper and not Git code.
>
> As part of testing the v2.34.0-rc0 release candidate on Cygwin, I've
> found this commit -- 05881a6fc9 (t/helper/simple-ipc: convert
> test-simple-ipc to use start_bg_command, 2021-09-20), according to my
> bisecting -- is causing t0052.1 to fail on 32-bit Cygwin.  Somehow this
> is only affecting the 32-bit Cygwin build; the 64-bit build is working
> as expected.
>
> Specifically, the failure I'm seeing is as below:
>
> ```
> $ GIT_TRACE=1 sh t0052-simple-ipc.sh -vix
> trace: built-in: git init '/cygdrive/d/git-32/t/trash directory.t0052-simple-ipc'
> Initialized empty Git repository in /cygdrive/d/git-32/t/trash directory.t0052-simple-ipc/.git/
> expecting success of 0052.1 'start simple command server':
> 	test_atexit stop_simple_IPC_server &&
> 	test-tool simple-ipc start-daemon --threads=8 &&
> 	test-tool simple-ipc is-active
>
> ++ test_atexit stop_simple_IPC_server
> ++ test 0 = 0
> ++ test_atexit_cleanup='{ stop_simple_IPC_server
> 		} && (exit "$eval_ret"); eval_ret=$?; :'
> ++ test-tool simple-ipc start-daemon --threads=8
> trace: run_command: test-tool simple-ipc run-daemon --name=ipc-test --threads=8
> error: daemon failed to start
> error: last command exited with $?=1
> not ok 1 - start simple command server
> #
> #		test_atexit stop_simple_IPC_server &&
> #		test-tool simple-ipc start-daemon --threads=8 &&
> #		test-tool simple-ipc is-active
> #
> ++ stop_simple_IPC_server
> ++ test-tool simple-ipc stop-daemon
> ++ exit 1
> ++ eval_ret=1
> ++ :
> ```
>
> I've had a look at the code changes, and cannot work out what might be
> being handled differently in 32-bit and 64-bit Cygwin environments.
> Given the Cygwin project is considering dropping support for 32-bit
> Cygwin anyway, it might not be worth doing anything about this.  But I
> thought it worth reporting in case there's something obvious to folk
> more familiar with this code.

I had a look at this and could reproduce... partially. I only managed to
make it fail every once in a while.

Digging deeper, it turns out that the `lstat()` call in
`ipc_get_active_state()` does not receive an `st_mode` indicating a
socket, but rather a file (in my tests, it was usually 0100644, but
sometimes even 0100755).

The reason is, of course, that Cygwin _emulates_ Unix sockets. What is in
the file system is just a special file, it is marked with the `system` bit
(which only exists on Windows), and its contents start with the tell-tale
`!<socket>`.

And as you might have guessed, there is a race going on between Cygwin
writing that file _and_ flipping that `system` bit, and Git trying to
access the Unix socket and encountering an unexpected file.

Now, why this only happens in your 32-bit setup, I have no idea.

In my tests, the following patch works around the issue. Could I ask you
to test it in your environment?

-- snip --
diff --git a/compat/simple-ipc/ipc-unix-socket.c
b/compat/simple-ipc/ipc-unix-socket.c
index 4e28857a0a..1c591b2adf 100644
--- a/compat/simple-ipc/ipc-unix-socket.c
+++ b/compat/simple-ipc/ipc-unix-socket.c
@@ -36,6 +36,23 @@ enum ipc_active_state ipc_get_active_state(const char
*path)
 	}

 	/* also complain if a plain file is in the way */
+#ifdef __CYGWIN__
+	{
+		static const int delay[] = { 1, 10, 20, 40, -1 };
+		int i;
+
+		for (i = 0; S_ISREG(st.st_mode) && delay[i] > 0; i++) {
+			/*
+			 * Cygwin might still be in the process of marking the
+			 * underlying file as a system file.
+			 */
+			sleep_millisec(delay[i]);
+			if (lstat(path, &st) == -1)
+				return IPC_STATE__INVALID_PATH;
+		}
+	}
+#endif
+
 	if ((st.st_mode & S_IFMT) != S_IFSOCK)
 		return IPC_STATE__INVALID_PATH;

-- snap --

FWIW it looks as if the loop might be a bit of an overkill, as I could not
get the code to need more than a single one-millisecond sleep, but it's
probably safer to just keep the delay loop in place as-is.

Ciao,
Dscho




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux