[PATCH v2 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I reported a couple of times that t5552 is not passing reliably. It has now
reached next, and will no doubt infect master soon.

Turns out that it is not a Windows-specific issue, even if it occurs a lot 
more often on Windows than elsewhere. (And even if it is apparently
impossible to trigger on Linux.)

The culprit is that two processes try to write simultaneously to the same
file specified via GIT_TRACE_PACKET, and it is not well defined how that
should work, even when only looking at the POSIX specification (the
documentation of write()
[http://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html] says
"Applications should use some form of concurrency control.").

This patch series addresses that by locking the trace fd. I chose to use 
fcntl() for the Unix(-y) platforms we support instead of flock() (even if
the latter has much simpler semantics) because fcntl() is in POSIX while 
flock() is not. On Windows, I use the Win32 API function LockFileEx()
[https://docs.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-lockfileex]
.

Of course, I have to admit that I am not super solid on the fcntl() 
semantics. Junio was nice enough to educate me on l_len = 0 meaning the
entire file. If anybody has more experience with locking file descriptors
referring not to files, but, say, to pipes or to an interactive terminal, I
would be very thankful for help (while the POSIX documentation states that
the errno should be EINVAL on a file descriptor that cannot be locked,
apparently macOS sets errno to EBADF when trying to lock a redirected stdout
)?

Changes since v1:

 * Touched up the cover letter and the first commit message (in particular,
   removed the bogus squash! line giving away just how much I rely on the 
   --autosquash feature these days).
 * Now we're locking the entire file via l_len = 0 (except on Windows).
 * To cover all bases, EINVAL is now also treated as "cannot lock this fd"
   (in addition to EBADF).
 * Removed some superfluous flock()-related left-overs from a previous
   attempt (that caused a lot of me fighting with Linux).

Johannes Schindelin (4):
  Introduce a function to lock/unlock file descriptors when appending
  mingw: implement lock_or_unlock_fd_for_appending()
  trace: lock the trace file to avoid racy trace_write() calls
  trace: verify that locking works

 Makefile               |   1 +
 compat/mingw.c         |  19 ++++++
 compat/mingw.h         |   3 +
 git-compat-util.h      |   2 +
 t/helper/test-tool.c   |   1 +
 t/helper/test-tool.h   |   1 +
 t/helper/test-trace.c  | 130 +++++++++++++++++++++++++++++++++++++++++
 t/t0070-fundamental.sh |   6 ++
 trace.c                |  11 +++-
 wrapper.c              |  16 +++++
 10 files changed, 189 insertions(+), 1 deletion(-)
 create mode 100644 t/helper/test-trace.c


base-commit: 42cc7485a2ec49ecc440c921d2eb0cae4da80549
Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-17%2Fdscho%2Ffetch-negotiator-skipping-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-17/dscho/fetch-negotiator-skipping-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/17

Range-diff vs v1:

 1:  e449ed75f ! 1:  a19904682 Introduce a function to lock/unlock file descriptors when appending
     @@ -30,7 +30,7 @@
          added in this commit.
      
          But fcntl() allows a lot more versatile file region locking that we
     -    actually need, to by necessity the fcntl() emulation would be quite
     +    actually need, so by necessity the fcntl() emulation would be quite
          complex: To support the `l_whence = SEEK_CUR` (which we would use, if it
          did not require so much book-keeping due to our writing between locking
          and unlocking the file), we would have to call `SetFilePointerEx()`
     @@ -47,18 +47,16 @@
          to do, as we would once again fail to have an abstraction that clearly
          says what *exact*functionality we want to use.
      
     -    To set a precedent for a better approach, let's introduce a proper
     -    abstraction: a function that says in its name precisely what Git
     -    wants it to do (as opposed to *how* it does it on Linux):
     -    lock_or_unlock_fd_for_appending().
     +    This commit expressly tries to set a precedent for a better approach:
     +    Let's introduce a proper abstraction: a function that says in its name
     +    precisely what Git wants it to do (as opposed to *how* it does it on
     +    Linux): lock_or_unlock_fd_for_appending().
      
          The next commit will provide a Windows-specific implementation of this
          function/functionality.
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
      
     -    squash! Introduce a function to lock/unlock file descriptors when appending
     -
      diff --git a/git-compat-util.h b/git-compat-util.h
      --- a/git-compat-util.h
      +++ b/git-compat-util.h
     @@ -86,9 +84,11 @@
      +	struct flock flock;
      +
      +	flock.l_type = lock_it ? F_WRLCK : F_UNLCK;
     ++
     ++	/* (un-)lock the whole file */
      +	flock.l_whence = SEEK_SET;
      +	flock.l_start = 0;
     -+	flock.l_len = 0xffffffff; /* arbitrary number of bytes */
     ++	flock.l_len = 0;
      +
      +	return fcntl(fd, F_SETLKW, &flock);
      +}
 2:  8dda2d530 ! 2:  fa0c282aa mingw: implement lock_or_unlock_fd_for_appending()
     @@ -54,24 +54,3 @@
       #define has_dos_drive_prefix(path) \
       	(isalpha(*(path)) && (path)[1] == ':' ? 2 : 0)
       int mingw_skip_dos_drive_prefix(char **path);
     -
     -diff --git a/config.mak.uname b/config.mak.uname
     ---- a/config.mak.uname
     -+++ b/config.mak.uname
     -@@
     - 	NO_POSIX_GOODIES = UnfortunatelyYes
     - 	NATIVE_CRLF = YesPlease
     - 	DEFAULT_HELP_FORMAT = html
     -+	HAVE_FLOCK = YesWeEmulate
     - 
     - 	CC = compat/vcbuild/scripts/clink.pl
     - 	AR = compat/vcbuild/scripts/lib.pl
     -@@
     - 	NO_INET_NTOP = YesPlease
     - 	NO_POSIX_GOODIES = UnfortunatelyYes
     - 	DEFAULT_HELP_FORMAT = html
     -+	HAVE_FLOCK = YesWeEmulate
     -+
     - 	COMPAT_CFLAGS += -DNOGDI -Icompat -Icompat/win32
     - 	COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\"
     - 	COMPAT_OBJS += compat/mingw.o compat/winansi.o \
 3:  a53e72198 ! 3:  1f5c15c7e trace: lock the trace file to avoid racy trace_write() calls
     @@ -14,6 +14,17 @@
          To remedy this, let's lock the file descriptors for exclusive writing,
          using the function we just introduced in the previous commit.
      
     +    Note: while the POSIX documentation of fcntl() at
     +    http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html
     +    suggests that the `errno` is set to `EINVAL` when being asked to
     +    lock a file descriptor that cannot be locked, on macOS it results in
     +    `EBADF` when trying to lock a redirected `stdout` (which the
     +    documentation claims should indicate that the file descriptor is not
     +    valid for writing).
     +
     +    To cover all our bases, we simply treat both `EINVAL` and `EBADF` as
     +    indicators that we cannot lock/unlock this file descriptor.
     +
          Signed-off-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
      
      diff --git a/trace.c b/trace.c
     @@ -27,7 +38,7 @@
      +	int fd = get_trace_fd(key), locked;
      +
      +	locked = !lock_or_unlock_fd_for_appending(fd, 1);
     -+	if (!locked && errno != EBADF)
     ++	if (!locked && errno != EBADF && errno != EINVAL)
      +		warning("unable to lock file descriptor for %s: %s",
      +			key->key, strerror(errno));
      +	if (write_in_full(fd, buf, len) < 0) {
 4:  f57234154 ! 4:  38358ac74 trace: verify that locking works
     @@ -131,13 +131,13 @@
      +		strbuf_rtrim(&buf);
      +		if (!strcmp("lock", buf.buf)) {
      +			if (lock_or_unlock_fd_for_appending(fd, 1) < 0 &&
     -+			    errno != EBADF)
     ++			    errno != EBADF && errno != EINVAL)
      +				die_errno("'%s': lock", child_name);
      +			ACK("lock");
      +			locked = 1;
      +		} else if (!strcmp("unlock", buf.buf)) {
      +			if (lock_or_unlock_fd_for_appending(fd, 0) < 0 &&
     -+			    errno != EBADF)
     ++			    errno != EBADF && errno != EINVAL)
      +				die_errno("'%s': unlock", child_name);
      +			ACK("unlock");
      +			locked = 0;

-- 
gitgitgadget



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux