Re: [PATCH] fsmonitor: option to allow fsmonitor to run against network-mounted repos

Hi Eric,

On Thu, 18 Aug 2022, Eric DeCosta via GitGitGadget wrote:

> From: Eric DeCosta <edecosta@xxxxxxxxxxxxx>
>
> Though perhaps not common, there are use cases where users have large,
> network-mounted repos. Having the ability to run fsmonitor against
> network paths would benefit those users.
>
> As a first step towards enabling fsmonitor to work against
> network-mounted repos, a configuration option, 'fsmonitor.allowRemote'
> was introduced for Windows.

If you start the commit message along the following lines, it might be
easier/quicker to grok the context for the keen reader:

	In 85dc0da6dcf (fsmonitor: option to allow fsmonitor to run against
	network-mounted repos, 2022-08-11), the Windows backend of the
	FSMonitor learned to allow running on network drives, via the
	`fsmonitor.allowRemote` config setting.

> Setting this option to true will override the default behavior
> (erroring-out) when a network-mounted repo is detected by fsmonitor. In
> order for macOS to have parity with Windows, the same option is now
> introduced for macOS.
>
> The added wrinkle being that the Unix domain socket (UDS) file
> used for IPC cannot be created in a network location; instead the
> temporary directory is used.

Thank you very much for this note. After a cursory read, I expected that
part of the code to be a left-over from some "We know better than the
user" type of automatic default, and this paragraph definitely helped me
overcome that expectation.

>
> Signed-off-by: Eric DeCosta <edecosta@xxxxxxxxxxxxx>
> ---
>     fsmonitor: option to allow fsmonitor to run against network-mounted
>     repos
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1326%2Fedecosta-mw%2Ffsmonitor_macos-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1326/edecosta-mw/fsmonitor_macos-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1326
>
>  compat/fsmonitor/fsm-settings-darwin.c | 77 ++++++++++++++++++++++----
>  fsmonitor-ipc.c                        | 47 +++++++++++++++-
>  fsmonitor-ipc.h                        |  6 ++
>  3 files changed, 117 insertions(+), 13 deletions(-)

I am somewhat puzzled that this has no corresponding change to
`Documentation/`.

And now I realize that this was the case also for the patch adding
`fsmonitor.allowRemote` support for Windows.

Could I ask you to add a patch to document this config setting?

>
> diff --git a/compat/fsmonitor/fsm-settings-darwin.c b/compat/fsmonitor/fsm-settings-darwin.c
> index efc732c0f31..9e2ea3b90cc 100644
> --- a/compat/fsmonitor/fsm-settings-darwin.c
> +++ b/compat/fsmonitor/fsm-settings-darwin.c
> @@ -2,10 +2,28 @@
>  #include "config.h"
>  #include "repository.h"
>  #include "fsmonitor-settings.h"
> +#include "fsmonitor-ipc.h"
>  #include "fsmonitor.h"
>  #include <sys/param.h>
>  #include <sys/mount.h>
>
> +/*
> + * Check if monitoring remote working directories is allowed.
> + *
> + * By default, monitoring remote working directories is
> + * disabled.  Users may override this behavior in enviroments where
> + * they have proper support.
> + */
> +static int check_config_allowremote(struct repository *r)
> +{
> +	int allow;
> +
> +	if (!repo_config_get_bool(r, "fsmonitor.allowremote", &allow))
> +		return allow;
> +
> +	return -1; /* fsmonitor.allowremote not set */
> +}
> +
>  /*
>   * [1] Remote working directories are problematic for FSMonitor.
>   *
> @@ -27,24 +45,22 @@
>   * In theory, the above issues need to be addressed whether we are
>   * using the Hook or IPC API.
>   *
> + * So (for now at least), mark remote working directories as
> + * incompatible by default.
> + *

This was moved up, okay.

>   * For the builtin FSMonitor, we create the Unix domain socket for the
> - * IPC in the .git directory.  If the working directory is remote,
> - * then the socket will be created on the remote file system.  This
> - * can fail if the remote file system does not support UDS file types
> - * (e.g. smbfs to a Windows server) or if the remote kernel does not
> - * allow a non-local process to bind() the socket.  (These problems
> - * could be fixed by moving the UDS out of the .git directory and to a
> - * well-known local directory on the client machine, but care should
> - * be taken to ensure that $HOME is actually local and not a managed
> - * file share.)
> + * IPC in the temporary directory.  If the temporary directory is

This is incorrect. It is still the `.git` directory in the common case,
not a temporary directory.

> + * remote, then the socket will be created on the remote file system.
> + * This can fail if the remote file system does not support UDS file
> + * types (e.g. smbfs to a Windows server) or if the remote kernel does
> + * not allow a non-local process to bind() the socket.
>   *
> - * So (for now at least), mark remote working directories as
> - * incompatible.
> + * Therefore remote UDS locations are marked as incompatible.
>   *
>   *
>   * [2] FAT32 and NTFS working directories are problematic too.

Doesn't this patch address this, too? See below for more on that.

>   *
> - * The builtin FSMonitor uses a Unix domain socket in the .git
> + * The builtin FSMonitor uses a Unix domain socket in the temporary
>   * directory for IPC.  These Windows drive formats do not support
>   * Unix domain sockets, so mark them as incompatible for the daemon.
>   *
> @@ -65,6 +81,39 @@ static enum fsmonitor_reason check_volume(struct repository *r)
>  			 "statfs('%s') [type 0x%08x][flags 0x%08x] '%s'",
>  			 r->worktree, fs.f_type, fs.f_flags, fs.f_fstypename);
>
> +	if (!(fs.f_flags & MNT_LOCAL)) {
> +		switch (check_config_allowremote(r)) {
> +		case 0: /* config overrides and disables */
> +			return FSMONITOR_REASON_REMOTE;
> +		case 1: /* config overrides and enables */
> +			return FSMONITOR_REASON_OK;
> +		default:
> +			break; /* config has no opinion */
> +		}
> +
> +		return FSMONITOR_REASON_REMOTE;
> +	}

Since both `case 0` and the fall-through path return
`FSMONITOR_REASON_REMOTE`, this `switch()` statement is a verbose way to
say the same as:

		return check_config_allowremote(r) == 1 ?
			FSMONITOR_REASON_OK : FSMONITOR_REASON_REMOTE;

> +
> +	return FSMONITOR_REASON_OK;
> +}
> +
> +static enum fsmonitor_reason check_uds_volume(void)

What's a UDS volume? Do you mean to say "Unix Domain Socket Volume"?

If so, it would be better to turn this into a function called
`filesystem_supports_unix_sockets()` and to return an `int`, 1 for "yes",
0 for "no".
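
To illustrate the shape I have in mind, here is a rough, untested sketch.
The split into a pure helper (`fstype_supports_unix_sockets()`, a name I
made up for illustration) and a thin `statfs()` wrapper is merely my
assumption about how this could be organized; the `msdos`/`ntfs` and
`MNT_LOCAL` checks mirror what the existing code already looks at:

```c
#include <string.h>

/*
 * Hypothetical helper (not part of Git): decide from already-gathered
 * statfs() data whether a Unix domain socket can live on a file system.
 * Returns 1 for "yes", 0 for "no".
 */
int fstype_supports_unix_sockets(int is_local, const char *fstypename)
{
	if (!is_local)
		return 0; /* remote file systems may refuse bind() */
	if (!strcmp(fstypename, "msdos") || !strcmp(fstypename, "ntfs"))
		return 0; /* these formats lack Unix socket file types */
	return 1;
}

#ifdef __APPLE__
#include <sys/param.h>
#include <sys/mount.h>

/* macOS-only wrapper: `f_fstypename` and `MNT_LOCAL` come from statfs() */
int filesystem_supports_unix_sockets(const char *path)
{
	struct statfs fs;

	if (statfs(path, &fs) < 0)
		return 0; /* cannot tell, so err on the side of caution */
	return fstype_supports_unix_sockets(!!(fs.f_flags & MNT_LOCAL),
					    fs.f_fstypename);
}
#endif
```

The caller in `fsm_os__incompatible()` could then map a 0 to whichever
`FSMONITOR_REASON_*` value is appropriate, which keeps the "why" out of
the helper.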

> +{
> +	struct statfs fs;
> +	const char *path = fsmonitor_ipc__get_path();
> +
> +	if (statfs(path, &fs) == -1) {
> +		int saved_errno = errno;
> +		trace_printf_key(&trace_fsmonitor, "statfs('%s') failed: %s",
> +				 path, strerror(saved_errno));
> +		errno = saved_errno;
> +		return FSMONITOR_REASON_ERROR;
> +	}
> +
> +	trace_printf_key(&trace_fsmonitor,
> +			 "statfs('%s') [type 0x%08x][flags 0x%08x] '%s'",
> +			 path, fs.f_type, fs.f_flags, fs.f_fstypename);
> +
>  	if (!(fs.f_flags & MNT_LOCAL))
>  		return FSMONITOR_REASON_REMOTE;
>
> @@ -85,5 +134,9 @@ enum fsmonitor_reason fsm_os__incompatible(struct repository *r)

It is unfortunate that the diff hunk stops here, and mails lack a button
to increase the diff context. In this instance, the hidden part of the
`check_volume()` function is quite interesting: it returns
`FSMONITOR_REASON_NOSOCKETS` for `msdos` and `ntfs` file systems.

Which means that your patch changes behavior not only for remote file
systems, but also for local ones without support for Unix sockets.

To heed the principle of separation of concerns, please split out that
part. I would recommend making it the first patch, which adds support for
`msdos`/`ntfs` file systems (by registering the Unix sockets in a
temporary directory instead of the `.git/` directory). The second patch
can then introduce support for `fsmonitor.allowRemote` on macOS on top of
the first patch.

>  	if (reason != FSMONITOR_REASON_OK)
>  		return reason;
>
> +	reason = check_uds_volume();
> +	if (reason != FSMONITOR_REASON_OK)
> +		return reason;
> +
>  	return FSMONITOR_REASON_OK;
>  }
> diff --git a/fsmonitor-ipc.c b/fsmonitor-ipc.c
> index 789e7397baa..6e9b40a03d5 100644
> --- a/fsmonitor-ipc.c
> +++ b/fsmonitor-ipc.c
> @@ -4,6 +4,7 @@
>  #include "fsmonitor-ipc.h"
>  #include "run-command.h"
>  #include "strbuf.h"
> +#include "tempfile.h"
>  #include "trace2.h"
>
>  #ifndef HAVE_FSMONITOR_DAEMON_BACKEND
> @@ -47,7 +48,51 @@ int fsmonitor_ipc__is_supported(void)
>  	return 1;
>  }
>
> -GIT_PATH_FUNC(fsmonitor_ipc__get_path, "fsmonitor--daemon.ipc")
> +GIT_PATH_FUNC(fsmonitor_ipc__get_pathfile, "fsmonitor--daemon.ipc")

Why rename this? That's unnecessary chatter in the patch. Let's avoid such
things in the future; it only costs reviewers time.

> +
> +static char *gen_ipc_file(void)
> +{
> +	char *retval = NULL;
> +	struct tempfile *ipc;
> +
> +	const char *ipc_file = fsmonitor_ipc__get_pathfile();
> +	FILE *fp = fopen(ipc_file, "w");
> +
> +	if (!fp)
> +		die_errno("error opening '%s'", ipc_file);
> +	ipc = mks_tempfile_t("fsmonitor_ipc_XXXXXX");
> +	strbuf_write(&ipc->filename, fp);
> +	fclose(fp);
> +	retval = strbuf_detach(&ipc->filename, NULL);
> +	strbuf_release(&ipc->filename);
> +	return retval;
> +}
> +
> +const char *fsmonitor_ipc__get_path(void)
> +{
> +	char *retval = NULL;
> +	struct strbuf sb = STRBUF_INIT;
> +
> +	const char *ipc_file = fsmonitor_ipc__get_pathfile();
> +	FILE *fp = fopen(ipc_file, "r");
> +
> +	if (!fp) {
> +		return gen_ipc_file();
> +	} else {
> +		strbuf_read(&sb, fileno(fp), 0);
> +		fclose(fp);
> +		fp = fopen(sb.buf, "r");
> +		if (!fp) { /* generate new file */
> +			if (unlink(ipc_file) < 0)
> +				die_errno("could not remove '%s'", ipc_file);
> +			return gen_ipc_file();
> +		}
> +		fclose(fp);
> +		retval = strbuf_detach(&sb, NULL);
> +		strbuf_release(&sb);
> +		return retval;
> +	}
> +}

I am afraid I do not understand how this code can guarantee a fixed path
for the Unix domain socket.

It _needs_ to be fixed so that a singleton daemon can run and listen on
it, and an arbitrary number of Git clients can connect to it.

If it is not fixed, you will cause Git to quite possibly start a new
FSMonitor daemon for every invocation that wants to connect to an
FSMonitor daemon.

This means that the path of the Unix socket needs to have a 1:1
relationship to the path of the `.git/` directory. If you install it in
that directory, that invariant is naturally fulfilled. If you want to
install it elsewhere, you will have to come up with a reliable way to
guarantee that connection.

One option would be to install the Unix sockets in the home directory,
under a name like `.git-fsmonitor-<hash>` where the <hash> is e.g. a
SHA-1/SHA-256 of the canonicalized path of the `.git/` directory.
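
A minimal sketch of that idea follows, to show how the 1:1 relationship
could be kept. Please take it with a grain of salt: the function name
`fsmonitor_socket_path()` is made up, and I use FNV-1a here only as a
self-contained stand-in for SHA-1/SHA-256 (real code would use Git's own
hash functions). The point is merely that the socket path is a pure
function of the canonicalized `.git/` path, so every client computes the
same name:

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* for realpath() */
#endif
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* FNV-1a, 64-bit: stand-in for SHA-1/SHA-256, for illustration only */
static uint64_t fnv1a64(const char *s)
{
	uint64_t h = 0xcbf29ce484222325ULL;

	for (; *s; s++) {
		h ^= (unsigned char)*s;
		h *= 0x100000001b3ULL;
	}
	return h;
}

/*
 * Hypothetical helper: write "<home>/.git-fsmonitor-<hash>" into `out`,
 * where <hash> is derived from the canonicalized `gitdir`.
 * Returns 0 on success, -1 if `gitdir` cannot be resolved or `out` is
 * too small.
 */
int fsmonitor_socket_path(const char *gitdir, const char *home,
			  char *out, size_t outlen)
{
	char *canonical = realpath(gitdir, NULL);
	int n;

	if (!canonical)
		return -1;
	n = snprintf(out, outlen, "%s/.git-fsmonitor-%016llx", home,
		     (unsigned long long)fnv1a64(canonical));
	free(canonical);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With this, two spellings of the same directory (say, with and without a
trailing `/.`) resolve to the same socket path, and the daemon and all
clients agree on it without having to store anything in the repository.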

>
>  enum ipc_active_state fsmonitor_ipc__get_state(void)
>  {
> diff --git a/fsmonitor-ipc.h b/fsmonitor-ipc.h
> index b6a7067c3af..63277dea39e 100644
> --- a/fsmonitor-ipc.h
> +++ b/fsmonitor-ipc.h
> @@ -18,6 +18,12 @@ int fsmonitor_ipc__is_supported(void);
>   */
>  const char *fsmonitor_ipc__get_path(void);
>
> +/*
> + * Returns the pathname to the file that contains the pathname to the
> + * IPC named pipe or Unix domain socket.
> + */
> +const char *fsmonitor_ipc__get_pathfile(void);
> +
>  /*
>   * Try to determine whether there is a `git-fsmonitor--daemon` process
>   * listening on the IPC pipe/socket.

Thank you for working on this, also on the Windows side. It definitely
helps!

Ciao,
Dscho



