Re: [PATCH 11/23] fsmonitor--daemon: define token-ids

Derrick Stolee <stolee@xxxxxxxxx> · Mon, 26 Apr 2021 15:49:17 -0400

On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
> 
> Teach fsmonitor--daemon to create token-ids and define the
> overall token naming scheme.
...
> +/*
> + * Requests to and from a FSMonitor Protocol V2 provider use an opaque
> + * "token" as a virtual timestamp.  Clients can request a summary of all
> + * created/deleted/modified files relative to a token.  In the response,
> + * clients receive a new token for the next (relative) request.
> + *
> + *
> + * Token Format
> + * ============
> + *
> + * The contents of the token are private and provider-specific.
> + *
> + * For the built-in fsmonitor--daemon, we define a token as follows:
> + *
> + *     "builtin" ":" <token_id> ":" <sequence_nr>
> + *
> + * The <token_id> is an arbitrary OPAQUE string, such as a GUID,
> + * UUID, or {timestamp,pid}.  It is used to group all filesystem
> + * events that happened while the daemon was monitoring (and in-sync
> + * with the filesystem).
> + *
> + *     Unlike FSMonitor Protocol V1, it is not defined as a timestamp
> + *     and does not define less-than/greater-than relationships.
> + *     (There are too many race conditions to rely on file system
> + *     event timestamps.)
> + *
> + * The <sequence_nr> is a simple integer incremented for each event
> + * received.  When a new <token_id> is created, the <sequence_nr> is
> + * reset to zero.
> + *
> + *
> + * About Token Ids
> + * ===============
> + *
> + * A new token_id is created:
> + *
> + * [1] each time the daemon is started.
> + *
> + * [2] any time that the daemon must re-sync with the filesystem
> + *     (such as when the kernel drops or we miss events on a very
> + *     active volume).
> + *
> + * [3] in response to a client "flush" command (for dropped event
> + *     testing).
> + *
> + * [4] MAYBE We might want to change the token_id after very complex
> + *     filesystem operations are performed, such as a directory move
> + *     sequence that affects many files within.  It might be simpler
> + *     to just give up and fake a re-sync (and let the client do a
> + *     full scan) than try to enumerate the effects of such a change.
> + *
> + * When a new token_id is created, the daemon is free to discard all
> + * cached filesystem events associated with any previous token_ids.
> + * Events associated with a non-current token_id will never be sent
> + * to a client.  A token_id change implicitly means that the daemon
> + * has gap in its event history.
> + *
> + * Therefore, clients that present a token with a stale (non-current)
> + * token_id will always be given a trivial response.

>From this comment, it seems to be the case that concurrent Git
commands will race to advance the FS Monitor token and one of them
will lose, causing a full working directory scan. There is no list
of "recent" tokens.

I could see this changing in the future, but for now it is a
reasonable simplification.

> + */
> +struct fsmonitor_token_data {
> +	struct strbuf token_id;
> +	struct fsmonitor_batch *batch_head;
> +	struct fsmonitor_batch *batch_tail;
> +	uint64_t client_ref_count;
> +};
> +
> +static struct fsmonitor_token_data *fsmonitor_new_token_data(void)
> +{
> +	static int test_env_value = -1;
> +	static uint64_t flush_count = 0;
> +	struct fsmonitor_token_data *token;
> +
> +	token = (struct fsmonitor_token_data *)xcalloc(1, sizeof(*token));

I think the best practice here is "CALLOC_ARRAY(token, 1);"

> +
> +	strbuf_init(&token->token_id, 0);

This is likely overkill since you used calloc() above.

> +	token->batch_head = NULL;
> +	token->batch_tail = NULL;
> +	token->client_ref_count = 0;
> +
> +	if (test_env_value < 0)
> +		test_env_value = git_env_bool("GIT_TEST_FSMONITOR_TOKEN", 0);
> +
> +	if (!test_env_value) {
> +		struct timeval tv;
> +		struct tm tm;
> +		time_t secs;
> +
> +		gettimeofday(&tv, NULL);
> +		secs = tv.tv_sec;
> +		gmtime_r(&secs, &tm);
> +
> +		strbuf_addf(&token->token_id,
> +			    "%"PRIu64".%d.%4d%02d%02dT%02d%02d%02d.%06ldZ",
> +			    flush_count++,
> +			    getpid(),
> +			    tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
> +			    tm.tm_hour, tm.tm_min, tm.tm_sec,
> +			    (long)tv.tv_usec);

Between the PID, the flush count, and how deep you go in the
timestamp, this seems to be specific enough.

> +	} else {
> +		strbuf_addf(&token->token_id, "test_%08x", test_env_value++);

And this will be nice for testing.

> +	}
> +
> +	return token;
> +}
> +
>  static ipc_server_application_cb handle_client;
>  
>  static int handle_client(void *data, const char *command,
> @@ -330,7 +436,7 @@ static int fsmonitor_run_daemon(void)
>  
>  	pthread_mutex_init(&state.main_lock, NULL);
>  	state.error_code = 0;
> -	state.current_token_data = NULL;
> +	state.current_token_data = fsmonitor_new_token_data();
>  	state.test_client_delay_ms = 0;
>  
>  	/* Prepare to (recursively) watch the <worktree-root> directory. */
> 

Thanks,
-Stolee