Re: [PATCH 17/23] fsmonitor--daemon: periodically truncate list of modified files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
> 
> Teach fsmonitor--daemon to periodically truncate the list of
> modified files to save some memory.
> 
> Clients will ask for the set of changes relative to a token that they
> found in the FSMN index extension in the index.  (This token is like a
> point in time, but different).  Clients will then update the index to
> contain the response token (so that subsequent commands will be
> relative to this new token).
> 
> Therefore, the daemon can gradually truncate the in-memory list of
> changed paths as they become obsolete (older that the previous token).

s/older that/older than/

> Since we may have multiple clients making concurrent requests with a
> skew of tokens and clients may be racing to the talk to the daemon,
> we lazily truncate the list.
> 
> We introduce a 5 minute delay and truncate batches 5 minutes after
> they are considered obsolete.

5 minutes seems like a good default timeframe. We can consider
making this customizable in the future.

> +/*
> + * To keep the batch list from growing unbounded in response to filesystem
> + * activity, we try to truncate old batches from the end of the list as
> + * they become irrelevant.
> + *
> + * We assume that the .git/index will be updated with the most recent token
> + * any time the index is updated.  And future commands will only ask for
> + * recent changes *since* that new token.  So as tokens advance into the
> + * future, older batch items will never be requested/needed.  So we can
> + * truncate them without loss of functionality.
> + *
> + * However, multiple commands may be talking to the daemon concurrently
> + * or perform a slow command, so a little "token skew" is possible.
> + * Therefore, we want this to be a little bit lazy and have a generous
> + * delay.

I appreciate this documentation of the "expected" behavior and how it
compares to the "possible" behavior.

> + * The current reader thread walked backwards in time from `token->batch_head`
> + * back to `batch_marker` somewhere in the middle of the batch list.
> + *
> + * Let's walk backwards in time from that marker an arbitrary delay
> + * and truncate the list there.  Note that these timestamps are completely
> + * artificial (based on when we pinned the batch item) and not on any
> + * filesystem activity.
> + */
> +#define MY_TIME_DELAY (5 * 60) /* seconds */

Perhaps put the units into the macro? MY_TIME_DELAY_SECONDS?
> +static void fsmonitor_batch__truncate(struct fsmonitor_daemon_state *state,
> +				      const struct fsmonitor_batch *batch_marker)
> +{
> +	/* assert state->main_lock */

If this comment is intended to be a warning for consumers that they should
have the lock around this method, then maybe that should be in a documentation
comment above the method declaration.

> +	const struct fsmonitor_batch *batch;
> +	struct fsmonitor_batch *rest;
> +	struct fsmonitor_batch *p;
> +	time_t t;

This is only used within the for loop, so it could be defined there.

> +
> +	if (!batch_marker)
> +		return;
> +
> +	trace_printf_key(&trace_fsmonitor, "TRNC mark (%"PRIu64",%"PRIu64")",

What's the value of abbreviating "truncate" like this? Is there a special
reason?

> +			 batch_marker->batch_seq_nr,
> +			 (uint64_t)batch_marker->pinned_time);
> +
> +	for (batch = batch_marker; batch; batch = batch->next) {
> +		if (!batch->pinned_time) /* an overflow batch */
> +			continue;
> +
> +		t = batch->pinned_time + MY_TIME_DELAY;
> +		if (t > batch_marker->pinned_time) /* too close to marker */
> +			continue;> +
> +		goto truncate_past_here;
> +	}
> +
> +	return;
> +
> +truncate_past_here:
> +	state->current_token_data->batch_tail = (struct fsmonitor_batch *)batch;
> +
> +	rest = ((struct fsmonitor_batch *)batch)->next;
> +	((struct fsmonitor_batch *)batch)->next = NULL;
> +
> +	for (p = rest; p; p = fsmonitor_batch__free(p)) {
> +		trace_printf_key(&trace_fsmonitor,
> +				 "TRNC kill (%"PRIu64",%"PRIu64")",
> +				 p->batch_seq_nr, (uint64_t)p->pinned_time);
> +	}

I see that you are not using the method that frees the entire list so
you can trace each entry as it is deleted. That works.

> +}
> +
>  static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
>  {
>  	struct fsmonitor_batch *p;
> @@ -647,6 +716,15 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
>  			 * that work.
>  			 */
>  			fsmonitor_free_token_data(token_data);
> +		} else if (batch) {
> +			/*
> +			 * This batch is the first item in the list
> +			 * that is older than the requested sequence
> +			 * number and might be considered to be
> +			 * obsolete.  See if we can truncate the list
> +			 * and save some memory.
> +			 */
> +			fsmonitor_batch__truncate(state, batch);

Seems to work as advertised.

Thanks,
-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux