On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote: > From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> > > Teach fsmonitor--daemon to periodically truncate the list of > modified files to save some memory. > > Clients will ask for the set of changes relative to a token that they > found in the FSMN index extension in the index. (This token is like a > point in time, but different). Clients will then update the index to > contain the response token (so that subsequent commands will be > relative to this new token). > > Therefore, the daemon can gradually truncate the in-memory list of > changed paths as they become obsolete (older that the previous token). s/older that/older than/ > Since we may have multiple clients making concurrent requests with a > skew of tokens and clients may be racing to the talk to the daemon, > we lazily truncate the list. > > We introduce a 5 minute delay and truncate batches 5 minutes after > they are considered obsolete. 5 minutes seems like a good default timeframe. We can consider making this customizable in the future. > +/* > + * To keep the batch list from growing unbounded in response to filesystem > + * activity, we try to truncate old batches from the end of the list as > + * they become irrelevant. > + * > + * We assume that the .git/index will be updated with the most recent token > + * any time the index is updated. And future commands will only ask for > + * recent changes *since* that new token. So as tokens advance into the > + * future, older batch items will never be requested/needed. So we can > + * truncate them without loss of functionality. > + * > + * However, multiple commands may be talking to the daemon concurrently > + * or perform a slow command, so a little "token skew" is possible. > + * Therefore, we want this to be a little bit lazy and have a generous > + * delay. I appreciate this documentation of the "expected" behavior and how it compares to the "possible" behavior. > + * The current reader thread walked backwards in time from `token->batch_head` > + * back to `batch_marker` somewhere in the middle of the batch list. > + * > + * Let's walk backwards in time from that marker an arbitrary delay > + * and truncate the list there. Note that these timestamps are completely > + * artificial (based on when we pinned the batch item) and not on any > + * filesystem activity. > + */ > +#define MY_TIME_DELAY (5 * 60) /* seconds */ Perhaps put the units into the macro? MY_TIME_DELAY_SECONDS? > +static void fsmonitor_batch__truncate(struct fsmonitor_daemon_state *state, > + const struct fsmonitor_batch *batch_marker) > +{ > + /* assert state->main_lock */ If this comment is intended to be a warning for consumers that they should have the lock around this method, then maybe that should be in a documentation comment above the method declaration. > + const struct fsmonitor_batch *batch; > + struct fsmonitor_batch *rest; > + struct fsmonitor_batch *p; > + time_t t; This is only used within the for loop, so it could be defined there. > + > + if (!batch_marker) > + return; > + > + trace_printf_key(&trace_fsmonitor, "TRNC mark (%"PRIu64",%"PRIu64")", What's the value of abbreviating "truncate" like this? Is there a special reason? > + batch_marker->batch_seq_nr, > + (uint64_t)batch_marker->pinned_time); > + > + for (batch = batch_marker; batch; batch = batch->next) { > + if (!batch->pinned_time) /* an overflow batch */ > + continue; > + > + t = batch->pinned_time + MY_TIME_DELAY; > + if (t > batch_marker->pinned_time) /* too close to marker */ > + continue;> + > + goto truncate_past_here; > + } > + > + return; > + > +truncate_past_here: > + state->current_token_data->batch_tail = (struct fsmonitor_batch *)batch; > + > + rest = ((struct fsmonitor_batch *)batch)->next; > + ((struct fsmonitor_batch *)batch)->next = NULL; > + > + for (p = rest; p; p = fsmonitor_batch__free(p)) { > + trace_printf_key(&trace_fsmonitor, > + "TRNC kill (%"PRIu64",%"PRIu64")", > + p->batch_seq_nr, (uint64_t)p->pinned_time); > + } I see that you are not using the method that frees the entire list so you can trace each entry as it is deleted. That works. > +} > + > static void fsmonitor_free_token_data(struct fsmonitor_token_data *token) > { > struct fsmonitor_batch *p; > @@ -647,6 +716,15 @@ static int do_handle_client(struct fsmonitor_daemon_state *state, > * that work. > */ > fsmonitor_free_token_data(token_data); > + } else if (batch) { > + /* > + * This batch is the first item in the list > + * that is older than the requested sequence > + * number and might be considered to be > + * obsolete. See if we can truncate the list > + * and save some memory. > + */ > + fsmonitor_batch__truncate(state, batch); Seems to work as advertised. Thanks, -Stolee