Re: [PATCH v6 20/28] fsmonitor: optimize processing of directory events

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Jeff,

On Fri, 22 Apr 2022, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
>
> Teach Git to perform binary search over the cache-entries for a directory
> notification and then linearly scan forward to find the immediate children.
>
> Previously, when the FSMonitor reported a modified directory Git would
> perform a linear search on the entire cache-entry array for all
> entries matching that directory prefix and invalidate them.  Since the
> cache-entry array is already sorted, we can use a binary search to
> find the first matching entry and then only linearly walk forward and
> invalidate entries until the prefix changes.
>
> Also, the original code would invalidate anything having the same
> directory prefix.  Since a directory event should only be received for
> items that are immediately within the directory (and not within
> sub-directories of it), only invalidate those entries and not the
> whole subtree.
>
> Signed-off-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
> ---
>  fsmonitor.c | 71 ++++++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 54 insertions(+), 17 deletions(-)
>
> diff --git a/fsmonitor.c b/fsmonitor.c
> index 292a6742b4f..e1229c289cf 100644
> --- a/fsmonitor.c
> +++ b/fsmonitor.c
> @@ -184,30 +184,68 @@ static int query_fsmonitor_hook(struct repository *r,
>  static void fsmonitor_refresh_callback(struct index_state *istate, char *name)
>  {
>  	int i, len = strlen(name);
> -	if (name[len - 1] == '/') {
> +	int pos = index_name_pos(istate, name, len);
> +
> +	trace_printf_key(&trace_fsmonitor,
> +			 "fsmonitor_refresh_callback '%s' (pos %d)",
> +			 name, pos);
>
> +	if (name[len - 1] == '/') {
>  		/*
> -		 * TODO We should binary search to find the first path with
> -		 * TODO this directory prefix.  Then linearly update entries
> -		 * TODO while the prefix matches.  Taking care to search without
> -		 * TODO the trailing slash -- because '/' sorts after a few
> -		 * TODO interesting special chars, like '.' and ' '.
> +		 * The daemon can decorate directory events, such as
> +		 * moves or renames, with a trailing slash if the OS
> +		 * FS Event contains sufficient information, such as
> +		 * MacOS.
> +		 *
> +		 * Use this to invalidate the entire cone under that
> +		 * directory.
> +		 *
> +		 * We do not expect an exact match because the index
> +		 * does not normally contain directory entries, so we
> +		 * start at the insertion point and scan.
>  		 */
> +		if (pos < 0)
> +			pos = -pos - 1;
>
>  		/* Mark all entries for the folder invalid */
> -		for (i = 0; i < istate->cache_nr; i++) {
> -			if (istate->cache[i]->ce_flags & CE_FSMONITOR_VALID &&
> -			    starts_with(istate->cache[i]->name, name))
> -				istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
> +		for (i = pos; i < istate->cache_nr; i++) {
> +			if (!starts_with(istate->cache[i]->name, name))
> +				break;
> +			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
>  		}
> -		/* Need to remove the / from the path for the untracked cache */
> +
> +		/*
> +		 * We need to remove the traling "/" from the path
> +		 * for the untracked cache.
> +		 */
>  		name[len - 1] = '\0';
> +	} else if (pos >= 0) {
> +		/*
> +		 * We have an exact match for this path and can just
> +		 * invalidate it.
> +		 */
> +		istate->cache[pos]->ce_flags &= ~CE_FSMONITOR_VALID;
>  	} else {
> -		int pos = index_name_pos(istate, name, strlen(name));
> -
> -		if (pos >= 0) {
> -			struct cache_entry *ce = istate->cache[pos];
> -			ce->ce_flags &= ~CE_FSMONITOR_VALID;
> +		/*
> +		 * The path is not a tracked file -or- it is a
> +		 * directory event on a platform that cannot
> +		 * distinguish between file and directory events in
> +		 * the event handler, such as Windows.
> +		 *
> +		 * Scan as if it is a directory and invalidate the
> +		 * cone under it.  (But remember to ignore items
> +		 * between "name" and "name/", such as "name-" and
> +		 * "name.".
> +		 */
> +		pos = -pos - 1;
> +
> +		for (i = pos; i < istate->cache_nr; i++) {
> +			if (!starts_with(istate->cache[i]->name, name))
> +				break;
> +			if ((unsigned char)istate->cache[i]->name[len] > '/')

Nice attention to detail casting `istate->cache[i]->name[len]` to
`(unsigned char)` before comparing to '/'!

> +				break;
> +			if (istate->cache[i]->name[len] == '/')
> +				istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
>  		}
>  	}
>
> @@ -215,7 +253,6 @@ static void fsmonitor_refresh_callback(struct index_state *istate, char *name)
>  	 * Mark the untracked cache dirty even if it wasn't found in the index
>  	 * as it could be a new untracked file.
>  	 */
> -	trace_printf_key(&trace_fsmonitor, "fsmonitor_refresh_callback '%s'", name);

Did you mean to remove this statement in this patch? Not a big issue, but
I wonder what the rationale for it is, and since I have an inquisitive
mind, I figured I'd just ask.

Thanks,
Dscho

>  	untracked_cache_invalidate_path(istate, name, 0);
>  }
>
> --
> gitgitgadget
>
>




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux