Re: [PATCH] read_directory: avoid invoking exclude machinery on tracked files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 15, 2013 at 11:52 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> In the current code, we always check if a path is excluded, and when
> dealing with DT_REG/DT_LNK, we call treat_file():
>
>  * When such a path is excluded, treat_file() returns true when we
>    are not showing ignored directories. This causes treat_one_path()
>    to return path_ignored, so for excluded DT_REG/DT_LNK paths when
>    no DIR_*_IGNORED is in effect, this change is a correct
>    optimization.
>
>  * When such a path is not excluded, on the ther hand, and when we
>    are not showing ignored directories, treat_file() just returns
>    the value of exclude_file, which is initialized to false and is
>    not changed in the function.  This causes treat_one_path() to
>    return path_handled.  However, the new code returns path_ignored
>    in this case.
>
> What guarantees that this change is regression free?

If you consider read_directory_recursive alone, there is a regression.
The return value of r_d_r depends on path_handled/path_ignored. With
this patch, the return value will be different. The return value is
only used by treat_directory() in two cases:

 - when DIR_SHOW_IGNORED is set, which disables the optimization so no
regression

 - when DIR_HIDE_EMPTY_DIRECTORIES is _not_ set (and neither is
DIR_SHOW_IGNORED), the optimization is still on and different r_d_r's
return value would lead to different behavior. However I don't think
it can happen.

treat_directory checks if the given directory can be found in index.
In that case the neither r_d_r calls in treat_directory is reachable.
If the given directory cannot be found in the index, the second r_d_r
is reachable. But then the cache_name_exists() in the patch should
always be false (parent not in index, children cannot), so the
optimization is off and r_d_r returns correctly.

It's a bit tricky. I'm not sure if I miss anything else.

> I do not seem
> to be able to find anything that checks if the path is already known
> to the index in the original code for the case you special cased
> (i.e. DIR_*_IGNORED is not set and dtype is not DT_DIR).  Do all the
> callers that reach this function in their callgraph, when they get
> path_ignored for a path in the index, behave as if the difference
> between path_ignored and path_handled does not matter?
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]