Re: [PATCH v2 1/3] read-cache: optionally collect pathspec matching info

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ghanshyam Thakkar <shyamthakkar001@xxxxxxxxx> writes:

> The add_files_to_cache() adds files to the index. And
> add_files_to_cache() in turn calls run_diff_files() to perform this
> operation. The run_diff_files() uses ce_path_match() to match the
> pathspec against cache entries. However, it is called with NULL value
> for the seen parameter, which collects the pathspec matching
> information.

", which collects" -> ", which means we lose"

> Therefore, introduce a new parameter 'char *ps_matched' to 

"Therefore, introduce" -> "Introduce"

> add_files_to_cache() and in turn to run_diff_files(), to feed it to
> ce_path_match() to optionally collect the pathspec matching
> information. This will be helpful in reporting error in case of an
> untracked path being passed when the expectation is a known path. Thus,
> this will be used in the subsequent commits to fix 'commit -i' and 'add
> -u' not erroring out when given untracked paths.

A new parameter to run_diff_files() came as a bit of surprise to me.

When I responded to the previous round, I somehow thought that we'd
add a new member to the rev structure that points at an optional
.ps_matched member next to the existing .prune_data member.  

That way, it would hopefully be easy for a future code to see if a
"diff" invocation, not necessarily run_diff_files() that compares
the working tree and the index, consumed all the pathspec elements.
If such a new .ps_matched member is initialized to NULL, all the
patch noise we see in this patch will become unnecessary, no?

> diff --git a/diff-lib.c b/diff-lib.c
> index 5e8717c774..2dc3864abd 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -101,7 +101,8 @@ static int match_stat_with_submodule(struct diff_options *diffopt,
>  	return changed;
>  }
>  
> -void run_diff_files(struct rev_info *revs, unsigned int option)
> +void run_diff_files(struct rev_info *revs, char *ps_matched,
> +		    unsigned int option)
>  {
>  	int entries, i;
>  	int diff_unmerged_stage = revs->max_count;
> @@ -127,7 +128,7 @@ void run_diff_files(struct rev_info *revs, unsigned int option)
>  		if (diff_can_quit_early(&revs->diffopt))
>  			break;
>  
> -		if (!ce_path_match(istate, ce, &revs->prune_data, NULL))
> +		if (!ce_path_match(istate, ce, &revs->prune_data, ps_matched))
>  			continue;
>  
>  		if (revs->diffopt.prefix &&

This may be a non-issue, but after this point we see the beginning
of another filter to reject paths outside the hierarchy "--relative"
specifies.  It is possible that a pathspec element matches ce->name
but the matched cache entry is outside the current area.  Shouldn't
we then consider that the pathspec element did not match?  E.g., in
our project, what should happen if we did this?

    $ echo >>diff.h
    $ cd t
    $ git diff --relative \*.h

The command should show nothing.  Did the pathspec '*.h' match?  From
those who know how the machinery works, yes it did before the resulting
paths were further filtered out, but from the end-user's point of view,
because "--relative" limits the diff to the current directory and below,
and because 't' and below did not have any C header files, wouldn't it
be more natural and useful to say the pathspec wasn't used?

This does not matter right now because we are not planning to add a
new "--error-unmatch" option to "git diff", but when/if we do, it
starts to matter.  The hunk at least needs a NEEDSWORK comment,
summarizing the above.

	/*
	 * NEEDSWORK:
	 * Here we filter with pathspec but the result is further
	 * filtered out when --relative is in effect.  To end-users,
         * a pathspec element that matched only to paths outside the
         * current directory is like not matching anything at all;
         * the handling of ps_matched[] here may become problematic
	 * if/when we add the "--error-unmatch" option to "git diff".
	 */ 

A solution to that problem might be just a matter of swapping the
order of filtering, but it may have performance implications and I'd
rather not have to worry about it right now in the context of the
current topic, hence a NEEDSWORK comment without attempting to "fix"
it would be the most preferred approach to such a side issue.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux