Re: [PATCH v2] Performance optimization for detection of modified submodules

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jens Lehmann <Jens.Lehmann@xxxxxx> writes:

> - Changed the type of the new dirty_submodule flags to unsigned.

Why?  An unsigned used in place of a bool raises an eyebrow as it is more
common to use "int" (the most natural type on the platform).  Going
against tradition makes readers waste time wondering if there were some
other reason why the code couldn't use "int" and had to use "unsigned"
(e.g. "Hmmpphh, it looked like a mere boolean but the use of 'unsigned'
suggests there might be something deeper going on.  Is this used as a
bitfield?  Does this count and cannot go negative?").

> @@ -173,12 +174,16 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  			if (silent_on_removed)
>  				continue;
>  			diff_addremove(&revs->diffopt, '-', ce->ce_mode,
> -				       ce->sha1, ce->name);
> +				       ce->sha1, ce->name, 0);

Removed from work tree; cannot be dirty---good.

> +		if (S_ISGITLINK(ce->ce_mode)
> +		    && (!changed || (revs->diffopt.output_format & DIFF_FORMAT_PATCH))
> +		    && is_submodule_modified(ce->name)) {
> +			changed = 1;
> +			dirty_submodule = 1;
> +		}

If the index records a submodule commit, and the commit checked out in the
submodule is different from that commit, ce_compare_gitlink() called from
ce_match_stat() would have already said "changed".  If we want "-dirty",
we need to call is_submodule_modified(), but otherwise we don't.  Looks
good.

Does DIFF_FORMAT_PATCH cover all cases where we may want a reliable value
in "dirty_submodule" here?  Should use of "--submodule=log" affect this
decision?

> @@ -188,7 +193,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  		newmode = ce_mode_from_stat(ce, st.st_mode);
>  		diff_change(&revs->diffopt, oldmode, newmode,
>  			    ce->sha1, (changed ? null_sha1 : ce->sha1),
> -			    ce->name);
> +			    ce->name, 0, dirty_submodule);

LHS is always index and RHS is work tree whose dirtiness is in
dirty_submodule; good.

> @@ -204,16 +209,18 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  static void diff_index_show_file(struct rev_info *revs,
>  				 const char *prefix,
>  				 struct cache_entry *ce,
> -				 const unsigned char *sha1, unsigned int mode)
> +				 const unsigned char *sha1, unsigned int mode,
> +				 unsigned dirty_submodule)
>  {
>  	diff_addremove(&revs->diffopt, prefix[0], mode,
> -		       sha1, ce->name);
> +		       sha1, ce->name, dirty_submodule);

Mental note to myself.  prefix[0] is either '-' (removed from the work
tree) or '+' (added to the work tree).  Added submodule could be
immediately dirty.

> @@ -251,15 +263,17 @@ static void show_new_file(struct rev_info *revs,
>  {
> ...
> -	diff_index_show_file(revs, "+", new, sha1, mode);
> +	diff_index_show_file(revs, "+", new, sha1, mode, dirty_submodule);

And this caller handles that case correctly; good.

> @@ -270,11 +284,13 @@ static int show_modified(struct rev_info *revs,
>  {
>  	unsigned int mode, oldmode;
>  	const unsigned char *sha1;
> +	unsigned dirty_submodule = 0;
>
> -	if (get_stat_data(new, &sha1, &mode, cached, match_missing) < 0) {
> +	if (get_stat_data(new, &sha1, &mode, cached, match_missing,
> +			  &dirty_submodule, revs->diffopt.output_format) < 0) {
>  		if (report_missing)
>  			diff_index_show_file(revs, "-", old,
> -					     old->sha1, old->ce_mode);
> +					     old->sha1, old->ce_mode, 0);

Again, removed from work tree cannot be dirty; good.

> @@ -309,7 +325,7 @@ static int show_modified(struct rev_info *revs,
>  		return 0;
>
>  	diff_change(&revs->diffopt, oldmode, mode,
> -		    old->sha1, sha1, old->name);
> +		    old->sha1, sha1, old->name, 0, dirty_submodule);
>  	return 0;
>  }

Comparing between a tree entry and either an index entry or a file in the
work tree.  When cached (i.e. not looking at work tree), get_stat_data()
doesn't touch dirty_submodule, so this won't make noises about submodule
dirtyness, which is good.

> @@ -356,7 +372,7 @@ static void do_oneway_diff(struct unpack_trees_options *o,
>  	 * Something removed from the tree?
>  	 */
>  	if (!idx) {
> -		diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode);
> +		diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode, 0);

Removed from the work tree and canont be dirty; good.

> diff --git a/diff.c b/diff.c
> index 012b3d3..f130a36 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -3736,7 +3736,7 @@ int diff_result_code(struct diff_options *opt, int status)
>  void diff_addremove(struct diff_options *options,
>  		    int addremove, unsigned mode,
>  		    const unsigned char *sha1,
> -		    const char *concatpath)
> +		    const char *concatpath, unsigned dirty_submodule)

A removed submodule cannot be dirty, so we can get away with only one
dirty_submodule that always refers to the RHS (i.e. "two"); I see.

> diff --git a/diffcore.h b/diffcore.h
> index 5b63458..66687c3 100644
> --- a/diffcore.h
> +++ b/diffcore.h
> @@ -42,6 +42,7 @@ struct diff_filespec {
>  #define DIFF_FILE_VALID(spec) (((spec)->mode) != 0)
>  	unsigned should_free : 1; /* data should be free()'ed */
>  	unsigned should_munmap : 1; /* data should be munmap()'ed */
> +	unsigned dirty_submodule : 1;  /* For submodules: its work tree is dirty */

By the way, we might want to consolidate these bitfields into a single 

	unsigned should_free:1,
        	 should_munmap:1,
                 dirty_submodule:1;

in a separate clean-up patch, after all the dust settles.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]