Re: [PATCH v3 3/5] refs: teach for_each_ref a flag to avoid recursion

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 01/08/2014 04:47 AM, Jeff King wrote:
> On Tue, Jan 07, 2014 at 06:58:50PM -0500, Jeff King wrote:
> 
>> +			if (flags & DO_FOR_EACH_NO_RECURSE) {
>> +				struct ref_dir *subdir = get_ref_dir(entry);
>> +				sort_ref_dir(subdir);
>> +				retval = do_for_each_entry_in_dir(subdir, 0,
> 
> Obviously this is totally wrong and inverts the point of the flag. And
> causes something like half of the test suite to fail.
> 
> Michael was nice enough to point it out to me off-list, but well, I have
> to face the brown paper bag at some point. :) In my defense, it was a
> last minute refactor before going to dinner. That is what I get for
> rushing out the series.
> 
> Here's a fixed version of patch 3/5.

v2 4/5 doesn't apply cleanly on top of v3 3/5.  So I'm basing my review
on the branch you have at GitHub peff/git "jk/cat-file-warn-ambiguous";
I hope it is the same.

> -- >8 --
> Subject: refs: teach for_each_ref a flag to avoid recursion
> 
> The normal for_each_ref traversal descends into

You haven't changed any for_each_ref*() functions; you have only exposed
the DO_FOR_EACH_NO_RECURSE option to the (static) functions
for_each_entry*() and do_for_each_ref().  (This is part and parcel of
your decision not to expose the new functionality in the refs API.)
Please correct the line above.

> subdirectories, returning each ref it finds. However, in
> some cases we may want to just iterate over the top-level of
> a certain part of the tree.
> 
> The introduction of the "flags" option is a little
> mysterious. We already have a "flags" option that gets stuck
> in a callback struct and ends up interpreted in do_one_ref.
> But the traversal itself does not currently have any flags,
> and it needs to know about this new flag.
> 
> We _could_ introduce this as a completely separate flag
> parameter. But instead, we simply put both flag types into a
> single namespace, and make it available at both sites. This
> is simple, and given that we do not have a proliferation of
> flags (we have had exactly one until now), it is probably
> sufficient.
> 
> Signed-off-by: Jeff King <peff@xxxxxxxx>
> ---
>  refs.c | 61 ++++++++++++++++++++++++++++++++++++++-----------------------
>  1 file changed, 38 insertions(+), 23 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index 3926136..b70b018 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -589,6 +589,8 @@ static void sort_ref_dir(struct ref_dir *dir)
>  
>  /* Include broken references in a do_for_each_ref*() iteration: */
>  #define DO_FOR_EACH_INCLUDE_BROKEN 0x01
> +/* Do not recurse into subdirs, just iterate at a single level. */
> +#define DO_FOR_EACH_NO_RECURSE     0x02
>  
>  /*
>   * Return true iff the reference described by entry can be resolved to
> @@ -661,7 +663,8 @@ static int do_one_ref(struct ref_entry *entry, void *cb_data)
>   * called for all references, including broken ones.
>   */
>  static int do_for_each_entry_in_dir(struct ref_dir *dir, int offset,
> -				    each_ref_entry_fn fn, void *cb_data)
> +				    each_ref_entry_fn fn, void *cb_data,
> +				    int flags)
>  {
>  	int i;
>  	assert(dir->sorted == dir->nr);

Please update the docstring for this function, which still says that it
recurses without mentioning DO_FOR_EACH_NO_RECURSE.

> [...]
> @@ -817,7 +830,7 @@ static int is_refname_available(const char *refname, const char *oldrefname,
>  	data.conflicting_refname = NULL;
>  
>  	sort_ref_dir(dir);
> -	if (do_for_each_entry_in_dir(dir, 0, name_conflict_fn, &data)) {
> +	if (do_for_each_entry_in_dir(dir, 0, name_conflict_fn, &data, 0)) {
>  		error("'%s' exists; cannot create '%s'",
>  		      data.conflicting_refname, refname);
>  		return 0;
> @@ -1651,7 +1664,8 @@ void warn_dangling_symref(FILE *fp, const char *msg_fmt, const char *refname)
>   * 0.
>   */
>  static int do_for_each_entry(struct ref_cache *refs, const char *base,
> -			     each_ref_entry_fn fn, void *cb_data)
> +			     each_ref_entry_fn fn, void *cb_data,
> +			     int flags)
>  {
>  	struct packed_ref_cache *packed_ref_cache;
>  	struct ref_dir *loose_dir;

A few lines after this, do_for_each_entry() calls
prime_ref_dir(loose_dir) to ensure that all of the loose references that
will be iterated over are read before the packed-refs file is checked.
It seems to me that prime_ref_dir() should also get a flags parameter to
prevent it reading more loose references than necessary, something like
this:

====================================================================
diff --git a/refs.c b/refs.c
index b70b018..b8b7354 100644
--- a/refs.c
+++ b/refs.c
@@ -772,13 +772,13 @@ static int do_for_each_entry_in_dirs(struct
ref_dir *dir1,
  * through all of the sub-directories. We do not even need to care about
  * sorting, as traversal order does not matter to us.
  */
-static void prime_ref_dir(struct ref_dir *dir)
+static void prime_ref_dir(struct ref_dir *dir, int flags)
 {
 	int i;
 	for (i = 0; i < dir->nr; i++) {
 		struct ref_entry *entry = dir->entries[i];
-		if (entry->flag & REF_DIR)
-			prime_ref_dir(get_ref_dir(entry));
+		if (entry->flag & REF_DIR && !(flags & DO_FOR_EACH_NO_RECURSE))
+			prime_ref_dir(get_ref_dir(entry), flags);
 	}
 }
 /*
@@ -1685,7 +1685,7 @@ static int do_for_each_entry(struct ref_cache
*refs, const char *base,
 		loose_dir = find_containing_dir(loose_dir, base, 0);
 	}
 	if (loose_dir)
-		prime_ref_dir(loose_dir);
+		prime_ref_dir(loose_dir, flags);

 	packed_ref_cache = get_packed_ref_cache(refs);
 	acquire_packed_ref_cache(packed_ref_cache);

====================================================================

> [...]
> @@ -1718,7 +1732,7 @@ static int do_for_each_ref(struct ref_cache *refs, const char *base,
>  	data.fn = fn;
>  	data.cb_data = cb_data;
>  
> -	return do_for_each_entry(refs, base, do_one_ref, &data);
> +	return do_for_each_entry(refs, base, do_one_ref, &data, flags);
>  }
>  
>  static int do_head_ref(const char *submodule, each_ref_fn fn, void *cb_data)

This change makes the DO_FOR_EACH_NO_RECURSE option usable with
do_for_each_ref() (even though it is never in fact used).  It should
either be mentioned in the docstring or (if there is a reason not to
allow it) explicitly prohibited.

> [...]

The rest looks fine to me.

It would be possible to use your new flag to speed up
is_refname_available(), but it would be a little bit of work and I doubt
that is_refname_available() is ever a bottleneck.

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]