Re: [PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Feb 05, 2014 at 02:58:36PM -0800, Junio C Hamano wrote:
> Kirill Smelkov <kirr@xxxxxxxxxxxxxx> writes:
> 
> > On Wed, Feb 05, 2014 at 11:42:41AM -0800, Junio C Hamano wrote:
> >> Kirill Smelkov <kirr@xxxxxxxxxxxxxx> writes:
> >> 
> >> > I agree object data should be immutable for good. The only thing I'm talking
> >> > about here is mode, which is parsed from a tree buffer and is stored in
> >> > separate field:
> >> 
> >> Ah, I do not see any problem in that case, then.
> >> 
> >> Thanks.
> >
> > Thanks, that simplifies things for me.

Below is a patch which does it. Please apply, if it is ok.


> Surely.
> 
> Be careful when traversing N-trees in parallel---you may have to
> watch out for the entry ordering rules that sorts the following
> paths in the order shown:
> 
> 	a
> 	a-b
>         a/b
>         a_b
> 
> Inside a single tree, you cannot have 'a' and 'a/b' at the same
> time, but one tree may have 'a' (without 'a/b') while another one
> may have 'a/b' (without 'a'), and walking them in parallel has
> historically been one source of funny bugs.

Thanks for the warning. I'm relying here on base_name_compare() and
ordering it imposes on entries and how it differentiates directories and
files, so let's hope this should be ok.

Actual reworking is still in flux though...

Thanks,
Kirill


---- 8< ----
From: Kirill Smelkov <kirr@xxxxxxxxxx>
Date: Thu, 6 Feb 2014 15:36:31 +0400
Subject: [PATCH] Finally switch over tree descriptors to contain a pre-parsed entry

This continues 4651ece8 (Switch over tree descriptors to contain a
pre-parsed entry) and moves the only rest computational part

    mode = canon_mode(mode)

from tree_entry_extract() to tree entry decode phase - to
decode_tree_entry().

The reason to do it, is that canon_mode() is at least 2 conditional
jumps for regular files, and that could be noticeable should canon_mode()
be invoked several times.

That does not matter for current Git codebase, where typical tree
traversal is

    while (t->size) {
        sha1 = tree_entry_extract(t, &path, &mode);
        ...
        update_tree_entry(t);
    }

i.e. we do t -> sha1,path.mode "extraction" only once per entry. In such
cases, it does not matter performance-wise, where that mode
canonicalization is done - either once in tree_entry_extract(), or once
in decode_tree_entry() called by update_tree_entry() - it is
approximately the same.

But for future code, which could need to work with several tree_desc's
in parallel, it could be handy to operate on tree_desc descriptors, and
do "extracts" only when needed, or at all, access only relevant part of
it through structure fields directly.

And for such situations, having canon_mode() be done once in decode
phase is better - we won't need to pay the performance price of 2 extra
conditional jumps on every t->mode access.

So let's move mode canonicalization to decode_tree_entry(). That was the
final bit. Now after tree entry is decoded, it is fully ready and could
be accessed either directly via field, or through tree_entry_extract()
which this time got really "totally trivial".

Signed-off-by: Kirill Smelkov <kirr@xxxxxxxxxx>
---
 tree-walk.c | 2 +-
 tree-walk.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tree-walk.c b/tree-walk.c
index 79dba1d..4dc86c7 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -37,7 +37,7 @@ static void decode_tree_entry(struct tree_desc *desc, const char *buf, unsigned
 
 	/* Initialize the descriptor entry */
 	desc->entry.path = path;
-	desc->entry.mode = mode;
+	desc->entry.mode = canon_mode(mode);
 	desc->entry.sha1 = (const unsigned char *)(path + len);
 }
 
diff --git a/tree-walk.h b/tree-walk.h
index ae04b64..ae7fb3a 100644
--- a/tree-walk.h
+++ b/tree-walk.h
@@ -16,7 +16,7 @@ struct tree_desc {
 static inline const unsigned char *tree_entry_extract(struct tree_desc *desc, const char **pathp, unsigned int *modep)
 {
 	*pathp = desc->entry.path;
-	*modep = canon_mode(desc->entry.mode);
+	*modep = desc->entry.mode;
 	return desc->entry.sha1;
 }
 
-- 
1.9.rc1.181.g641f458

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]