Re: [PATCH 1/3] diff_tree_sha1(): avoid rereading trees if possible

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:

> If the tree has already been read, no need to read it into memory
> again.

Hmm...

> This also helps when this function is called on temporary trees;
> these no longer have to be written to disk.

To generate tree-diff, probably yes, but I am not sure that
allows you to use hash_sha1_file() everywhere in merge_recursive
instead of write_sha1_file(), if that is what you are getting
at.

> +static int get_tree_desc_from_sha1(const unsigned char *sha1,
> +		struct tree_desc *t)
> +{
> +	struct object *o;
> +
> +	o = lookup_object(sha1);
> +	if (o && o->type == OBJ_TREE && o->parsed) {
> +		struct tree *tree = (struct tree *)o;
> +		t->size = tree->size;
> +		t->buf = xmalloc(t->size);
> +		memcpy(t->buf, tree->buffer, t->size);
> +	} else {
> +		t->buf = read_object_with_reference(sha1,
> +				tree_type, &t->size, NULL);
> +		if (!t->buf)
> +			die("unable to read source tree (%s)",
> +					sha1_to_hex(sha1));
> +	}
> +}

Are you absolutely sure that all users of "struct tree" retains
the tree->buffer for a parsed tree?  If nobody does
"free(tree->buffer)" without "tree->buffer = NULL", then the
situation is still salvageable (you need a bit more code above),
though.

I think this is overkill that only helps a very narrow "empty
tree" special case that [PATCH 2/3] addresses, and can be easily
and incorrectly abused.  We do not want people to expect that
reading many trees from different revisions as "struct tree"
objects and keeping all of them in memory would magically speed
up diff-tree, for example.

I'd prefer write_sha1_file() approach in Shawn's patch for its
simplicity at least for now.

I suspect gitlink/subproject people might want to modify in-core
representation of a tree to graft in subproject directory
somewhere in the superproject, just like the history traversal
code modifies in-core representation of a commit to simplify
parents.  Your approach might turn out to be the right thing to
for that application --- populate tree objects in the in-core
obj_hash[], muck with its entries and then have everybody else
go through get_tree_desc_from_sha1() interface to pretend as if
the superproject has everything contained in the subproject
tree.  I dunno.

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]