Re: [PATCH 2/2] Implement a simple delta_base cache

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, 17 Mar 2007, Junio C Hamano wrote:

> When unpacking a depth-3 deltified object A, the code finds the
> target object A (which is a delta), ask for its base B and put B
> in the cache after using it to reconstitute A.  While doing so,
> the first-generation base B is also a delta so its base C (which
> is a non-delta) is found and placed in the cache.  When A is
> returned, the cache has B and C.  If you ask for B at this
> point, we read the delta, pick up its base C from the cache,
> apply, and return while putting C back in the cache.  If you ask
> for A after that, we do not read from the cache, although it is
> available.
> 
> Which feels a bit wasteful at first sight, and we *could* make
> read_packed_sha1() also steal from the cache, but after thinking
> about it a bit, I am not sure if it is worth it.  The contract
> between read_packed_sha1() and read_sha1_file() and its callers
> is that the returned data belongs to the caller and it is a
> responsibility for the caller to free the buffer, and also the
> caller is free to modify it, so stealing from the cache from
> that codepath means an extra allocation and memcpy.

So?

A malloc() + memcpy() will always be faster than mmap() + malloc() + 
inflate().  If the data is already there it is certainly better to copy 
it straight away.

With the patch below I can do 'git log drivers/scsi/ > /dev/null' about 
7% faster.  I bet it might be even more on those platforms with bad 
mmap() support.

Signed-off-by: Nicolas Pitre <nico@xxxxxxx>
---
diff --git a/sha1_file.c b/sha1_file.c
index a7e3a2a..ee64865 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1372,7 +1372,7 @@ static unsigned long pack_entry_hash(struct packed_git *p, off_t base_offset)
 }
 
 static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
-	unsigned long *base_size, enum object_type *type)
+	unsigned long *base_size, enum object_type *type, int keep_cache)
 {
 	void *ret;
 	unsigned long hash = pack_entry_hash(p, base_offset);
@@ -1384,7 +1384,13 @@ static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
 	return unpack_entry(p, base_offset, type, base_size);
 
 found_cache_entry:
-	ent->data = NULL;
+	if (!keep_cache)
+		ent->data = NULL;
+	else {
+		ret = xmalloc(ent->size + 1);
+		memcpy(ret, ent->data, ent->size);
+		((char *)ret)[ent->size] = 0;
+	}
 	*type = ent->type;
 	*base_size = ent->size;
 	return ret;
@@ -1418,7 +1424,7 @@ static void *unpack_delta_entry(struct packed_git *p,
 	off_t base_offset;
 
 	base_offset = get_delta_base(p, w_curs, &curpos, *type, obj_offset);
-	base = cache_or_unpack_entry(p, base_offset, &base_size, type);
+	base = cache_or_unpack_entry(p, base_offset, &base_size, type, 0);
 	if (!base)
 		die("failed to read delta base object"
 		    " at %"PRIuMAX" from %s",
@@ -1615,7 +1621,7 @@ static void *read_packed_sha1(const unsigned char *sha1,
 	if (!find_pack_entry(sha1, &e, NULL))
 		return NULL;
 	else
-		return unpack_entry(e.p, e.offset, type, size);
+		return cache_or_unpack_entry(e.p, e.offset, size, type, 1);
 }
 
 /*
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]