Re: GPF in index-pack

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 6 Aug 2006 00:08:48 -0400 Shawn Pearce wrote:

> Jon Smirl <jonsmirl@xxxxxxxxx> wrote:
> > On 8/5/06, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
> > >Jon Smirl <jonsmirl@xxxxxxxxx> wrote:
[...]
> > >> Why does resolve_delta in index-pack.c need to be recursive? Is there
> > >> a better way to code that routine? If it mmaps the file that uses 1GB
> > >> address space, why does it need another 1.5GB to build an index?
> > >
> > >Probably the easiest way to code the routine.  Delta depth is
> > >bounded; in the fast-import.c that I sent out last night I hardcoded
> > >it to 10, which is (I believe) the default for GIT.  So long as that
> > >routine is recursive only along a single delta chain the recursion
> > >depth won't be very high and shouldn't be the problem.
> > 
> > When I put index-pack in gdb at the seg fault, resolve_delta had
> > recursed more than 20,000 times. I stopped looking after that.
> 
> Ouch.  I'm not familiar with this code, but looking at right now its
> also not entirely obviously what its recursing for.  Plus dinner
> is trying to be burned on the grill, so my attention is on that
> more than on GIT.  :-)

git-pack-objects never creates a pack file with duplicate objects,
therefore git-index-pack was never tested on such pack files - no wonder
that it breaks.

The case of patch revert (A -> B -> A again) is probably the problem -
your dumb pack generator will probably write this:

	A
	B (delta based on A)
	A (delta based on B)

git-index-pack will first unpack the first copy of A, then notice that A
is used as a delta base for B and apply the delta, then it will find the
second copy of A and apply that delta, and then it will find B again...

Please try the patch at the end of this message - it should help to
avoid the infinite recursion in git-index-pack.  However, I'm not sure
that other git parts won't do something bad when they encounter an index
with duplicate sha1 entries (and git-index-pack cannot remove these
duplicates, because the number of index entries must match the pack
header).

> > >> I had a prior 400MB pack file built with fast-import that I was able
> > >> to index ok.
> > >
> > >Dumb luck?  Maybe that had no duplicates while this one does?
> > 
> > Is there a git command to list the sha1's in a pack that doesn't have
> > an index? I could sort it, sort it unqiue, and then diff the outputs.

git-index-pack :)

(git-unpack-objects will also have all sha1 values at the end, but the
side effect of unpacking all objects to separate files is probably not
what you want to get.)

> Not that I know of.  Packs themselves don't have the SHA1 values and
> getting them from a pack without an index is a painful exercise as
> you don't know where the base of an object resides within the pack
> when you need it to generate the object's raw content to determine
> its ID.

Yes, this is a problem.  git-unpack-objects can unpack all objects in a
single pass, but only because it temporarily saves all deltas which
cannot be resolved yet, and can read back objects which it has written
before.  git-index-pack needs to work without modifying the object
database, so it works in two passes:

	/*
	 * First pass:
	 * - find locations of all objects;
	 * - calculate SHA1 of all non-delta objects;
	 * - remember base SHA1 for all deltas.
	 */

	/*
	 * Second pass:
	 * - for all non-delta objects, look if it is used as a base for
	 *   deltas;
	 * - if used as a base, uncompress the object and apply all deltas,
	 *   recursively checking if the resulting object is used as a base
	 *   for some more deltas.
	 */

-----------------------------------------------------------------------

From: Sergey Vlasov <vsu@xxxxxxxxxxx>
Date: Sun, 6 Aug 2006 17:28:29 +0400
Subject: [PATCH] git-index-pack: Avoid infinite recursion if the pack has duplicate objects

Although git-pack-objects never creates packs which contain the same
object more than once, other tools may be not so careful, so add a check
to prevents infinite recursion of resolve_delta() for such packs.

Signed-off-by: Sergey Vlasov <vsu@xxxxxxxxxxx>
---
 index-pack.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/index-pack.c b/index-pack.c
index b39953d..a8e3b1f 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -257,6 +257,13 @@ static void resolve_delta(struct delta_e
 	unsigned long next_obj_offset;
 	int j, first, last;
 
+	/*
+	 * Do nothing if this delta was resolved earlier (this can happen if
+	 * the pack contains duplicate objects for some reason).
+	 */
+	if (obj->real_type != OBJ_DELTA)
+		return;
+
 	obj->real_type = type;
 	delta_data = unpack_raw_entry(obj->offset, &delta_type,
 				      &delta_size, base_sha1,
-- 
1.4.2.rc3.g23aa

Attachment: pgpbep3OdJ6iK.pgp
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]