I have found a stack overflow in write_one() in builtin/pack-objects.c,
where it recurses endlessly. The cause is a cycle in the delta chain: the
object_entry e and e->delta->delta are the same entry. I have no idea how
that happened, though.
First, the full story:
I used Google's repo tool to mirror AOSP to my machine. This mirrors
several kernel trees (six last time I counted), which do not share
objects with one another. To save space, I decided to point their
objects/info/alternates to my mirror of the Linus kernel tree (which
should be safe, since Linus's tree only ever fast-forwards), and run "git
gc" on them to create a smaller pack. This worked for all trees except
one, where it dumped core (abrt report at
https://bugzilla.redhat.com/show_bug.cgi?id=755132).
I compiled the latest git (v1.7.8-rc3-17-gf56ef11) to see if it still
happened, and here is what I could get from gdb. I attached to the
pack-objects process before it crashed (full command line "git
pack-objects --keep-true-parents --honor-pack-keep --non-empty --all
--reflog --unpack-unreachable --local --delta-base-offset
/home/cesarb/src/bug755132/omap.git/objects/pack/.tmp-5171-pack"),
continued, and let it crash:
(gdb) cont
Continuing.
[New Thread 0x7f3f2bad3700 (LWP 5205)]
[New Thread 0x7f3f2b2d2700 (LWP 5206)]
[New Thread 0x7f3f2aad1700 (LWP 5207)]
[New Thread 0x7f3f2a2d0700 (LWP 5208)]
[Thread 0x7f3f2b2d2700 (LWP 5206) exited]
[Thread 0x7f3f2bad3700 (LWP 5205) exited]
[Thread 0x7f3f2aad1700 (LWP 5207) exited]
[Thread 0x7f3f2a2d0700 (LWP 5208) exited]
Program received signal SIGSEGV, Segmentation fault.
0x00000000004472b9 in write_one (f=0x6a97db0, e=0x7f3f30233490,
offset=0x7fff79b53908) at builtin/pack-objects.c:415
415 {
Unlike on Fedora's git binary, where it happened on a call instruction,
this time it happened on a push instruction:
(gdb) disassemble
Dump of assembler code for function write_one:
0x00000000004472b0 <+0>: push %r15
0x00000000004472b2 <+2>: push %r14
0x00000000004472b4 <+4>: push %r13
0x00000000004472b6 <+6>: mov %rdx,%r13
=> 0x00000000004472b9 <+9>: push %r12
0x00000000004472bb <+11>: mov %rdi,%r12
The last few frames on the stack show the endless recursion:
(gdb) where
#0 0x00000000004472b9 in write_one (f=0x6a97db0, e=0x7f3f30233490,
offset=0x7fff79b53908) at builtin/pack-objects.c:415
#1 0x00000000004472ed in write_one (f=0x6a97db0, e=0x7f3f30277390,
offset=0x7fff79b53908) at builtin/pack-objects.c:423
#2 0x00000000004472ed in write_one (f=0x6a97db0, e=0x7f3f30233490,
offset=0x7fff79b53908) at builtin/pack-objects.c:423
#3 0x00000000004472ed in write_one (f=0x6a97db0, e=0x7f3f30277390,
offset=0x7fff79b53908) at builtin/pack-objects.c:423
#4 0x00000000004472ed in write_one (f=0x6a97db0, e=0x7f3f30233490,
offset=0x7fff79b53908) at builtin/pack-objects.c:423
And here is the loop in the data structures:
(gdb) p e
$1 = (struct object_entry *) 0x7f3f30233490
(gdb) p e->delta
$2 = (struct object_entry *) 0x7f3f30277390
(gdb) p e->delta->delta
$3 = (struct object_entry *) 0x7f3f30233490
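To illustrate the kind of loop shown above, here is a minimal sketch in C of how a cycle in a chain of delta pointers could be detected before recursing into it. This is not git's code: struct entry and delta_chain_has_cycle() are hypothetical names made up for the example, and only the delta field mirrors what the gdb output shows.

```c
#include <stddef.h>

/*
 * Simplified, hypothetical stand-in for git's struct object_entry.
 * Only the delta link is kept; everything else is omitted.
 */
struct entry {
	struct entry *delta;	/* base this entry is a delta against, or NULL */
};

/*
 * Return 1 if following ->delta from e ever comes back to an entry
 * already visited, 0 if the chain ends in NULL.
 * Uses Floyd's two-pointer walk, so it needs no extra memory.
 */
static int delta_chain_has_cycle(const struct entry *e)
{
	const struct entry *slow = e, *fast = e;

	while (fast && fast->delta) {
		slow = slow->delta;		/* advance one link */
		fast = fast->delta->delta;	/* advance two links */
		if (slow == fast)
			return 1;
	}
	return 0;
}
```

With two entries pointing at each other, as e and e->delta do here, this function would return 1, whereas a recursion that blindly follows ->delta never terminates.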
Unfortunately, I do not know enough of git's internals to debug further.
In case it helps, here are the contents of a few of these structures:
(gdb) p *e
$4 = {idx = {
sha1 = "\257>J\241)\266\023\064\a\342J\320\375ӆ\262M\245",
<incomplete sequence \356>, crc32 = 0, offset = 0}, size = 20,
in_pack = 0x259b610,
in_pack_offset = 231061238, delta = 0x7f3f30277390,
delta_child = 0x7f3f30277390, delta_sibling = 0x7f3f30413b10,
delta_data = 0x0, delta_size = 20, z_delta_size = 0, hash = 2099915708,
type = OBJ_OFS_DELTA, in_pack_type = OBJ_OFS_DELTA,
in_pack_header_size = 5 '\005', preferred_base = 0 '\000',
no_try_delta = 0 '\000', tagged = 0 '\000', filled = 1 '\001'}
(gdb) p *(e->delta)
$5 = {idx = {
sha1 =
"\372\307\035\372\017\350\307\f\310R\t\236\006\034\063N*T\216\253",
crc32 = 0, offset = 0}, size = 14, in_pack = 0x259b610,
in_pack_offset = 39990, delta = 0x7f3f30233490,
delta_child = 0x7f3f30233490, delta_sibling = 0x0, delta_data = 0x0,
delta_size = 14, z_delta_size = 0, hash = 2099915708,
type = OBJ_REF_DELTA,
in_pack_type = OBJ_REF_DELTA, in_pack_header_size = 21 '\025',
preferred_base = 0 '\000', no_try_delta = 0 '\000', tagged = 0 '\000',
filled = 1 '\001'}
(gdb) p *(e->in_pack)
$6 = {next = 0x25a53c0, windows = 0x259bc40, pack_size = 449155894,
index_data = 0x7f3f4f0a9000, index_size = 58351420,
num_objects = 2083941,
num_bad_objects = 0, bad_object_sha1 = 0x0, index_version = 2,
mtime = 1321387261, pack_fd = -1, pack_local = 1, pack_keep = 0,
do_not_close = 0, sha1 =
"\371Q4\177.ȳv\364\246\332Z\234\025?\352ݠP\210",
pack_name = 0x259b671
"/home/cesarb/src/bug755132/omap.git/objects/pack/pack-f951347f2ec8b376f4a6da5a9c153feadda05088.pack"}
I tried using "git fsck" to see if it could find anything strange, but
it seems to get stuck (using 100% CPU) after these lines:
[...]
Checking commit fb630b9fc902e24209166b1659a8b375bf38099c
Checking tree fc32c012c750084eb1d82782cee7c80a45a78289
Checking blob fc7bbba585cee2c2b0d5282c42fb986bfb032a0a
Checking commit fdcb23634c9b6649bb02c681033d4973491b0e35
Checking tree fe773cf73ff553249be2f24ddf770f5dc43a41f1
Checking blob fe67b5c79f0ff33d92ebe7469a89c5a5d044fc0a
Checking blob fe73276e026bf263f494a917c84c6a3fcaeaaeda
Checking tree fe30eda9d92d074816f9c3a47fd3ffb9b89ca835
Checking tree fe9c75396e6d433b289d0e40c7e47921b91cad3a
Checking blob ff3ed6086ce1c6b6b4b5111c034d14a208c0d045
Checking blob ff66638ff54d5ad7067e4f246d392059eef1a7bf
Checking tree ff126d2bc67017199049ddba761979f3bda57eb9
Unfortunately, the reproducer I have (a copy of both trees with
objects/info/alternates modified) is 1.8G in size, and I do not know how
to create a smaller reproducer. If you know of a command which would get
more relevant information from them, just ask; I plan on keeping them
around for a while.
--
Cesar Eduardo Barros
cesarb@xxxxxxxxxx
cesar.barros@xxxxxxxxx