Re: epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack))

On Wed, 10 Dec 2008, R. Tyler Ballance wrote:
> 
> I decided to endure the 30 minutes long this took on machine, and ran
> the operation in gdb. As a result, I got the SIGSEGV again, and a 13MB
> stacktrace.
> 
> In fact, the stack trace was probably longer, but this happened while I
> printed out `bt full`:

Wow. You even got _gdb_ to segfault.

You're my hero. If it can break, you will do it.

> I think I'm going to need to have a drink :-/

Have one for me too.

Anyway, that's a really annoying problem, and it's a bug in git. 
Admittedly it's probably brought on by you having a fairly small stack 
ulimit, which is also what likely brought gdb to its knees.

That stupid fsck commit walker walks the parents recursively. That's 
horribly bogus. So you have a recursion that goes from the top-level 
commit all the way to the root, doing

	fsck_walk_commit -> walk(parent) -> fsck_walk_commit -> ..

and you have a fairly deep commit tree. 

When it hits parent number 80,000+, you run out of stack space, and 
SIGSEGV. And judging by the fact that gdb also SIGSEGV's for you when 
doing the backtrace, it looks like the gdb backtrace tracer is _also_ 
recursive, and _also_ hits the same issue ;)

Anyway, with an 8M stack-size I can fsck the kernel repo without any 
problem, but while the kernel repo has something like 120k commits in it, 
it's a very "bushy" repository (lots of parallelism and merges), and the 
path from the top parent to the root is actually much shorter, at just 27k 
commits.

I take it that your project has a very long and linear history, which is 
why you have a long path from your HEAD to your root.

(You can do something like

	git rev-list --first-parent HEAD | wc -l

to get the depth of your history when just walking the first parent, and 
if I'm right you'll have a number that is bigger than 80k.)

So you have definitely found a real bug. Right now, you should be able to 
work around it by just making your stack size limit rather bigger (for 
example with "ulimit -s" in the shell you run fsck from). The recursion is 
not very complicated, so even though it's 80,000 deep, each entry probably 
is about a hundred bytes on the stack. 

In fact, if you're on Linux, the default stack size limit is usually 8 MB, 
which would roughly match that "80k entries of 100 bytes each".
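
The arithmetic above can be checked with a small sketch. It assumes a POSIX
system for getrlimit(); the helper names max_depth_estimate() and
soft_stack_limit() are made up for this illustration and are not part of git.
The 100-byte frame size is only the rough guess from the text, not a measured
value.

```c
#include <sys/resource.h>

/*
 * Rough upper bound on recursion depth: how many frames of
 * frame_bytes fit into stack_bytes. Ignores everything else
 * that lives on the stack, so the real limit is a bit lower.
 */
unsigned long long max_depth_estimate(unsigned long long stack_bytes,
				      unsigned long long frame_bytes)
{
	return frame_bytes ? stack_bytes / frame_bytes : 0;
}

/*
 * Soft stack size limit of the current process in bytes,
 * or 0 if it is unlimited or cannot be queried.
 */
unsigned long long soft_stack_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
		return 0;
	return (unsigned long long)rl.rlim_cur;
}
```

With an 8 MB limit and roughly 100 bytes per frame,
max_depth_estimate(8ULL * 1024 * 1024, 100) comes out a little above 80,000,
which lines up with where the crash was seen.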

But we should definitely fix this braindamage in fsck. Rather than 
recursively walk the commits, we should add them to a commit list and just 
walk the list iteratively.
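
A minimal sketch of that idea, assuming a toy commit structure rather than
git's real one. Everything named toy_* here is hypothetical and exists only
for this illustration; it is not the fsck code and not a proposed patch. The
point is just that the pending commits live in a heap-allocated list, so the
C stack usage stays constant no matter how long the history is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy commit, NOT git's struct commit. */
struct toy_commit {
	struct toy_commit *parents[2];	/* NULL where there is no parent */
	int seen;
};

/* Hypothetical work list: a growable array used as a stack. */
struct toy_list {
	struct toy_commit **items;
	size_t nr, alloc;
};

void toy_list_push(struct toy_list *l, struct toy_commit *c)
{
	if (l->nr == l->alloc) {
		l->alloc = l->alloc ? 2 * l->alloc : 64;
		l->items = realloc(l->items, l->alloc * sizeof(*l->items));
		assert(l->items != NULL);
	}
	l->items[l->nr++] = c;
}

/*
 * Visit every commit reachable from tip exactly once and return
 * how many were visited. No recursion: parents are queued on the
 * work list instead of being walked via nested calls.
 */
size_t toy_walk(struct toy_commit *tip)
{
	struct toy_list pending = { NULL, 0, 0 };
	size_t visited = 0;

	if (!tip)
		return 0;
	toy_list_push(&pending, tip);
	while (pending.nr) {
		struct toy_commit *c = pending.items[--pending.nr];

		if (c->seen)
			continue;	/* queued twice via a merge */
		c->seen = 1;
		visited++;
		for (int i = 0; i < 2; i++)
			if (c->parents[i] && !c->parents[i]->seen)
				toy_list_push(&pending, c->parents[i]);
	}
	free(pending.items);
	return visited;
}

/* Build a purely linear history of n commits and walk it from the tip. */
size_t toy_walk_linear(size_t n)
{
	struct toy_commit *all;
	size_t visited;

	if (n == 0)
		return 0;
	all = calloc(n, sizeof(*all));
	assert(all != NULL);
	for (size_t i = 1; i < n; i++)
		all[i].parents[0] = &all[i - 1];
	visited = toy_walk(&all[n - 1]);
	free(all);
	return visited;
}
```

Calling toy_walk_linear(200000) walks a linear chain well past the 80k depth
discussed above without deepening the call stack, since the only thing that
grows is the heap-allocated list.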

Junio?

		Linus