Comment # 3
on bug 99488
from Jan Vesely
(In reply to nixscripter from comment #1)
> I'm still trying some versions in order to help you guys pin this down (it's
> not always easy to tell what reinstall is having what effect, since Arch
> Linux has three packages involved). In the mean time, I did the basics on
> the process in its hung state.
>
> It's currently running three threads, two blocked, one continuing to run:
>
> (gdb) info threads
>   Id   Target Id         Frame
> * 1    Thread 0x39ac9cdf7c0 (LWP 3806) "display" 0x0000039abefef921 in
> llvm::MachineInstr::findRegisterDefOperandIdx(unsigned int, bool, bool,
> llvm::TargetRegisterInfo const*) const () from /usr/lib/libLLVM-5.0svn.so

can you get a backtrace of this thread? does it ever leave this function? you can check by adding a breakpoint on that function and seeing whether it gets hit again. this can be repeated going up the stack to find the function that never returns.

>   2    Thread 0x39abd04f700 (LWP 3809) "radeon_cs:0" 0x0000039ac6b0310f in
> pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
>   3    Thread 0x39abadd4700 (LWP 3814) "display" futex_wait (val=8,
> addr=0x25349d4)
>     at /build/gcc-multilib/src/gcc/libgomp/config/linux/x86/futex.h:44
> (gdb)
>
>
> What is that call to findRegisterDefOperandIdx doing?

there's a loop in it. it can't be infinite, but if the number of operands is corrupted, it can take a very long time to finish. can you check "p e" in gdb?

> It's not entirely
> clear, but it's sucking up a lot of memory. Running strace confirms that:
>
> strace: Process 3806 attached with 3 threads
> strace: [ Process PID=3806 runs in x32 mode. ]
> [pid  3809] futex(0x2599e64, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
> [pid  3814] futex(0x25349d4, FUTEX_WAIT_PRIVATE, 8, NULL <unfinished ...>
> [pid  3806] mmap(NULL, 8392704, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x640f4000
> strace: [ Process PID=3806 runs in 64 bit mode. ]
> [pid  3806] mmap(NULL, 8392704, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x39a638f3000
> [pid  3806] mmap(NULL, 8392704, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x39a630f2000
> [pid  3806] mmap(NULL, 8392704, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x39a628f1000
> [...]
>
> And down the address space it goes, 0x1000 bytes (4k) a time or two per
> second.

the mmaps above show 8M allocations (+4K, probably for bookkeeping), not 4K ones. are there any other allocations that are not shown? I haven't found anything in the mentioned function that would need such a big amount of memory, so the hang is probably higher in the call stack.

>
> Looking at the function name, I'm thinking about what Jan said on another
> bug:
>
> > the hang is probably a separate bug. ImageMagick test suite results on my Turks GPU are:
> > # TOTAL: 86
> > # PASS:  78
> > # SKIP:  0
> > # XFAIL: 0
> > # FAIL:  3
> > # XPASS: 0
> > # ERROR: 5
> >
> > the errors and failures are accompanied by:
> > Assertion `i < getNumRegs() && "Register number out of range!"' failed.
>
> Could this be perhaps the same registers that were out of range on a
> different card?

all cards of one class have the same number of architecturally available registers. I see you have debug symbols; is that a debug build? if it is not, the assert may simply not be hit, and the hang is just fallout from the same problem.

>
> Either way, I will continue to investigate, and hope to narrow down the
> issue soon.

thanks.
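
for reference, a quick sanity check of the numbers in the strace output quoted above. this is only a throwaway Python sketch: the helper names are made up for illustration, and the only real data are the mmap length and the three 64-bit return addresses copied from the log. it suggests that each mapping is 8 MiB + 4 KiB and that consecutive mappings sit exactly one such length apart, rather than 0x1000 bytes apart.

```python
# Values copied from the strace output quoted in this comment.
MMAP_LENGTH = 8392704
ADDRESSES = [0x39a638f3000, 0x39a630f2000, 0x39a628f1000]

MIB = 1024 * 1024
KIB = 1024


def split_length(length):
    """Split a byte count into whole MiB and the remainder in KiB."""
    whole_mib, rest = divmod(length, MIB)
    return whole_mib, rest // KIB


def strides(addresses):
    """Distance in bytes between each mapping and the next (lower) one."""
    return [hi - lo for hi, lo in zip(addresses, addresses[1:])]


print(split_length(MMAP_LENGTH))              # (8, 4)
print([hex(s) for s in strides(ADDRESSES)])   # ['0x801000', '0x801000']
```

so the address space shrinks by 0x801000 bytes per call, which matches the requested length.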