On Sun, Aug 02, 2020 at 10:46:52PM -0400, David Niklas wrote:
> The file: Documentation/admin-guide/bug-hunting.rst
> gives instructions on how to find the offending line of code using EIP.
> My particular bug that I need to report doesn't have EIP listed -- at
> least not by name. I tried to guess what EIP was in
> my case "__schedule+0x29e/0x6c0" but that didn't produce any results in
> gdb on my debug kernel.

EIP is the x86-32 name for what x86-64 calls RIP. Not much help in this
case, because it's pointing to a userspace address (actually, I think
it's the vdso).

> I don't appear to have the referenced module type problem:
> "[<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5"
>
> I'm totally new to kernel debugging so the documentation is really
> important and I'm rather frustrated in even bothering to read it as it's
> incomplete/unhelpful.
>
> Here's the exact bug I'm trying to tackle.
>
> [68812.480447] INFO: task CacheThread_Blo:9414 blocked for more than 480 seconds.

That's the important bit. Your task tried to take a mutex and 480
seconds later, it still didn't have it.

> [68812.480459] Not tainted 4.14.184-nopreempt-AMDGPU-dav9 #1
> [68812.480464] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [68812.480469] CacheThread_Blo D 0 9414 9082 0x00000080
> [68812.480476] Call Trace:
> [68812.480494] __schedule+0x29e/0x6c0
> [68812.480505] schedule+0x32/0x80
> [68812.480513] schedule_preempt_disabled+0xa/0x10
> [68812.480520] __mutex_lock.isra.1+0x26b/0x4e0
> [68812.480550] ? do_journal_begin_r+0xbe/0x390 [reiserfs]
> [68812.480570] do_journal_begin_r+0xbe/0x390 [reiserfs]

The mutex it tried to take was in the function do_journal_begin_r().

Best of luck debugging reiserfs problems these days. It doesn't get a
lot of love.
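
In case it helps with reading traces like the one above: each entry has
the form symbol+offset/size, optionally followed by [module], and a
leading "?" marks a frame the unwinder is not sure about. Below is a
rough Python sketch that splits an entry into those parts. The function
name parse_trace_entry and the regular expression are illustrative
assumptions of mine, not anything shipped with the kernel, and the
pattern only covers the entry shapes quoted in this thread.

```python
import re

# Hypothetical helper, not part of the kernel tree.
# Accepts entries such as:
#   "[68812.480550] ? do_journal_begin_r+0xbe/0x390 [reiserfs]"
TRACE_RE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s+)?"          # optional dmesg timestamp
    r"(?P<unreliable>\?\s+)?"            # "?" = unwinder is not sure
    r"(?P<symbol>[A-Za-z_][\w.]*)"       # function name, e.g. foo.isra.1
    r"\+(?P<offset>0x[0-9a-fA-F]+)"      # offset into the function
    r"/(?P<size>0x[0-9a-fA-F]+)"         # total size of the function
    r"(?:\s+\[(?P<module>[^\]]+)\])?$"   # optional module name
)


def parse_trace_entry(entry):
    """Split one call-trace entry into its parts, or return None."""
    match = TRACE_RE.match(entry.strip())
    if match is None:
        return None
    return {
        "symbol": match.group("symbol"),
        "offset": int(match.group("offset"), 16),
        "size": int(match.group("size"), 16),
        "module": match.group("module"),  # None for built-in code
        "reliable": match.group("unreliable") is None,
    }


if __name__ == "__main__":
    line = "[68812.480570] do_journal_begin_r+0xbe/0x390 [reiserfs]"
    print(parse_trace_entry(line))
```

The symbol+offset/size string is also, as far as I know, the form that
scripts/faddr2line in the kernel tree expects as an argument, together
with the object file (vmlinux, or the module's .ko built with debug
info), so it may be worth trying that rather than guessing an address
for gdb.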