Actually, the previous error I reported was on a 2.4.16 kernel. Upgrading to
2.4.18 stopped it from hanging the system and got us to the real error message:

kernel: journal_bmap: journal block not found at offset 6924 on sd(8,2)
kernel: Aborting journal on device sd(8,2).
kernel: ext3_abort called.
kernel: EXT3-fs abort (device sd(8,2)): ext3_journal_start: Detected aborted journal
kernel: Remounting filesystem read-only

Any idea how we can recover from here? Is there a way to rebuild an ext3
journal?

> Getting this oops on one of our production servers
> pretty much hangs the server.
> Do we have a corrupted journal? How would we rebuild it?
> Any idea how to recover from it?
>
> Assertion failure in journal_bmap() at journal.c:636: "ret != 0"
> invalid operand: 0000
> CPU: 0
> EIP: 0010:[journal_bmap+70/96] Not tainted
> EIP: 0010:[<c016b646>] Not tainted
> EFLAGS: 00010282
> eax: 00000044 ebx: 00000000 ecx: 00000002 edx: f7121f64
> esi: d93e88e0 edi: f122f700 ebp: f7b58e00 esp: f7a99e50
> ds: 0018 es: 0018 ss: 0018
> Process kjournald (pid: 14, stackpage=f7a99000)
> Stack: c0352a20 c034e401 c034e3a5 0000027c c034e3f8 f7b58e00 c016b5f7
> f7b58e00
> 00001b0c d93e88e0 c0168c8d f7b58e00 f7b58ee4 00000000 00000fdc dc757024
> 00000002 db247c60 f28df240 f122f730 00000001 00000070 00000001 e2bc9ae0
> Call Trace: [journal_next_log_block+103/112]
> [journal_commit_transaction+1661/3856] [do_softirq+123/224]
> [do_IRQ+221/240] [schedule+1113/1296]
> Call Trace: [<c016b5f7>] [<c0168c8d>] [<c011cdcb>] [<c01089bd>]
> [<c0115e59>]
> [kjournald+310/464] [commit_timeout+0/16] [kernel_thread+38/48]
> [kjournald+0/464]
> [<c016aff6>] [<c016aea0>] [<c0105616>] [<c016aec0>]
>
> Code: 0f 0b 83 c4 14 eb 05 8d 76 00 89 c3 89 d8 5b c3 8d 76 00 8d

--
Martial Herbaut
---------------
Server101
Fast and Reliable Hosting!
http://www.server101.com/