Hi,

On Sat, Nov 23, 2002 at 08:18:26PM +0000, Jeff Schaller wrote:

> Boy, it'd help if I actually attached the log, eh?

Yes. :)

Content-Description: kern.log
> Nov 19 06:58:47 debian kernel: attempt to access beyond end of device
> Nov 19 06:58:47 debian kernel: 09:00: rw=1, want=39121480, limit=39121408

Corruption on disk.  Undiagnosable without more info --- it could be
hardware or software.

> Nov 19 06:58:47 debian kernel: Assertion failure in __journal_remove_journal_head() at journal.c:1732: "buffer_jbd(bh)"
> Nov 19 06:58:47 debian kernel: kernel BUG at journal.c:1732!

That's a core driver layer bug which we found recently, but it's too
close to 2.4.20 to include the fix.  Basically, on an out-of-bounds IO
the ll_rw_block code was clearing most of the bits in the buffer_head
state, leading to the above assert failure when ext3 found its critical
metadata had been corrupted.

Patch is below, I'll send it to Marcelo for 2.4.21-pre.

--Stephen
--- linux-2.4-ext3merge/drivers/block/ll_rw_blk.c.=K0026=.orig	Mon Nov 25 15:03:18 2002
+++ linux-2.4-ext3merge/drivers/block/ll_rw_blk.c	Mon Nov 25 15:04:00 2002
@@ -1129,7 +1129,7 @@
 
 		if (maxsector < count || maxsector - count < sector) {
 			/* Yecch */
-			bh->b_state &= (1 << BH_Lock) | (1 << BH_Mapped);
+			bh->b_state &= ~(1 << BH_Dirty);
 
 			/* This may well happen - the kernel calls bread()
 			   without checking the size of the device, e.g.,
@@ -1140,7 +1140,6 @@
 			       kdevname(bh->b_rdev), rw,
 			       (sector + count)>>1, minorsize);
 
-			/* Yecch again */
 			bh->b_end_io(bh, 0);
 			return;
 		}