Neil Brown <neilb@cse.unsw.edu.au> wrote:
>
> > Could have been an IO error, or the block/MD/device layer returned
> > incorrect data.  ext3 used to go BUG a lot in the latter case, but nowadays
> > we try to abort the journal and go read-only.
> >
> > Without the initial message we do not know.
>
> Can I add a "me too".....

No.  Go away.

> First, I'm using data=journal - is that supposed to work in 2.6 yet?
>

I think so.  It's much less tested than ordered mode, but some people have
beat upon it.

> I have a raid5 array across a bunch of SCSI drives and a separate scsi
> drive with boot, swap, and a journal partition.
> I have an ext3 filesystem on the raid5 array with an external journal
> on the journal partition.

oh.  Good to hear that external journals still work.

> The raid5 was rebuilding a spare and I was pounding the filesystem
> over NFS using the SPEC SFS benchmark program (ofcourse the raid5
> rebuild killed the performance reported by SFS, but I expected that.
>
> Shortly after the rebuild finished, I got an ext3 error (see log
> below) and the journal aborted, and then nfsd Oopsed inside ext3.
> ...
> Aug  6 15:22:05 adams kernel: EXT3-fs error (device md1): ext3_add_entry: bad entry in directory #41009295: rec_len is smaller than minimal - offset=0, inode=3265411686, rec_len=0, name_len=0

It looks like we had a block full of zeroes come back from the device driver.

I find it distinctly fishy how this happens so much with ext3-on-md, and so
little with ext3-on-just-a-disk.

> Aug  6 15:22:05 adams kernel: Remounting filesystem read-only
> Aug  6 15:22:05 adams kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000

Now that's an ext3 bug.  Something like this...

 fs/jbd/transaction.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff -puN fs/jbd/transaction.c~ext3-aborted-journal-fix fs/jbd/transaction.c
--- 25/fs/jbd/transaction.c~ext3-aborted-journal-fix	2003-08-05 23:53:16.000000000 -0700
+++ 25-akpm/fs/jbd/transaction.c	2003-08-05 23:56:47.000000000 -0700
@@ -525,12 +525,18 @@ do_get_write_access(handle_t *handle, st
 			int force_copy, int *credits)
 {
 	struct buffer_head *bh;
-	transaction_t *transaction = handle->h_transaction;
-	journal_t *journal = transaction->t_journal;
+	transaction_t *transaction;
+	journal_t *journal;
 	int error;
 	char *frozen_buffer = NULL;
 	int need_copy = 0;
 
+	if (is_handle_aborted(handle))
+		return -EROFS;
+
+	transaction = handle->h_transaction;
+	journal = transaction->t_journal;
+
 	jbd_debug(5, "buffer_head %p, force_copy %d\n", jh, force_copy);
 
 	JBUFFER_TRACE(jh, "entry");

_
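
To make the ordering in that patch concrete, here is a minimal standalone sketch of the same idea: test the abort state first, and only then follow the pointers hanging off the handle. Everything named sketch_* is a hypothetical, heavily simplified stand-in and not the real jbd code; in particular, the premise that h_transaction can be NULL on an aborted handle is an assumption drawn from the oops above, not something checked against the tree.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for journal_t, transaction_t and handle_t.
 * The real structures carry many more fields. */
struct sketch_journal {
	int j_aborted;
};

struct sketch_transaction {
	struct sketch_journal *t_journal;
};

struct sketch_handle {
	/* Assumption: may be NULL once the handle has been aborted. */
	struct sketch_transaction *h_transaction;
	int h_aborted;
};

/* Simplified abort test: looks only at the flag on the handle itself. */
static int sketch_handle_aborted(const struct sketch_handle *handle)
{
	return handle->h_aborted;
}

/* Same shape as the patched function: declare the pointers first,
 * bail out with -EROFS on an aborted handle, and dereference
 * handle->h_transaction only after that check has passed. */
static int sketch_get_write_access(struct sketch_handle *handle)
{
	struct sketch_transaction *transaction;
	struct sketch_journal *journal;

	if (sketch_handle_aborted(handle))
		return -EROFS;

	transaction = handle->h_transaction;
	journal = transaction->t_journal;

	return journal->j_aborted ? -EROFS : 0;
}
```

With the initialisers left in the declarations, as in the unpatched code, an aborted handle whose h_transaction is NULL would fault before any check could run. With the check first, the caller just gets -EROFS back.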