Hi all,

One of our MX servers started misbehaving today (postfix would time out
internally while the load kept rising), and after I tried to reboot it, I got
this:

Assertion failure in __journal_drop_transaction() at fs/jbd/checkpoint.c:613: "transaction->t_forget == NULL"
------------[ cut here ]------------
kernel BUG at fs/jbd/checkpoint.c:613!
invalid operand: 0000 [#1]
PREEMPT SMP
Modules linked in: ipv6 aic79xx serverworks eepro100 sworks_agp agpgart floppy evdev pcspkr ohci_hcd usbcore e100 mii capability commoncap ide_cd ide_core cdrom rtc ext2 ext3 jbd mbcache sd_mod aic7xxx scsi_mod raid1 md unix font vesafb cfbcopyarea cfbimgblt cfbfillrect
CPU:    0
EIP:    0060:[<f88a77a0>]    Not tainted
EFLAGS: 00010286   (2.6.8-1-686-smp)
EIP is at __journal_drop_transaction+0x350/0x3f7 [jbd]
eax: 00000071   ebx: c86f5620   ecx: c02d9fbc   edx: c02d9fbc
esi: f7af5800   edi: e9d5583c   ebp: c86f5620   esp: f70e3d58
ds: 007b   es: 007b   ss: 0068
Process kjournald (pid: 523, threadinfo=f70e2000 task=f70a37f0)
Stack: f88ac120 f88ab380 f88acefc 00000265 f88acf4c c86f5620 f7af5800 f88a731a
       f7af5800 c86f5620 dbc24db0 00000000 f88a66e6 e9d5583c e9d5583c f70e2000
       f88a72c8 e9d5583c e9d5583c 000000e1 c86f55c0 d865c500 f70e2000 00000000
Call Trace:
 [<f88a731a>] __journal_remove_checkpoint+0x4a/0xa0 [jbd]
 [<f88a66e6>] __try_to_free_cp_buf+0x76/0xc0 [jbd]
 [<f88a72c8>] __journal_clean_checkpoint_list+0xa8/0xb0 [jbd]
 [<f88a4958>] journal_commit_transaction+0x2b8/0x1690 [jbd]
 [<c011e420>] autoremove_wake_function+0x0/0x60
 [<c02347ca>] netif_receive_skb+0x1ba/0x230
 [<c011e420>] autoremove_wake_function+0x0/0x60
 [<c023450a>] net_tx_action+0x5a/0x160
 [<c0119f78>] recalc_task_prio+0xa8/0x1a0
 [<c029c9b7>] schedule+0x4b7/0x8a0
 [<c01296da>] del_timer_sync+0x9a/0xe0
 [<f88a8642>] kjournald+0xf2/0x2e0 [jbd]
 [<c011e420>] autoremove_wake_function+0x0/0x60
 [<c011e420>] autoremove_wake_function+0x0/0x60
 [<c01060d2>] ret_from_fork+0x6/0x14
 [<f88a8530>] commit_timeout+0x0/0x10 [jbd]
 [<f88a8550>] kjournald+0x0/0x2e0 [jbd]
 [<c01042c5>] kernel_thread_helper+0x5/0x10
Code: 0f 0b 65 02 fc ce 8a f8 e9 6e fd ff ff 8d 76 00 c7 04 24 20
 <6>note: kjournald[523] exited with preempt_count 3

The machine is still alive and kicking (and would not reboot even with
reboot -f), most probably because the system is on one disk and the postfix
spool is on another, which seems to be the cause of the problem.

Relevant /etc/fstab line:

/dev/sdc1 /var/spool/postfix ext3 rw,noatime,nodiratime,data=journal,commit=60 0 0

It also has a large journal, 400MB if I remember correctly.

Debian Sarge.

-- 
Jure Pečar
http://jure.pecar.org/

_______________________________________________
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users