Seiji Kihara <kihara.seiji@xxxxxxxxxxxxx> wrote:
We found that fsync and fdatasync syscalls sometimes don't sync data in an ext3 file system under the following conditions.
1. Kernel version is 2.6.6 or later (including 2.6.8.1 and 2.6.9-rc2). 2. Ext3's journalling mode is "data=journal".
The problem was occurred since 2.6.5-bk1, which includes the patch "[PATCH] ext3 fsync() and fdatasync() speedup". We found that the problem was solved by deleting the part of the patch which modifies ext3_sync_file(). Maybe, i_state is not correctly set to I_DIRTY when the related page cache is dirty (is it true?)
I have a few qmail (about the heaviest fsync using mta software around) boxes that have their queues on ext3.
On a 2.6.7 kernel, these guys are guaranteed to crash within hours if I used data=journal for the fs on which the qmail queues are. I say this because I ran two of them with data=journal mode and they crashed once or more a day. Another one which stayed with ordered had no problems during the same period.
Going back to ordered meant that they ran stable for days (weeks now).
The only thing I could get from the logs is:
---------------------------
Aug 17 05:58:22 mta1-7 kernel: Assertion failure in __journal_drop_transaction() at fs/jbd/checkpoint.c:613: "transaction->t
_forget == NULL"
Aug 17 05:58:22 mta1-7 kernel: ------------[ cut here ]------------
Aug 17 05:58:22 mta1-7 kernel: kernel BUG at fs/jbd/checkpoint.c:613!
Aug 17 05:58:22 mta1-7 kernel: invalid operand: 0000 [#1]
Aug 17 05:58:22 mta1-7 kernel: SMP
Aug 17 05:58:22 mta1-7 kernel: Modules linked in: nfs lockd sunrpc e1000 e100 mii usbcore
Aug 17 05:58:22 mta1-7 kernel: CPU: 0
Aug 17 05:58:22 mta1-7 kernel: EIP: 0060:[<c0193db0>] Not tainted
Aug 17 05:58:22 mta1-7 kernel: EFLAGS: 00010202 (2.6.7)
_______________________________________________ Ext3-users@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/ext3-users