Re: [PATCH 4/5] jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



在 2023/6/2 0:31, Jan Kara 写道:
On Thu 01-06-23 22:20:38, Zhang Yi wrote:
On 2023/6/1 21:44, Zhihao Cheng wrote:
在 2023/6/1 17:41, Jan Kara 写道:

Hi, Jan
On Wed 31-05-23 19:50:59, Zhang Yi wrote:
From: Zhihao Cheng <chengzhihao1@xxxxxxxxxx>

Following process,

jbd2_journal_commit_transaction
// there are several dirty buffer heads in transaction->t_checkpoint_list
            P1                   wb_workfn
jbd2_log_do_checkpoint
   if (buffer_locked(bh)) // false
                              __block_write_full_page
                               trylock_buffer(bh)
                               test_clear_buffer_dirty(bh)
   if (!buffer_dirty(bh))
    __jbd2_journal_remove_checkpoint(jh)
     if (buffer_write_io_error(bh)) // false
                               >> bh IO error occurs <<
   jbd2_cleanup_journal_tail
    __jbd2_update_log_tail
     jbd2_write_superblock
     // The bh won't be replayed in next mount.
, which could corrupt the ext4 image, fetch a reproducer in [Link].

Since writeback process clears buffer dirty after locking buffer head,
we can fix it by checking buffer dirty firstly and then checking buffer
locked, the buffer head can be removed if it is neither dirty nor locked.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217490
Fixes: 470decc613ab ("[PATCH] jbd2: initial copy of files from jbd")
Signed-off-by: Zhihao Cheng <chengzhihao1@xxxxxxxxxx>
Signed-off-by: Zhang Yi <yi.zhang@xxxxxxxxxx>

OK, the analysis is correct but I'm afraid the fix won't be that easy.  The
reordering of tests you did below doesn't really help because CPU or the
compiler are free to order the loads (and stores) in whatever way they
wish. You'd have to use memory barriers when reading and modifying bh flags
(although the modification side is implicitely handled by the bitlock
code) to make this work reliably. But that is IMHO too subtle for this
code.



I write two litmus-test files in tools/memory-model to check the memory module
of these two cases as jbd2_log_do_checkpoint() and __cp_buffer_busy() does.

<snip litmus tests>

So it looks like the out-of-order situation cannot happen, am I miss something?

After thinking about it a bit, indeed CPU cannot reorder the two loads
because they are from the same location in memory. Thanks for correcting me
on this. I'm not sure whether a C compiler still could not reorder the
tests - technically I suspect the C standard does not forbid this although
it would have to be really evil compiler to do this.

But still I think with the helper function I've suggested things are much
more obviously correct.

Thanks for suggestions, we will modify it in next iteration.



[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux