Re: [PATCH v1] ext4: fix a assertion failure due to ungranted bh dirting

Jan Kara <jack@xxxxxxx> · Fri, 18 Oct 2024 11:14:44 +0200

On Fri 18-10-24 09:48:17, liubaolin wrote:
> > Hello, I am very sorry.
> > I did not previously understand the approach of your patch to solve the issue.
> > Yesterday, I intentionally injected faults during the quick reproduction
> > test, and indeed, after applying your patch, the crash issue was
> > resolved and did not occur again.
> > I finally understood your approach to solving the problem. Please disregard my previous email.
> > Thank you for helping me solve this crash issue in a better way.
> > I still need to improve my skills in file systems, and I truly appreciate your guidance.

Great! Thanks for testing. I'll send the patch for inclusion then.

								Honza

> 在 2024/10/16 21:38, liubaolin 写道:
> > > Hello,
> > > I reviewed the patch attached in your email. The issue you mentioned
> > > about clearing buffer_new(bh) in write_end_fn() is indeed a bug.
> > > However, this patch does not resolve the crash issue we encountered.
> > > 
> > > Let me explain my analysis in detail below.
> > > The crash occurs in the function jbd2_journal_dirty_metadata().
> > > 
> > > ext4_block_write_begin() -> ext4_journalled_zero_new_buffers() ->
> > > write_end_fn()
> > >  -> ext4_dirty_journalled_data() -> ext4_handle_dirty_metadata() ->
> > > __ext4_handle_dirty_metadata()
> > >  -> jbd2_journal_dirty_metadata()
> > > 
> > > In the function jbd2_journal_dirty_metadata(), there is the
> > > following condition:
> > > —---------------------------------------------------------------------------------------------------
> > >         if (data_race(jh->b_transaction != transaction &&
> > >             jh->b_next_transaction != transaction)) {
> > >                 spin_lock(&jh->b_state_lock);
> > >                 J_ASSERT_JH(jh, jh->b_transaction == transaction ||
> > >                                 jh->b_next_transaction == transaction);
> > >                 spin_unlock(&jh->b_state_lock);
> > >         }
> > > ----------------------------------------------------------------------------------------------------
> > > By analyzing the vmcore, I found that both jh->b_transaction and jh-
> > > >b_next_transaction are NULL.
> > > Through code analysis, I discovered that the
> > > __jbd2_journal_file_buffer() function adds the corresponding
> > > transaction of bh to jh->b_transaction.
> > > Normally, this is accessed through do_journal_get_write_access(),
> > > which can call __jbd2_journal_file_buffer().
> > > The detailed function call process is as follows:
> > > do_journal_get_write_access() -> ext4_journal_get_write_access() ->
> > > __ext4_journal_get_write_access()
> > >  -> jbd2_journal_get_write_access() -> do_get_write_access() ->
> > > __jbd2_journal_file_buffer()
> > > 
> > > 
> > > Therefore, resolving the crash issue requires obtaining write access
> > > before calling the jbd2_journal_dirty_metadata() function.
> > > The comment at the definition of the jbd2_journal_dirty_metadata()
> > > function also states:     'The buffer must have previously had
> > > jbd2_journal_get_write_access().'
> > > 
> > > In the ext4_block_write_begin() function, if get_block() encounters
> > > an error, then neither bh->b_this_page nor the subsequent bh calls
> > > do_journal_get_write_access().
> > > If bh->b_this_page and the subsequent bh are in the new state, it
> > > will lead to a crash when reaching the jbd2_journal_dirty_metadata()
> > > function.
> > > 
> > > So, there are two ways to resolve this crash issue:
> > > 1、Call do_journal_get_write_access() on bh that is not handled due
> > > to get_block() error.
> > >     The patch modification is in the attachment 0001-ext4-fix-a-
> > > assertion-failure-due-to-ungranted-bh-dir.patch.
> > > 
> > > 2、Call clear_buffer_new() on bh that is not handled due to
> > > get_block() error.
> > >     The patch modification is in the attachment 0001-ext4-fix-a-
> > > assertion-failure-due-to-bh-not-clear-new.patch.
> > > 
> > > Additionally, I have found a method to quickly reproduce this crash
> > > issue.
> > > For details, please refer to the email I previously sent you:
> > > “https://lore.kernel.org/all/bd41c24b-7325-4584-
> > > a965-392a32e32c74@xxxxxxx/”.
> > > I have verified that this quick reproduction method works for both
> > > solutions to resolve the issue.
> > > 
> > > Please continue to consider which method is better to resolve this
> > > issue. If you think that using clear_buffer_new() is a better
> > > solution, I can resend the patch via git send-mail.
> > 
> > 
> > 
> > 在 2024/10/16 18:33, Jan Kara 写道:
> > > Hello,
> > > 
> > > On Fri 11-10-24 12:08:58, Baolin Liu wrote:
> > > > Greetings，
> > > > 
> > > > This problem is reproduced by our customer using their own testing tool
> > > > “run_bug”. When I consulted with a client, the testing tool “run_bug”
> > > > used a variety of background programs to benchmark (including memory
> > > > pressure, cpu pressure, file cycle manipulation, fsstress Stress testing
> > > > tool, postmark program，and so on).
> > > > 
> > > > The recurrence probability is relatively low.
> > > 
> > > OK, thanks for asking!
> > > 
> > > > In response to your query, in ext4_block_write_begin, the new state will
> > > > be clear before get block, and the bh that failed get_block will not be
> > > > set to new. However, when the page size is greater than the
> > > > block size, a
> > > > page will contain multiple bh.
> > > 
> > > True. I wanted to argue that the buffer_new bit should be either
> > > cleared in
> > > ext4_block_write_begin() (in case of error) or in
> > > ext4_journalled_write_end() (in case of success) but actually
> > > ext4_journalled_write_end() misses the clearing. So I think the better
> > > solution is like the attached patch. I'll submit it once testing finishes
> > > but it would be great if you could test that it fixes your problems as
> > > well. Thanks!
> > > 
> > >                                 Honza
> 
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR