Re: Can a metadata buffer end up in journal_unmap_buffer?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



  Hello,

On Thu 11-08-11 09:32:22, Josef Bacik wrote:
> I have this weird bug that has been plaguing me for a while where
> t_outstanding_credits will end up less than t_nr_buffers.  I have done
> all sorts of things to try and catch when it happens but nothing seems
> to catch it.  At some point I had thought that we were screwing up in
> journal_unmap_buffer.  If a buffer is not on a transaction but is still
> a part of a checkpoint we will do a journal_file_buffer() onto the
> current running transaction's forget list.  The thing is we can still
> have b_modified set since we only clear it on
> do_get_write_access/journal_get_create_access if it isn't a part of the
> transaction yet.  So if we do the journal_file_buffer() before anybody
> calls do_get_write_access/journal_get_create_access we will short
> circuit these checks and b_modified will never be cleared and so when we
> do journal_dirty_metadata we won't account for the new buffer and it
> will end up inc'ing t_nr_buffers but not t_outstanding_credits.
  Good spotting!

> I had thought this was the problem before and put in a jh->b_modified =
> 0 in __dispose_buffer, but apparently the problem still happened.  But
> that support person/customer were not entirely reliable so I'm back to
> thinking this is what happened and they just didn't run with my patch.
  Umm, I think there's one more way how buffer b_modified == 1 can get
to other transaction's forget list. In journal_unmap_buffer(), transaction
== journal->j_committing_transaction case we do set_buffer_freed() and
set b_next_transaction to the running transaction. So when the currently
committing transaction finishes, it refiles the buffer to BJ_Forget list
of the running transaction. b_modified handling seems to be really fragile
in this regard. I guess the rule is that whenever we are going to change
b_transaction or b_next_transaction, we should clear b_modified.

> The problem is I don't think we can call journal_unmap_buffer() on just
> a normal metadata block (that is with data=ordered), it only gets called
> by ext3_invalidatepage() which is only called on pages on the inodes
> address space, so not metadata.  However, Jan had a patch to delay the
> free'ing of buffers for orphan reasons, with commit
> 
> 86963918965eb8fe0c8ae009e7c1b4c630f533d5
> 
> which makes it seem like metadata can come through
> journal_unmap_buffer()?  Does anybody know for sure one way or the
> other?  And if you happen to have a theory on the actual problem itself
> I would _love_ to hear it :).  Thanks,
  It can happen either in data=journal mode or for long symlinks. BTW I
think Kyle Moffet was probably hitting this bug with ext4
(http://lists.debian.org/debian-kernel/2011/04/msg00145.html) and I was
also trying to find the culprit for some time without success so I'm glad
you found it in the end ;).

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux