Re: [PATCH 1/4] ext4: Fix deadlock during page writeback

On Fri 01-07-16 12:53:39, Ted Tso wrote:
> On Fri, Jul 01, 2016 at 11:09:50AM +0200, Jan Kara wrote:
> > But it is not safe - the bio contains pages, those pages have PageWriteback
> > set and if the inode is part of the running transaction,
> > ext4_journal_stop() will wait for transaction commit which will wait for
> > all outstanding writeback on the inode, which will deadlock on those pages
> > which are part of our unsubmitted bio. So the ordering really has to be the
> > way it is...
> 
> So to be clear, the issue is that PageWriteback won't get cleared
> until we potentially do an uninit->init conversion, and this is what
> requires taking a transaction handle leading to the other half of the
> deadlock?

No. It is even simpler:

ext4_writepages(inode == "foobar")
  prepares pages to write, sets PageWriteback
  ...
  mpage_map_and_submit_extent()
    // Writing data past i_size
    if (disksize > EXT4_I(inode)->i_disksize) {
      ...
      err2 = ext4_mark_inode_dirty(handle, inode);
        ext4_mark_iloc_dirty(handle, inode, &iloc);
          ext4_do_update_inode(handle, inode, iloc);
            // First file beyond 2 GB
            if (ei->i_disksize > 0x7fffffffULL) {
              if (!ext4_has_feature_large_file(sb) || ...)
                set_large_file = 1;
            }
            ...
            if (set_large_file) {
              ...
              ext4_handle_sync(handle);
              ...
            }
  ext4_journal_stop()
    jbd2_journal_stop(handle);
      ...
      if (handle->h_sync || ... ) {
        ...
        if (handle->h_sync && !(current->flags & PF_MEMALLOC))
          wait_for_commit = 1;
      }
      ...
      if (wait_for_commit)
        err = jbd2_log_wait_commit(journal, tid);

So we are waiting for the transaction commit to finish while holding
unsubmitted pages that already have PageWriteback set (and potentially also
other locked pages that we have not prepared for writing because the block
mapping we got was too short). JBD2 then goes on to do the transaction
commit:

jbd2_journal_commit_transaction()
  ...
  journal_finish_inode_data_buffers()
    list_for_each_entry(jinode, &commit_transaction->t_inode_list, i_list) {
      ...
      err = filemap_fdatawait(jinode->i_vfs_inode->i_mapping);
      // And when inode "foobar" is part of this transaction's inode list, this
      // call is going to wait for PageWriteback bits on all the pages of
      // the inode to get cleared - which never happens because the IO was
      // not even submitted for them. The bio is just sitting prepared in
      // mpd.io_submit in ext4_writepages() and would be submitted once
      // ext4_journal_stop() completes.

Hope it is clearer now.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR


