[RFC] call end_page_writeback after converting unwritten extents in ext4_end_io

Hi all,

I am now trying to handle AIO DIO with O_SYNC using the extent status tree in
ext4.  After applying Christoph's patch series, O_SYNC semantics in ext4 will be
broken.  This problem can be fixed using the extent status tree, but then we get
a deadlock because i_mutex is taken in ext4_sync_file(), which then waits on
i_unwritten==0.  So let's consider what happens after applying Christoph's
patches and using the extent status tree to ensure AIO DIO with O_SYNC
semantics.

  ext4_ext_direct_IO:              ext4_ind_direct_IO:
                                   ->ext4_file_write()
                                     ->mutex_lock(i_mutex)
                                       ->ext4_ind_direct_IO()
                                         [if this is an append dio]
                                     ->mutex_unlock(i_mutex)
  ->ext4_file_write()
    ->mutex_lock(i_mutex)
    ->ext4_ext_direct_IO()
    ->mutex_unlock(i_mutex)
                                     ->generic_write_sync()
                                       ->ext4_sync_file()
                                         ->mutex_lock(i_mutex)
                                         ->ext4_flush_unwritten_io()
                                           ->ext4_do_flush_complete_IO()
                                             [the list is empty]
                                           ->ext4_unwritten_wait()
                                             [wait on i_unwritten==0 because
                                              in ext4_ext_direct_IO i_unwritten
                                              has been increased]
  kworkd:
  ->dio_complete()
    ->ext4_end_dio()
      ->ext4_es_convert_unwritten_extents()
        [convert unwritten extents in status
         tree to ensure O_SYNC semantics]
      ->ext4_add_complete_io()
    ->generic_write_sync()
      ->ext4_sync_file()
        ->mutex_lock(i_mutex)
          [*DEADLOCK*]
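
To make the cycle in the trace above concrete, here is a minimal userspace
sketch in C.  It is not kernel code: model_mutex, model_inode and all helper
functions are hypothetical stand-ins for inode->i_mutex and i_unwritten, and
single-threaded "would block" checks replace real sleeping.  It assumes only
what the trace shows: the sync path holds i_mutex while waiting for i_unwritten
to reach zero, and the completion worker needs i_mutex before the count can
drop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a mutex: we only track whether it is held. */
struct model_mutex { bool locked; };

/* Hypothetical model of the two pieces of inode state in the trace. */
struct model_inode {
	struct model_mutex i_mutex; /* stands in for inode->i_mutex */
	int i_unwritten;            /* DIOs whose extents are not yet converted */
};

/* Returns true if the lock was taken, false if the caller would block. */
static bool model_trylock(struct model_mutex *m)
{
	if (m->locked)
		return false;
	m->locked = true;
	return true;
}

static void model_unlock(struct model_mutex *m)
{
	assert(m->locked);
	m->locked = false;
}

/* Sync path, called with i_mutex held: would it sleep on i_unwritten==0? */
static bool sync_would_wait(const struct model_inode *inode)
{
	return inode->i_unwritten != 0;
}

/* Completion worker: needs i_mutex before it can drop i_unwritten. */
static bool worker_can_complete(struct model_inode *inode)
{
	if (!model_trylock(&inode->i_mutex))
		return false; /* would block in mutex_lock(i_mutex) */
	inode->i_unwritten--;
	model_unlock(&inode->i_mutex);
	return true;
}

/* Replays the trace: returns true if both sides end up blocked. */
static bool model_trace_deadlocks(void)
{
	struct model_inode inode = { .i_mutex = { false }, .i_unwritten = 0 };

	inode.i_unwritten++;                         /* DIO submitted */
	bool locked = model_trylock(&inode.i_mutex); /* ext4_sync_file() */
	assert(locked);

	bool syncer_blocked = sync_would_wait(&inode);
	bool worker_blocked = !worker_can_complete(&inode);
	return syncer_blocked && worker_blocked;
}
```

In this model the syncer cannot proceed until the worker runs, and the worker
cannot run until the syncer drops i_mutex, which is the cycle marked
[*DEADLOCK*] above.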

Thus all we need to do is not wait on i_unwritten==0.  But, as commit c278531d
describes, there is then a time window in which integrity is broken.  So we need
to call end_page_writeback() after converting unwritten extents in
ext4_end_io().  However, if we call end_page_writeback() after the conversion
has been done in ext4_end_io(), we get another deadlock, because
ext4_convert_unwritten_extents() needs to start a journal handle, and that can
trigger a journal commit.  If ext4_write_begin() is called at that time, it also
starts a journal handle and then waits on writeback in
grab_cache_page_write_begin().
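
The second deadlock can be written down as a wait-for graph.  The sketch below
is a userspace illustration only; the actor names and has_wait_cycle() are
hypothetical.  It also assumes, beyond what is stated above, that a journal
commit has to wait for handles that are already running (here, the one held by
ext4_write_begin()).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three parties in the second deadlock. */
enum actor { END_IO, JOURNAL_COMMIT, WRITE_BEGIN, NR_ACTORS };
#define NOBODY (-1)

/* waits_for[a] is the actor that a is blocked on, or NOBODY. */
static bool has_wait_cycle(const int waits_for[NR_ACTORS], int start)
{
	int cur = start;

	assert(start >= 0 && start < NR_ACTORS);
	for (int step = 0; step < NR_ACTORS; step++) {
		cur = waits_for[cur];
		if (cur == NOBODY)
			return false;
		if (cur == start)
			return true;
	}
	return false;
}

static bool writeback_after_conversion_deadlocks(void)
{
	int waits_for[NR_ACTORS];

	/* Conversion starts a handle, which may have to wait for a commit. */
	waits_for[END_IO] = JOURNAL_COMMIT;
	/* Assumption: the commit waits for handles that are already running. */
	waits_for[JOURNAL_COMMIT] = WRITE_BEGIN;
	/* write_begin holds a handle and waits for the page's writeback bit,
	 * which is only cleared once the conversion in END_IO has finished. */
	waits_for[WRITE_BEGIN] = END_IO;

	return has_wait_cycle(waits_for, END_IO);
}
```

If the END_IO edge is removed, that is, if the conversion never has to wait
for a commit, has_wait_cycle() no longer finds a cycle in this model.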

Now I have an idea to solve this problem.  We start a journal handle before
submitting an io request rather than starting it in
ext4_convert_unwritten_extents().  The reason for starting a journal handle in
ext4_convert_unwritten_extents() is that we need to calculate credits for the
journal.  But as far as I understand, the credits do not increase in this
function because we have already split the extents before submitting the io
request.  A 'handle_t *handle' will be added to ext4_io_end_t, and it will be
used in ext4_convert_unwritten_extents().  Then we can avoid triggering a
journal commit when starting a journal handle.
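
A rough userspace sketch of the idea follows.  Nothing here is the real ext4
or jbd2 API: model_handle, model_journal and model_io_end are hypothetical
stand-ins for handle_t, the journal and ext4_io_end_t, and the credit count is
whatever the caller passes in.  The only point is the ordering: the handle is
started on the submission path and stashed in the io_end, so the completion
path starts nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for jbd2's handle_t. */
struct model_handle { int credits; bool running; };

/* Hypothetical journal that records when handles are started. */
struct model_journal {
	bool in_completion;       /* true while the end_io path runs */
	int starts_in_completion; /* handle starts that happened there */
};

/* Hypothetical stand-in for ext4_io_end_t with the proposed new field. */
struct model_io_end {
	struct model_handle *handle; /* started before the I/O is submitted */
	bool converted;              /* unwritten extents converted */
};

static void model_journal_start(struct model_journal *j,
				struct model_handle *h, int credits)
{
	if (j->in_completion)
		j->starts_in_completion++; /* this is what could force a commit */
	h->credits = credits;
	h->running = true;
}

static void model_journal_stop(struct model_handle *h)
{
	assert(h->running);
	h->running = false;
}

/* Submission path: start the handle up front and stash it in the io_end. */
static void model_submit_io(struct model_journal *j, struct model_io_end *io,
			    struct model_handle *h, int credits)
{
	model_journal_start(j, h, credits);
	io->handle = h;
	io->converted = false;
}

/* Completion path: convert using the stashed handle, start nothing new. */
static void model_end_io(struct model_journal *j, struct model_io_end *io)
{
	j->in_completion = true;
	assert(io->handle != NULL && io->handle->running);
	io->converted = true; /* stands in for the extent conversion */
	model_journal_stop(io->handle);
	io->handle = NULL;
	j->in_completion = false;
}
```

After model_submit_io() and model_end_io() have run for one request,
starts_in_completion stays at zero, which is the property the proposal relies
on.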

I hope my description is clear.  Any comments or feedback are always welcome.


Jan, I don't know whether you have begun trying to fix this problem.  If there
is an update, please let me know.

Thanks,
						- Zheng

