Hi Ted,

On Fri, Aug 26, 2011 at 8:52 AM, Ted Ts'o <tytso@xxxxxxx> wrote:
> On Fri, Aug 26, 2011 at 05:27:39PM +0800, Tao Ma wrote:
>> No, it doesn't mean the ext4_truncate. But another race pasted below.
>>
>> Flush inode's i_completed_io_list before calling ext4_io_wait to
>> prevent the following deadlock scenario: A page fault happens while
>> some process is writing inode A. During page fault,
>> shrink_icache_memory is called that in turn evicts another inode
>> B. Inode B has some pending io_end work so it calls ext4_ioend_wait()
>> that waits for inode B's i_ioend_count to become zero. However, inode
>> B's ioend work was queued behind some of inode A's ioend work on the
>> same cpu's ext4-dio-unwritten workqueue. As the ext4-dio-unwritten
>> thread on that cpu is processing inode A's ioend work, it tries to
>> grab inode A's i_mutex lock. Since the i_mutex lock of inode A is
>> still hold before the page fault happened, we enter a deadlock.
>
> ... but that shouldn't be a problem since we're not holding A's
> i_mutex at this point, right? Or am I missing something?

I think it is possible that we are holding A's i_mutex if the page
fault happens while we are writing inode A. The problem is that if we
call ext4_evict_inode() for inode B during page fault handling and
just call ext4_ioend_wait() to wait for inode B's i_ioend_count to
become zero, we rely on the ext4-dio-unwritten worker thread to finish
any queued work at some point. But as mentioned in the commit log, B's
io_end work may be queued after A's work on the same cpu. Since A's
i_mutex may still be held at the time of the page fault, the
ext4-dio-unwritten worker thread can't make progress.

Thinking about an alternative approach to resolve the deadlock
described above: maybe we can use mutex_trylock() in
ext4_end_io_work() and, if we can't grab the mutex for an inode, just
requeue the work at the end of the workqueue?
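To make the trylock-and-requeue idea concrete, here is a rough
user-space sketch using pthreads. It is not ext4 code and does not use
the kernel workqueue API: struct fake_inode, struct work_item,
struct work_queue, queue_push(), queue_pop() and process_one() are all
hypothetical names invented for illustration, and the single FIFO only
stands in for one cpu's ext4-dio-unwritten workqueue.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode: just the lock and a counter. */
struct fake_inode {
	pthread_mutex_t i_mutex;
	int completed;		/* number of end-io items processed */
};

/* Hypothetical stand-in for one queued io_end work item. */
struct work_item {
	struct fake_inode *inode;
	struct work_item *next;
};

/* Simple FIFO standing in for one per-cpu workqueue. */
struct work_queue {
	struct work_item *head;
	struct work_item *tail;
};

/* Append an item at the tail of the queue. */
static void queue_push(struct work_queue *q, struct work_item *w)
{
	w->next = NULL;
	if (q->tail)
		q->tail->next = w;
	else
		q->head = w;
	q->tail = w;
}

/* Remove and return the item at the head, or NULL if the queue is empty. */
static struct work_item *queue_pop(struct work_queue *q)
{
	struct work_item *w = q->head;

	if (!w)
		return NULL;
	q->head = w->next;
	if (!q->head)
		q->tail = NULL;
	w->next = NULL;
	return w;
}

/*
 * Handle the item at the head of the queue.
 * Returns 1 if it was processed, 0 if the inode's lock was busy and the
 * item was moved to the tail instead, and -1 if the queue was empty.
 */
static int process_one(struct work_queue *q)
{
	struct work_item *w = queue_pop(q);

	if (!w)
		return -1;

	/* Do not block on the lock: a busy lock means "try again later". */
	if (pthread_mutex_trylock(&w->inode->i_mutex) != 0) {
		queue_push(q, w);
		return 0;
	}

	w->inode->completed++;	/* placeholder for the real end-io work */
	pthread_mutex_unlock(&w->inode->i_mutex);
	return 1;
}
```

With A's work queued ahead of B's and A's lock held, the first
process_one() call moves A's item to the tail and the second completes
B's item, so a waiter on B is no longer stuck behind A. One thing the
sketch does not address: if A's item is the only one left, the worker
would keep requeueing it in a tight loop until the lock is released,
so a real version would probably need some form of delay or backoff.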
Jiaying

>
> - Ted