On Tue, Jul 16, 2013 at 05:46:58PM +0200, Jan Kara wrote: > and ext4_setattr() does (again under i_mutex): > ext4_inode_block_unlocked_dio(inode); > inode_dio_wait(inode); > ext4_inode_resume_unlocked_dio(inode); Ah, I missed this; thanks for pointing this out. > So either DIO gets i_mutex first and then ext4_setattr() waits for it to > complete, or truncate completes before unlocked DIO is able to get & drop > i_mutex. > > OTOH unlocked DIO *read* might be vulnerable to a race with truncate. That > never acquires i_mutex so if the DIO read arrives after ext4_setattr() goes > through inode_dio_wait(), we can have the read and truncate racing and read > possibly submitting read of a just truncated block (which can get > reallocated in theory while the IO is running). > > So something like what you do in the patch is likely needed, just the > justification is somewhat different and you should also rip out / adjust > the other synchronizations we have in ext4_setattr(), ext4_ext_direct_IO() > and ext4_ind_direct_IO(). Ok, I'll drop my current patch for now, and revisit this when I have a bit more time. I agree with your analysis, but fortunately it sounds like this race is going to be pretty hard to hit in practice --- especially, since with the journal enabled, we won't allow the block to get reused until the next commit boundary. The situation where we would need to worry would be dioread_nolock combined with no journal mode. - Ted -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html