Re: [PATCH] ext4: block direct I/O writes during ext4_truncate

"Theodore Ts'o" <tytso@xxxxxxx> · Tue, 16 Jul 2013 12:49:42 -0400

On Tue, Jul 16, 2013 at 05:46:58PM +0200, Jan Kara wrote:
> and ext4_setattr() does (again under i_mutex):
>                                         ext4_inode_block_unlocked_dio(inode);
>                                         inode_dio_wait(inode);
>                                         ext4_inode_resume_unlocked_dio(inode);

Ah, I missed this; thanks for pointing this out.

> So either DIO gets i_mutex first and then ext4_setattr() waits for it to
> complete, or truncate completes before unlocked DIO is able to get & drop
> i_mutex.
> 
> OTOH unlocked DIO *read* might be vulnerable to a race with truncate. That
> never acquires i_mutex so if the DIO read arrives after ext4_setattr() goes
> through inode_dio_wait(), we can have the read and truncate racing and read
> possibly submitting read of a just truncated block (which can get
> reallocated in theory while the IO is running).
> 
> So something like what you do in the patch is likely needed, just the
> justification is somewhat different and you should also rip out / adjust
> the other synchronizations we have in ext4_setattr(), ext4_ext_direct_IO()
> and ext4_ind_direct_IO().

Ok, I'll drop my current patch for now, and revisit this when I have a
bit more time.  I agree with your analysis, but fortunately it sounds
like this race is going to be pretty hard to hit in practice ---
especially, since with the journal enabled, we won't allow the block
to get reused until the next commit boundary.  The situation where we
would need to worry would be dioread_nolock combined with no journal
mode.

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html