On Thu 23-11-23 15:17:28, Ritesh Harjani wrote: > Jan Kara <jack@xxxxxxx> writes: > > > On Thu 23-11-23 12:37:03, Ritesh Harjani wrote: > >> Jan Kara <jack@xxxxxxx> writes: > >> > >> > The syzbot has reported that it can hit the warning in > >> > ext4_dio_write_end_io() because i_size < i_disksize. Indeed the > >> > reproducer creates a race between DIO IO completion and truncate > >> > expanding the file and thus ext4_dio_write_end_io() sees an inconsistent > >> > inode state where i_disksize is already updated but i_size is not > >> > updated yet. Since we are careful when setting up DIO write and consider > >> > it extending (and thus performing the IO synchronously with i_rwsem held > >> > exclusively) whenever it goes past either of i_size or i_disksize, we > >> > can use the same test during IO completion without risking entering > >> > ext4_handle_inode_extension() without i_rwsem held. This way we make it > >> > obvious both i_size and i_disksize are large enough when we report DIO > >> > completion without relying on unreliable WARN_ON. > >> > >> Does it make sense to add this in ext4_handle_inode_extension()? > >> WARN_ON_ONCE(!inode_is_locked(inode)); > >> Ohk, we already have "lockdep_assert_held_write(&inode->i_rwsem)" so > >> hopefully it can catch via lockdep. > > > > Exactly. > > > >> So, IIUC, the WARN happened when we were doing a non-extending > >> AIO-DIO write which was racing with truncate trying to expand the file > >> size. Because only then the DIO completion will not have i_rwsem held > >> which can race with truncate. Truncate since it is expanding the file > >> size, will not use inode_dio_wait() (since no block allocations). > >> > >> Is this understanding correct? > > > > Yes, correct. > > Thanks Jan, > > Also ext4_inode_extension_cleanup() function can take care of deleting > the inode from the orphan list in case if there is a race with truncate > which extended made both i_disksize and inode->i_size and the DIO > completion couldn't call ext4_handle_inode_extension(), right? > > In that case, does it make sense to update a comment here too? > > @@ -350,7 +350,10 @@ static void ext4_inode_extension_cleanup(struct inode *inode, ssize_t count) > } > /* > * If i_disksize got extended due to writeback of delalloc blocks while > - * the DIO was running we could fail to cleanup the orphan list in > + * the DIO was running, or > + * If i_disksize and inode->i_size both got extened during truncate > + * which raced with DIO completion, > + * In both such cases, we could fail to cleanup the orphan list in > * ext4_handle_inode_extension(). Do it now. > */ > if (!list_empty(&EXT4_I(inode)->i_orphan) && inode->i_nlink) { Good point. Expanded comment in this way. I'll send v2 shortly. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR