On Wed 10-04-24 11:41:56, Zhang Yi wrote: > From: Zhang Yi <yi.zhang@xxxxxxxxxx> > > Now we lookup extent status entry without holding the i_data_sem before > inserting delalloc block, it works fine in buffered write path and > because it holds i_rwsem and folio lock, and the mmap path holds folio > lock, so the found extent locklessly couldn't be modified concurrently. > But it could be raced by fallocate since it allocate block whitout > holding i_rwsem and folio lock. > > ext4_page_mkwrite() ext4_fallocate() > block_page_mkwrite() > ext4_da_map_blocks() > //find hole in extent status tree > ext4_alloc_file_blocks() > ext4_map_blocks() > //allocate block and unwritten extent > ext4_insert_delayed_block() > ext4_da_reserve_space() > //reserve one more block > ext4_es_insert_delayed_block() > //drop unwritten extent and add delayed extent by mistake > > Then, the delalloc extent is wrong until writeback, the one more > reserved block can't be release any more and trigger below warning: > > EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared! > > Hold i_data_sem in write mode directly can fix the problem, but it's > expansive, we should keep the lockless check and check the extent again > once we need to add an new delalloc block. > > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Zhang Yi <yi.zhang@xxxxxxxxxx> Looks good. Feel free to add: Reviewed-by: Jan Kara <jack@xxxxxxx> Honza > --- > fs/ext4/inode.c | 19 +++++++++++++++++++ > 1 file changed, 19 insertions(+) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 6a41172c06e1..118b0497a954 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -1737,6 +1737,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock, > if (ext4_es_is_hole(&es)) > goto add_delayed; > > +found: > /* > * Delayed extent could be allocated by fallocate. > * So we need to check it. > @@ -1781,6 +1782,24 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock, > > add_delayed: > down_write(&EXT4_I(inode)->i_data_sem); > + /* > + * Lookup extents tree again under i_data_sem, make sure this > + * inserting delalloc range haven't been delayed or allocated > + * whitout holding i_rwsem and folio lock. > + */ > + if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) { > + if (!ext4_es_is_hole(&es)) { > + up_write(&EXT4_I(inode)->i_data_sem); > + goto found; > + } > + } else if (!ext4_has_inline_data(inode)) { > + retval = ext4_map_query_blocks(NULL, inode, map); > + if (retval) { > + up_write(&EXT4_I(inode)->i_data_sem); > + return retval; > + } > + } > + > retval = ext4_insert_delayed_block(inode, map->m_lblk); > up_write(&EXT4_I(inode)->i_data_sem); > if (retval) > -- > 2.39.2 > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR