Re: [PATCH v4 02/34] ext4: check the extent status again before inserting delalloc block

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


On Wed, Apr 10, 2024 at 10:29:16PM +0800, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@xxxxxxxxxx>
> Now we lookup extent status entry without holding the i_data_sem before
> inserting delalloc block, it works fine in buffered write path and
> because it holds i_rwsem and folio lock, and the mmap path holds folio
> lock, so the found extent locklessly couldn't be modified concurrently.
> But it could be raced by fallocate since it allocate block whitout
> holding i_rwsem and folio lock.
> ext4_page_mkwrite()             ext4_fallocate()
>  block_page_mkwrite()
>   ext4_da_map_blocks()
>    //find hole in extent status tree
>                                  ext4_alloc_file_blocks()
>                                   ext4_map_blocks()
>                                    //allocate block and unwritten extent
>    ext4_insert_delayed_block()
>     ext4_da_reserve_space()
>      //reserve one more block
>     ext4_es_insert_delayed_block()
>      //drop unwritten extent and add delayed extent by mistake

Shouldn't this be serialised by the file invalidation lock?  Hole
punching via fallocate must do this to avoid data use-after-free
bugs w.r.t racing page faults and all the other fallocate ops need
to serialise page faults to avoid page cache level data corruption.
Yet here we see a problem resulting from a fallocate operation
racing with a page fault....

Ah, I see that the invalidation lock is only picked up deep inside
ext4_punch_hole(), ext4_collapse_range(), ext4_insert_range() and
ext4_zero_range(). They all do the same flush, lock, and dio wait
preamble but each do it just a little bit differently. The allocation path does
it just a little bit differently again and does not take the
invalidate lock...

Perhaps the ext4 fallocate code should be factored so that all the
fallocate operations run the same flush, lock and wait code rather
than having 5 slightly different copies of the same code?


Dave Chinner

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux