On Mon, Jan 29, 2024 at 10:11:24AM +0100, Jan Kara wrote: > On Thu 25-01-24 14:06:58, Matthew Wilcox wrote: > > On Thu, Jan 25, 2024 at 01:09:46PM +0000, Roman Smirnov wrote: > > > Syzkaller reports warning in ext4_set_page_dirty() in 5.10 and 5.15 > > > stable releases. It happens because invalidate_inode_page() frees pages > > > that are needed for the system. To fix this we need to add additional > > > checks to the function. page_mapped() checks if a page exists in the > > > page tables, but this is not enough. The page can be used in other places: > > > https://elixir.bootlin.com/linux/v6.8-rc1/source/include/linux/page_ref.h#L71 > > > > > > Kernel outputs an error line related to direct I/O: > > > https://syzkaller.appspot.com/text?tag=CrashLog&x=14ab52dac80000 > > > > OK, this is making a lot more sense. > > > > The invalidate_inode_page() path (after the page_mapped check) calls > > try_to_release_page() which strips the buffers from the page. > > __remove_mapping() tries to freeze the page and presuambly fails. > > Yep, likely. > > > ext4 is checking there are still buffer heads attached to the page. > > I'm not sure why it's doing that; it's legitimate to strip the > > bufferheads from a page and then reattach them later (if they're > > attached to a dirty page, they are created dirty). > > Well, we really need to track dirtiness on per fs-block basis in ext4 > (which makes a difference when blocksize < page size). For example for > delayed block allocation we reserve exactly as many blocks as we need > (which need not be all the blocks in the page e.g. when writing just one > block in the middle of a large hole). So when all buffers would be marked > as dirty we would overrun our reservation. Hence at the moment of dirtying > we really need buffers to be attached to the page and stay there until the > page is written back. Thanks for the clear explanation! Isn't the correct place to ensure that this is true in ext4_release_folio()? I think all paths to remove buffer_heads from a folio go through ext4_release_folio() and so it can be prohibited here if the folio is part of a delalloc extent? I worry that the proposed fix here cuts off only one path to hitting this WARN_ON and we need a more comprehensive fix.