Re: [PATCHv7 3/6] iomap: Refactor some iop related accessor functions

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Tue, 6 Jun 2023 17:29:48 +0100

On Tue, Jun 06, 2023 at 09:03:17AM -0700, Darrick J. Wong wrote:
> On Tue, Jun 06, 2023 at 05:21:32AM +0530, Ritesh Harjani wrote:
> > So, I do have a confusion in __folio_mark_dirty() function...
> > 
> > i.e. __folio_mark_dirty checks whether folio->mapping is not NULL.
> > That means for marking range of blocks dirty within iop from
> > ->dirty_folio(), we can't use folio->mapping->host is it?
> > We have to use inode from mapping->host (mapping is passed as a
> > parameter in ->dirty_folio).

It probably helps to read the commentary above filemap_dirty_folio().

 * The caller must ensure this doesn't race with truncation.  Most will
 * simply hold the folio lock, but e.g. zap_pte_range() calls with the
 * folio mapped and the pte lock held, which also locks out truncation.

But __folio_mark_dirty() can't rely on that!  Again, see the commentary:

 * This can also be called from mark_buffer_dirty(), which I
 * cannot prove is always protected against truncate.

iomap doesn't do bottom-up dirtying, only top-down.  So it absolutely
can rely on the VFS having taken the appropriate locks.

> Ah, yeah.  folio->mapping can become NULL if truncate races with us in
> removing the folio from the foliocache.
> 
> For regular reads and writes this is a nonissue because those paths all
> take i_rwsem and will block truncate.  However, for page_mkwrite, xfs
> doesn't take mmap_invalidate_lock until after the vm_fault has been
> given a folio to play with.

invalidate_lock isn't needed here.  You take the folio_lock, then you
call folio_mkwrite_check_truncate() to make sure it wasn't truncated
before you took the folio_lock.  Truncation will block on the folio_lock,
so you're good unless you release the folio_lock (which you don't,
you return it to the MM locked).
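
To make the ordering concrete, here is a minimal userspace model of that
sequence. It is only a sketch: every model_* name is a hypothetical
stand-in invented for illustration, not a kernel definition, and the
check is only loosely modelled on what folio_mkwrite_check_truncate()
does (a NULL mapping or a folio wholly past EOF fails with -EFAULT).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the kernel structures. */
struct model_inode { long long size; };          /* i_size in bytes */
struct model_mapping { struct model_inode *host; };
struct model_folio {
	struct model_mapping *mapping; /* NULL once truncate removed the folio */
	long long pos;                 /* byte offset of the folio in the file */
	size_t len;                    /* folio size in bytes */
	bool locked;                   /* models the folio lock */
};

enum model_fault { MODEL_FAULT_LOCKED, MODEL_FAULT_NOPAGE };

static void model_folio_lock(struct model_folio *f)   { f->locked = true; }
static void model_folio_unlock(struct model_folio *f) { f->locked = false; }

/*
 * Loosely modelled on folio_mkwrite_check_truncate(): returns the number
 * of bytes of the folio that lie inside EOF, or -EFAULT if the folio was
 * truncated.  The caller must already hold the (modelled) folio lock.
 */
static long model_mkwrite_check_truncate(const struct model_folio *f,
					 const struct model_inode *inode)
{
	assert(f->locked);

	if (!f->mapping)
		return -EFAULT;              /* removed from the page cache */
	if (f->pos >= inode->size)
		return -EFAULT;              /* wholly past EOF */
	if (f->pos + (long long)f->len <= inode->size)
		return (long)f->len;         /* wholly inside EOF */
	return (long)(inode->size - f->pos); /* partially inside EOF */
}

/*
 * The ordering described above: lock first, then check for truncation.
 * On success the folio is handed back still locked, so truncate (which
 * would block on the lock) cannot slip in before the caller is done.
 */
static enum model_fault model_page_mkwrite(struct model_folio *f,
					   struct model_inode *inode)
{
	model_folio_lock(f);
	if (model_mkwrite_check_truncate(f, inode) < 0) {
		model_folio_unlock(f);
		return MODEL_FAULT_NOPAGE;
	}
	/* Dirtying the blocks would go here; the mapping is stable. */
	return MODEL_FAULT_LOCKED;
}
```

The point of the model is that the inode is passed in rather than being
fetched through the folio, and that the folio's mapping pointer is only
examined after the lock is held.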