Re: [PATCH v3] mm, netfs, fscache: Stop read optimisation when folio removed from pagecache

David Howells <dhowells@xxxxxxxxxx> · Wed, 23 Nov 2022 20:03:03 +0000

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> But I also think it's strange in another way, with that odd placement of
> 
>         mapping_clear_release_always(inode->i_mapping);
> 
> at inode eviction time. That just feels very random.

I was under the impression that a warning got splashed if unexpected
address_space flags were still set when ->evict_inode() returned, but I may be
thinking of page flags.  If there's no such warning, fine, the clear isn't
required.

> Similarly, that change to shrink_folio_list() looks strange, with the
> nasty folio_needs_release() helper. It seems entirely pointless, with
> the use then being
> 
>                 if (folio_needs_release(folio)) {
>                         if (!filemap_release_folio(folio, sc->gfp_mask))
>                                 goto activate_locked;

Unfortunately, that can't simply be folded down.  The code does something
extra when folio_has_private() is true and filemap_release_folio() succeeds,
but the folio has no mapping:

		 * Rarely, folios can have buffers and no ->mapping.
		 * These are the folios which were not successfully
		 * invalidated in truncate_cleanup_folio().  We try to
		 * drop those buffers here and if that worked, and the
		 * folio is no longer mapped into process address space
		 * (refcount == 1) it can be freed.  Otherwise, leave
		 * the folio on the LRU so it is swappable.

Possibly I could split the if-statement and make it two separate cases:

		/*
		 * If the folio has buffers, try to free the buffer
		 * mappings associated with this folio. If we succeed
		 * we try to free the folio as well.
		 *
		 * We do this even if the folio is dirty.
		 * filemap_release_folio() does not perform I/O, but it
		 * is possible for a folio to have the dirty flag set,
		 * but it is actually clean (all its buffers are clean).
		 * This happens if the buffers were written out directly,
		 * with submit_bh(). ext3 will do this, as well as
		 * the blockdev mapping.  filemap_release_folio() will
		 * discover that cleanness and will drop the buffers
		 * and mark the folio clean - it can be freed.
		 */
		if (!filemap_release_folio(folio, sc->gfp_mask))
			goto activate_locked;

filemap_release_folio() returns true if folio_has_private() is false, so we
would fall through to the next part, which we would then skip because its
folio_has_private() test fails:

		/*
		 * Rarely, folios can have buffers and no ->mapping.
		 * These are the folios which were not successfully
		 * invalidated in truncate_cleanup_folio().  We try to
		 * drop those buffers here and if that worked, and the
		 * folio is no longer mapped into process address space
		 * (refcount == 1) it can be freed.  Otherwise, leave
		 * the folio on the LRU so it is swappable.
		 */
		if (!mapping && folio_has_private(folio) &&
		    folio_ref_count(folio) == 1) {
			folio_unlock(folio);
			if (folio_put_testzero(folio))
				goto free_it;
			 /*
			  * rare race with speculative reference.
			  * the speculative reference will free
			  * this folio shortly, so we may
			  * increment nr_reclaimed here (and
			  * leave it off the LRU).
			  */
			nr_reclaimed += nr_pages;
			continue;
		}

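But that will malfunction if try_to_free_buffers(), as called from
filemap_release_folio(), manages to clear the private bits: the
folio_has_private() test in the second part then fails and the folio doesn't
get freed by this path.  I wonder if it might be possible to fold this bit
into filemap_release_folio() somehow.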

I really need a three-state return from filemap_release_folio() - maybe:

	0	couldn't release
	1	released
	2	there was no private

The ordinary "if (filemap_release_folio()) { ... }" would work as expected.
shrink_folio_list() could do something different between case 1 and case 2.
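
To make the three-state idea concrete, here is a rough userspace model of it.
Nothing here is kernel API: the enum names, struct mock_folio and the mock_*()
functions are hypothetical stand-ins, and the "busy" flag is just an assumed
way of making the release attempt fail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical return codes; the names are illustrative, not kernel API. */
enum release_result {
	RELEASE_FAILED     = 0,	/* couldn't release */
	RELEASE_DONE       = 1,	/* released */
	RELEASE_NO_PRIVATE = 2,	/* there was no private */
};

/* Hypothetical outcomes for one pass over a folio in the reclaim loop. */
enum reclaim_action { ACTIVATE, FREE_ORPHAN, CONTINUE_RECLAIM };

/* Minimal stand-in for struct folio: only the state this sketch needs. */
struct mock_folio {
	bool has_private;	/* models folio_has_private() */
	bool buffers_busy;	/* assumption: if set, the release fails */
};

/* Models a three-state filemap_release_folio(). */
static enum release_result mock_release_folio(struct mock_folio *folio)
{
	if (!folio->has_private)
		return RELEASE_NO_PRIVATE;
	if (folio->buffers_busy)
		return RELEASE_FAILED;
	folio->has_private = false;	/* private data dropped */
	return RELEASE_DONE;
}

/* Ordinary caller: any non-zero result still reads as success. */
static bool mock_simple_caller(struct mock_folio *folio)
{
	return mock_release_folio(folio) != RELEASE_FAILED;
}

/*
 * Models the shrink_folio_list() step.  The orphan check is keyed off
 * "released" rather than re-testing the private bits, so it still works
 * after the release has cleared them.
 */
static enum reclaim_action mock_shrink_step(struct mock_folio *folio,
					    bool has_mapping, int refcount)
{
	enum release_result res = mock_release_folio(folio);

	if (res == RELEASE_FAILED)
		return ACTIVATE;		/* goto activate_locked */
	if (res == RELEASE_DONE && !has_mapping && refcount == 1)
		return FREE_ORPHAN;		/* the "no ->mapping" case */
	return CONTINUE_RECLAIM;
}
```

In this model a folio that had private data, no mapping and a refcount of 1
takes the orphan path even though its private flag has been cleared by then,
whereas a folio that never had private data skips it.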

> And the change to mm/filemap.c is completely unacceptable in all
> forms, and this added test
> 
> +       if ((!mapping || !mapping_release_always(mapping)) &&
> +           !folio_test_private(folio) &&
> +           !folio_test_private_2(folio))
> +               return true;
> 
> will not be accepted even during the merge window. That code makes no
> sense what-so-ever, and is in no way acceptable.
>
> That code makes no sense what-so-ever. Why isn't it using
> "folio_has_private()"?

It should be, yes.

> Why is this done as an open-coded - and *badly* so - version of
> !folio_needs_release() that you for some reason made private to mm/vmscan.c?

Yeah, in retrospect, I should have put that in mm/internal.h.
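
For reference, a small userspace check that the open-coded test quoted above
is just the negation of a shared helper.  This is only a model: struct
model_folio, struct model_mapping and the model_*() functions are made-up
stand-ins, and it assumes the helper is "folio_has_private() or the mapping
has release-always set", with folio_has_private() covering both private bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Made-up stand-ins for struct address_space and struct folio. */
struct model_mapping {
	bool release_always;	/* models mapping_release_always() */
};

struct model_folio {
	struct model_mapping *mapping;	/* may be NULL */
	bool priv;			/* models folio_test_private() */
	bool priv_2;			/* models folio_test_private_2() */
};

/* Models folio_has_private(): either private bit set. */
static bool model_has_private(const struct model_folio *f)
{
	return f->priv || f->priv_2;
}

/* Assumed shape of the shared helper. */
static bool model_needs_release(const struct model_folio *f)
{
	return model_has_private(f) ||
		(f->mapping && f->mapping->release_always);
}

/* The open-coded early-return condition from the quoted hunk. */
static bool model_open_coded_skip(const struct model_folio *f)
{
	return (!f->mapping || !f->mapping->release_always) &&
		!f->priv && !f->priv_2;
}

/* Walk every combination of mapping state and private bits. */
static int model_count_mismatches(void)
{
	struct model_mapping always = { true }, normal = { false };
	struct model_mapping *mappings[] = { NULL, &normal, &always };
	int mismatches = 0;

	for (size_t m = 0; m < 3; m++) {
		for (int bits = 0; bits < 4; bits++) {
			struct model_folio f = {
				.mapping = mappings[m],
				.priv    = bits & 1,
				.priv_2  = bits & 2,
			};

			if (model_open_coded_skip(&f) != !model_needs_release(&f))
				mismatches++;
		}
	}
	return mismatches;
}
```

Under those assumptions model_count_mismatches() finds no disagreement, so the
mm/filemap.c test could simply be "if (!folio_needs_release(folio)) return
true;" once the helper is visible there.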

David