Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?

Kent Overstreet <kent.overstreet@xxxxxxxxx> · Thu, 16 Sep 2021 16:16:27 -0400

On Thu, Sep 16, 2021 at 03:15:29PM -0400, Theodore Ts'o wrote:
> On Thu, Sep 16, 2021 at 01:11:21PM -0400, James Bottomley wrote:
> > 
> > Actually, I don't see who should ack being an unknown.  The MAINTAINERS
> > file covers most of the kernel and a set of scripts will tell you based
> > on your code who the maintainers are ... that would seem to be the
> > definitive ack list.
> 
> It's *really* not that simple.  It is *not* the case that if a change
> touches a single line of fs/ext4 (as well as 60+ other filesystems),
> for example:
> 
> -       ei = kmem_cache_alloc(ext4_inode_cachep, GFP_NOFS);
> +       ei = alloc_inode_sb(sb, ext4_inode_cachep, GFP_NOFS);
> 
> that the submitter *must* get a ACK from me --- or that I am entitled
> to NACK the entire 79 patch series for any reason I feel like, or to
> withhold my ACK as hostage until the submitter does some development
> work that I want.
> 
> What typically happens is if someone were to try to play games like
> this inside, say, the Networking subsystem, past a certain point,
> David Miller will just take the patch series, ignoring people who have
> NACK's down if they can't be justified.  The difference is that even
> though Andrew Morton (the titular maintainer for all of Memory
> Management, per the MAINTAINERS file), Andrew seems to have a much
> lighter touch on how the mm subsystem is run.
> 
> > I think the problem is the ack list for features covering large areas
> > is large and the problems come when the acker's don't agree ... some
> > like it, some don't.  The only deadlock breaking mechanism we have for
> > this is either Linus yelling at everyone or something happening to get
> > everyone into alignment (like an MM summit meeting).  Our current model
> > seems to be every acker has a foot on the brake, which means a single
> > nack can derail the process.  It gets even worse if you get a couple of
> > nacks each requesting mutually conflicting things.
> > 
> > We also have this other problem of subsystems not being entirely
> > collaborative.  If one subsystem really likes it and another doesn't,
> > there's a fear in the maintainers of simply being overridden by the
> > pull request going through the liking subsystem's tree.  This could be
> > seen as a deadlock breaking mechanism, but fear of this happening
> > drives overreactions.
> > 
> > We could definitely do a clear definition of who is allowed to nack and
> > when can that be overridden.
> 
> Well, yes.  And this is why I think there is a process issue here that
> *is* within the MAINTAINERS SUMMIT purview, and if we need to
> technical BOF to settle the specific question of what needs to happen,
> whether it happens at LPC, or it needs to happen after LPC, then let's
> have it happen.

I would love to see us putting our energy into trying to have more productive
design discussions instead of getting more rules-based. If someone feels
strongly enough to NACK a patch series, usually that's an indication of a
breakdown in communications and it means we need to put more effort into
figuring out what the real disagreement is. It's not like people usually NACK
things just to be petty - and if they are, that becomes apparent when we try to
communicate with them to find out what the disagreement is and they don't respond
with the same effort.

And if people aren't being petty and are making a genuine effort to communicate
well and we're still not reaching a consensus - that does happen and there most
definitely are times when we just have differences of opinion and technical
judgement, and the maintainer will have to come to a decision. But before that
happens, we should make sure we've actually had a productive, effective
discussion and figured out what those concerns and differences of opinion are,
so that the maintainer can make an _informed_ decision.

> I'd be really disappointed if we have to wait until December 2022 for
> the next LSF/MM, and if we don't get consensus there, ala DAX, that we
> then have to wait until late 2023, etc.  As others have said, this is
> holding up some work that file system developers would really like to
> see.

So I think we're still trying to answer the "what exactly is a folio" question.
As I see it, there are two potential approaches:

 - The minimalist approach, where folios are just pagecache pages

 - The maximalist approach, where folios are also anonymous pages. Potentially
   all pages that could be mapped into userspace would be folios, possibly with
   some work to unify weird driver things.

Network pages and slab pages aren't folios - they're their own thing. Folios are
also not a replacement for compound pages. Whichever way we go, folios are for
things that can be mapped into userspace.

Also: folios are a start on cutting up the unholy mess that is struct page into
separate data types. In struct page, we have a big nested union of structs, for
different types of pages. As I understand it from perusing the code, Willy has
been basically taking the first struct in the big union-of-structs and (mostly?)
making everything that uses it a folio.
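
To make the "union of structs" point concrete, here is a toy sketch of the
idea. It is not the kernel's real layout: toy_page, toy_folio and the helper
names are all made up for illustration, and the fields are cut down to the
bare minimum. The only thing it tries to show is a separate type that names
just the first struct of the union while overlaying the same memory.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: NOT the kernel's real struct page layout. */
struct toy_page {
	unsigned long flags;
	union {
		struct {			/* pagecache / anonymous use */
			void *mapping;
			unsigned long index;
		};
		struct {			/* slab allocator use */
			void *slab_cache;
			void *freelist;
		};
		struct {			/* network page pool use */
			unsigned long pp_magic;
			void *pp;
		};
	};
};

/*
 * Hypothetical folio type: exposes only the fields of the first struct,
 * and overlays a toy_page so both views share the same storage.
 */
struct toy_folio {
	union {
		struct {
			unsigned long flags;
			void *mapping;
			unsigned long index;
		};
		struct toy_page page;
	};
};

/* The overlay is only sound if the two views line up field for field. */
_Static_assert(offsetof(struct toy_folio, mapping) ==
	       offsetof(struct toy_page, mapping), "mapping must overlay");
_Static_assert(offsetof(struct toy_folio, index) ==
	       offsetof(struct toy_page, index), "index must overlay");
_Static_assert(sizeof(struct toy_folio) == sizeof(struct toy_page),
	       "folio must not be larger than the page it overlays");

/* Hypothetical conversion helper: view a page as a folio. */
static inline struct toy_folio *toy_page_folio(struct toy_page *page)
{
	return (struct toy_folio *)page;
}

/* Code taking a toy_folio cannot reach the slab or network fields by name. */
static inline unsigned long toy_folio_index(const struct toy_folio *folio)
{
	return folio->index;
}
```

The point of the separate type is that a function taking a toy_folio has said
which of the union's structs it means, instead of leaving that implicit.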

I think that is reasonable, because it's basically adding types to describe the
world as it is - I would say that if it leaves things looking like a mess with
confused module boundaries between MM and FS, that's because the code was
already a mess, and while we should certainly work on cleaning that up, those
cleanups shouldn't be done in _this_ giant patch series because that's how you
end up with bugs that you can't bisect.

However, Johannes has been pointing out that it's a real open question as to
whether anonymous pages should be folios! Willy's current code seems to leave
things in a somewhat intermediate state - some mm/ code treats anonymous pages
as folios, but it's not clear to me how much. And I still see a lot of
references to page->mapping; we should be clear on what's happening to those (if
the page is a folio, we should definitely not be referencing page->mapping or
page->index).

So: should anonymous pages be more like file pages? I think that's something
worth exploring, and potentially a lot of code could be unified and deleted with
that approach - a lot of the hugepage/transhuge code is doing similar stuff to
folios, but folios look to be doing it much more cleanly. There's also things like
rmap.c, which is constantly asking "is this page anonymous? is it file?" and doing
different things that look somewhat similar (also KSM, but that's a whole nother
bag of crazy). Johannes thinks that anonymous pages differ too much from file
pages and that trying to unify them would be a mistake - perhaps he's right.
Perhaps we should create a new type analogous to folio for those pages - if all
the current places in the code where we're asking "Is this file? Is this anon?"
really do need to be doing that, then having our types match makes sense.
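
As a rough illustration of what "having our types match" could look like,
here is a toy sketch contrasting the two styles. Everything in it is
hypothetical: the toy_* names, the tag bit, and the wrapper types are made up
for this example and are not taken from Willy's series or from mm/. It assumes
the anon/file distinction is encoded as a tag bit in the mapping word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAPPING_ANON 0x1UL	/* hypothetical tag bit in the mapping word */

struct toy_page {
	uintptr_t mapping;	/* pointer value with the low bit as a type tag */
	unsigned long index;
};

static inline bool toy_page_is_anon(const struct toy_page *page)
{
	return (page->mapping & TOY_MAPPING_ANON) != 0;
}

/* Style 1: one type, and every caller re-asks the question at run time. */
static inline const char *toy_rmap_kind(const struct toy_page *page)
{
	return toy_page_is_anon(page) ? "anon" : "file";
}

/* Style 2: two distinct wrapper types, so the question is asked once. */
struct toy_file_folio { struct toy_page page; };
struct toy_anon_folio { struct toy_page page; };

/* Returns NULL if the page is not anonymous. */
static inline struct toy_anon_folio *toy_page_anon_folio(struct toy_page *page)
{
	return toy_page_is_anon(page) ? (struct toy_anon_folio *)page : NULL;
}

/* Returns NULL if the page is not a file page. */
static inline struct toy_file_folio *toy_page_file_folio(struct toy_page *page)
{
	return toy_page_is_anon(page) ? NULL : (struct toy_file_folio *)page;
}
```

With the second style, code that only makes sense for one kind of page takes
that wrapper type, and passing the other kind becomes a compile-time type
mismatch rather than a run-time branch.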