Re: RE(2): FW: [LSF/MM/BPF TOPIC] SMDK inspired MM changes for CXL

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Fri, 24 Mar 2023 17:49:00 +0000

On Fri, Mar 24, 2023 at 02:55:02PM +0000, Matthew Wilcox wrote:
> No, that's not true.  You can allocate kernel memory from ZONE_MOVABLE.
> You have to be careful when you do that, but eg filesystems put symlinks
> and directories in ZONE_MOVABLE, and zswap allocates memory from
> ZONE_MOVABLE.  Of course, then you have to be careful that the kernel
> doesn't try to move it while you're accessing it.  That's the tradeoff.

I want to talk a little bit about what it would take to use MOVABLE
allocations for slab.

At first glance, one might presume that it is impossible to have slab
use a movable allocation.  Movable memory usually relies on a relatively
complex reference-counting mechanism: a user takes a reference on the
page, uses it, then puts the reference.  Migration can then check the
page reference and, if the page is unused, it knows it's safe to migrate
(much handwaving here; of course it's more complex).
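
To make the get/use/put pattern concrete, here is a minimal userspace
sketch.  It is not kernel code: struct toy_page, toy_get(), toy_put()
and toy_migrate() are all hypothetical names invented for illustration,
and the sketch ignores the hard part (stopping new references from
being taken while the copy is in flight).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 64

/* Hypothetical stand-in for a page; not the kernel's struct page. */
struct toy_page {
	atomic_int refcount;	/* number of active users */
	unsigned char data[TOY_PAGE_SIZE];
};

/* A user pins the page before touching its contents ... */
static void toy_get(struct toy_page *p)
{
	atomic_fetch_add(&p->refcount, 1);
}

/* ... and unpins it afterwards. */
static void toy_put(struct toy_page *p)
{
	int old = atomic_fetch_sub(&p->refcount, 1);

	assert(old > 0);	/* catch an unbalanced put */
}

/*
 * Migration succeeds only if nobody holds a reference.  A real
 * implementation must also prevent new references from being taken
 * during the copy; that is omitted here.
 */
static bool toy_migrate(struct toy_page *src, struct toy_page *dst)
{
	if (atomic_load(&src->refcount) != 0)
		return false;	/* still in use, try again later */

	memcpy(dst->data, src->data, TOY_PAGE_SIZE);
	atomic_store(&dst->refcount, 0);
	return true;
}
```

While a user sits between toy_get() and toy_put(), toy_migrate()
refuses to move the page; once the reference is dropped, the copy goes
ahead.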

The general case of kmalloc slabs cannot use MOVABLE allocations.
The API has no concept of "this pointer is temporarily not in use",
so we can never migrate any slab which has allocated objects.

But for slab caches, individual objects may have access rules which allow
them to be moved.  For example, we might be able to migrate every dentry
in a slab, then RCU-free the slab.  Similarly for radix_tree_nodes.
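
As a rough illustration of what "migrate every object in a slab" could
look like, here is a userspace sketch with a per-cache migrate
callback.  Everything in it (toy_cache, toy_slab, toy_evacuate() and
friends) is a hypothetical name, not an existing slab API, and it
assumes each object has exactly one referrer that the callback can fix
up.  Locking and the RCU grace period before freeing the old slab are
left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OBJS_PER_SLAB 4

/* Hypothetical object: it records the single pointer that refers to it. */
struct toy_obj {
	int value;
	struct toy_obj **owner;
};

struct toy_slab {
	bool in_use[OBJS_PER_SLAB];
	struct toy_obj objs[OBJS_PER_SLAB];
};

/* Per-cache callback: copy one object and redirect its referrer. */
typedef void (*toy_migrate_fn)(struct toy_obj *src, struct toy_obj *dst);

struct toy_cache {
	toy_migrate_fn migrate;	/* NULL: objects cannot be moved */
};

static void toy_obj_migrate(struct toy_obj *src, struct toy_obj *dst)
{
	*dst = *src;
	*dst->owner = dst;	/* the referrer now sees the new copy */
}

/* Allocate from the first free slot; NULL if the slab is full. */
static struct toy_obj *toy_alloc(struct toy_slab *slab,
				 struct toy_obj **owner, int value)
{
	for (size_t i = 0; i < OBJS_PER_SLAB; i++) {
		if (slab->in_use[i])
			continue;
		slab->in_use[i] = true;
		slab->objs[i].value = value;
		slab->objs[i].owner = owner;
		*owner = &slab->objs[i];
		return *owner;
	}
	return NULL;
}

/*
 * Move every live object from 'src' into free slots of 'dst'.
 * Returns true only if the cache supports migration and every live
 * object was moved, i.e. 'src' could now be freed.
 */
static bool toy_evacuate(struct toy_cache *cache,
			 struct toy_slab *src, struct toy_slab *dst)
{
	if (!cache->migrate)
		return false;	/* e.g. a kmalloc-style cache */

	for (size_t i = 0; i < OBJS_PER_SLAB; i++) {
		size_t j = 0;

		if (!src->in_use[i])
			continue;
		while (j < OBJS_PER_SLAB && dst->in_use[j])
			j++;
		if (j == OBJS_PER_SLAB)
			return false;	/* no room left in dst */
		dst->in_use[j] = true;
		cache->migrate(&src->objs[i], &dst->objs[j]);
		src->in_use[i] = false;
	}
	return true;
}
```

A cache with no migrate callback behaves like the kmalloc case above:
toy_evacuate() gives up immediately and the slab stays pinned for as
long as any object in it is allocated.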

There was some work along these lines a few years ago:
https://lore.kernel.org/all/20190603042637.2018-16-tobin@xxxxxxxxxx/

There are various practical problems with that patchset, but they can
be overcome with sufficient work.  The question is: Why do we need to do
this work?  What is the high-level motivation to make slab caches movable?