Re: [PATCH v9 07/10] mm: Device exclusive memory access

Alistair Popple <apopple@xxxxxxxxxx> · Tue, 25 May 2021 19:21:00 +1000

On Tuesday, 25 May 2021 11:31:17 AM AEST John Hubbard wrote:
> On 5/24/21 3:11 PM, Andrew Morton wrote:
> >> ...
> >> 
> >>   Documentation/vm/hmm.rst     |  17 ++++
> >>   include/linux/mmu_notifier.h |   6 ++
> >>   include/linux/rmap.h         |   4 +
> >>   include/linux/swap.h         |   7 +-
> >>   include/linux/swapops.h      |  44 ++++++++-
> >>   mm/hmm.c                     |   5 +
> >>   mm/memory.c                  | 128 +++++++++++++++++++++++-
> >>   mm/mprotect.c                |   8 ++
> >>   mm/page_vma_mapped.c         |   9 +-
> >>   mm/rmap.c                    | 186 +++++++++++++++++++++++++++++++++++
> >>   10 files changed, 405 insertions(+), 9 deletions(-)
> > 
> > This is quite a lot of code added to core MM for a single driver.
> > 
> > Is there any expectation that other drivers will use this code?
> 
> Yes! This should work for GPUs (and potentially, other devices) that support
> OpenCL SVM atomic accesses on the device. I haven't looked into how amdgpu
> works in any detail, but that's certainly at the top of the list of likely
> additional callers.
> 
> > Is there a way of reducing the impact (code size, at least) for systems
> > which don't need this code?

All of the code added to mm/rmap.c is specific to this feature, and nothing
else in core MM depends on it, so it could be put behind something like
CONFIG_DEVICE_PRIVATE to reduce the code size impact (I realise now that it
currently isn't, but it should be).

The impact on compiled code size in mm/memory.c also ends up being minimised 
by the compiler because all of it is of the form:

if (is_device_exclusive_entry(...)) {
	[...]
}

This means it should get thrown away when the feature is not configured, given
is_device_exclusive_entry() is a static inline that always returns false in
that case.
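
As a rough userspace illustration of that pattern (not the actual kernel
code: DEMO_CONFIG_FEATURE, demo_entry_t, demo_is_exclusive_entry() and
demo_handle_entry() are all made-up names, and the bit test is arbitrary),
the stub arrangement looks something like this:

```c
#include <stdbool.h>

/* Hypothetical stand-in for a swap entry; not the kernel's swp_entry_t. */
typedef struct {
	unsigned long val;
} demo_entry_t;

#ifdef DEMO_CONFIG_FEATURE
/* Feature enabled: a real test (the bit used here is arbitrary). */
static inline bool demo_is_exclusive_entry(demo_entry_t entry)
{
	return (entry.val & 1UL) != 0;
}
#else
/* Feature disabled: a stub that is a compile-time constant false. */
static inline bool demo_is_exclusive_entry(demo_entry_t entry)
{
	(void)entry;
	return false;
}
#endif

/*
 * Returns 1 if the feature-specific path was taken, 0 otherwise.
 * With the stub above the condition is constant false, so an optimising
 * compiler is free to drop the body of the if entirely.
 */
int demo_handle_entry(demo_entry_t entry)
{
	if (demo_is_exclusive_entry(entry)) {
		/* Feature-specific handling would go here. */
		return 1;
	}
	return 0;
}
```

Building this with optimisation enabled, once with and once without
-DDEMO_CONFIG_FEATURE, and comparing the object sizes should show whether
the branch is really being dropped by a given compiler.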

> I'll leave this question to others for the moment, in order to answer
> the "do we need it at all" points.
> 
> > How beneficial is this code to nouveau users?  I see that it permits a
> > part of OpenCL to be implemented, but how useful/important is this in
> > the real world?
> 
> So this is interesting. Right now, OpenCL support in Nouveau is rather new
> and so probably not a huge impact yet. However, we've built up enough
> experience with CUDA and OpenCL to learn that atomic operations, as part of
> the user space programming model, are a super big deal. Atomic operations
> are so useful and important that I'd expect many OpenCL SVM users to be
> uninterested in programming models that lack atomic operations for GPU
> compute programs.
> 
> Again, this doesn't rule out future, non-GPU accelerator devices that may
> come along.
> 
> Atomic ops are just a really important piece of high-end multi-threaded
> programming, it turns out. So this is the beginning of support for an
> important building block for general purpose programming on devices that
> have GPU-like memory models.
> 
> 
> thanks,