On Tue, Mar 8, 2022 at 11:28 AM Dmitry Osipenko
<dmitry.osipenko@xxxxxxxxxxxxx> wrote:
>
> On 3/8/22 19:29, Rob Clark wrote:
> > On Tue, Mar 8, 2022 at 5:17 AM Dmitry Osipenko
> > <dmitry.osipenko@xxxxxxxxxxxxx> wrote:
> >>
> >> Hello,
> >>
> >> This patchset introduces memory shrinker for the VirtIO-GPU DRM driver.
> >> During OOM, the shrinker will release BOs that are marked as "not needed"
> >> by userspace using the new madvise IOCTL. The userspace in this case is
> >> the Mesa VirGL driver, it will mark the cached BOs as "not needed",
> >> allowing kernel driver to release memory of the cached shmem BOs on lowmem
> >> situations, preventing OOM kills.
> >
> > Will host memory pressure already trigger shrinker in guest?
>
> The host memory pressure won't trigger shrinker in guest here. This
> series will help only with the memory pressure within the guest using a
> usual "virgl context".
>
> Having a host shrinker in a case of "virgl contexts" should be a
> difficult problem to solve.

Hmm, I think we just need the balloon driver to trigger the shrinker
in the guest kernel?  I suppose a driver like drm/virtio might want to
differentiate between host and guest pressure (ie. consider only
objects that have host vs guest storage), but even without that,
freeing up memory in the guest when the host is under memory pressure
seems worthwhile.  Maybe I'm over-simplifying?

> > This is
> > something I'm quite interested in for "virtgpu native contexts" (ie.
> > native guest driver with new context type sitting on top of virtgpu),
>
> In a case of "native contexts" it should be doable, at least I can't see
> any obvious problems. The madvise invocations could be passed to the
> host using a new virtio-gpu command by the guest's madvise IOCTL
> handler, instead-of/in-addition-to handling madvise in the guest's
> kernel, and that's it.

I think we don't want to do that, because MADV:WILLNEED would be by
far the most frequent guest<->host synchronous round trip.
So from that perspective, tracking madvise state in the guest kernel
seems quite attractive.

If we really can't track madvise state in the guest for dealing with
host memory pressure, I think the better option is to introduce
MADV:WILLNEED_REPLACE, ie. something to tell the host kernel that the
buffer is needed but the previous contents are not (as long as the GPU
VA remains the same).  With this the host could allocate new pages if
needed, and the guest would not need to wait for a reply from the host.

> > since that isn't using host storage
>
> s/host/guest ?

Yes, sorry, I meant that it is not using guest storage.

BR,
-R