Re: [PATCH 0/6] rust: page: Support borrowing `struct page` and physaddr conversion

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05.02.25 08:40, Simona Vetter wrote:
On Tue, Feb 04, 2025 at 03:38:17PM +0100, David Hildenbrand wrote:
It can still race with memory offlining, and it refuses ZONE_DEVICE
pages. For the latter, we have a different way to check validity. See
memory_failure() that first calls pfn_to_online_page() to then check
get_dev_pagemap().

I'll give it a shot with these functions. If they work for my use case,
then it's good to have extra checks and I'll add them for v2. Thanks!

Let me know if you run into any issues.




If the answer is "no" then that's fine. It's still an unsafe function
and we need to document in the safety section that it should only be
used for memory that is either known to be allocated and pinned and will
not be freed while the `struct page` is borrowed, or memory that is
reserved and not owned by the buddy allocator, so in practice correct
use would not be racy with memory hot-remove anyway.

This is already the case for the drm/asahi use case, where the pfns
looked up will only ever be one of:

- GEM objects that are mapped to the GPU and whose physical pages are
therefore pinned (and the VM is locked while this happens so the objects
cannot become unpinned out from under the running code),

How exactly are these pages pinned/obtained?

Under the hood it's shmem. For pinning, it winds up at
`drm_gem_get_pages()`, which I think does a `shmem_read_folio_gfp()` on
a mapping set as unevictable.

Thanks. So we grab another folio reference via
shmem_read_folio_gfp()->shmem_get_folio_gfp().

Hm, I wonder if we might end up holding folios residing in
ZONE_MOVABLE/MIGRATE_CMA longer than we should.

Compared to memfd_pin_folios(), which simulates FOLL_LONGTERM and makes sure
to migrate pages out of ZONE_MOVABLE/MIGRATE_CMA.

But that's a different discussion, just pointing it out, maybe I'm missing
something :)

Good GPU Drivers (tm) are supposed to have a shrinker so we can at least
nuke some of them again. Some folks even looked into hooking up a migrate
callback through the address_space (or wherever that hook was, this is
from memory) so we can make this somewhat reliable. So yeah we're hogging
ZONE_MOVEABLE unduly still.

Hmmm, so we should really be migrating pages out of ZONE_MOVABLE/MIGRATE_CMA here, just like we now properly do in memfd_pin_folios().


The other side is that there's about 2-3 good drivers (msm, i915, xe
should have a shrinker now too but I didn't check). The others all fall
various levels of short, or still have 3 times cargo-culted versions of
i915's pin-as-a-lock design and get it completely wrong.

:(


So yeah I'm aware this isn't great, and we're at least glacially slowly
moving towards a common shrinker infrastructure that maybe in a glorious
future gets all this right. I mean it took us 15+ years to get to a
cgroups controller after all too, and that was also a well known issue of
just being able to hog memory with no controls and potentially cause
havoc.

I guess that means job security for us, haha :)

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux