On Tue, Dec 17, 2024 at 11:14:42PM +0100, David Hildenbrand wrote: > On 17.12.24 06:12, Alistair Popple wrote: > > Currently ZONE_DEVICE page reference counts are initialised by core > > memory management code in __init_zone_device_page() as part of the > > memremap() call which driver modules make to obtain ZONE_DEVICE > > pages. This initialises page refcounts to 1 before returning them to > > the driver. > > > > This was presumably done because it drivers had a reference of sorts > > on the page. It also ensured the page could always be mapped with > > vm_insert_page() for example and would never get freed (ie. have a > > zero refcount), freeing drivers of manipulating page reference counts. > > It probably dates back to copying that code from other zone-init code where > we > (a) Treat all available-at-boot memory as allocated before we release it to > the buddy > (b) Treat all hotplugged memory as allocated until we release it to the > buddy Argh, thanks for the background. > As a side note, I'm working on converting (b) -- PageOffline pages -- to > have a refcount of 0 ("frozen"). [...] > > diff --git a/mm/mm_init.c b/mm/mm_init.c > > index 24b68b4..f021e63 100644 > > --- a/mm/mm_init.c > > +++ b/mm/mm_init.c > > @@ -1017,12 +1017,26 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn, > > } > > /* > > - * ZONE_DEVICE pages are released directly to the driver page allocator > > - * which will set the page count to 1 when allocating the page. > > + * ZONE_DEVICE pages other than MEMORY_TYPE_GENERIC and > > + * MEMORY_TYPE_FS_DAX pages are released directly to the driver page > > + * allocator which will set the page count to 1 when allocating the > > + * page. > > + * > > + * MEMORY_TYPE_GENERIC and MEMORY_TYPE_FS_DAX pages automatically have > > + * their refcount reset to one whenever they are freed (ie. after > > + * their refcount drops to 0). > > */ > > - if (pgmap->type == MEMORY_DEVICE_PRIVATE || > > - pgmap->type == MEMORY_DEVICE_COHERENT) > > + switch (pgmap->type) { > > + case MEMORY_DEVICE_PRIVATE: > > + case MEMORY_DEVICE_COHERENT: > > + case MEMORY_DEVICE_PCI_P2PDMA: > > set_page_count(page, 0); > > + break; > > + > > + case MEMORY_DEVICE_FS_DAX: > > + case MEMORY_DEVICE_GENERIC: > > + break; > > + } > > } > > /* > > > But that's a bit weird: we call __init_single_page()->init_page_count() to > initialize it to 1, to then set it back to 0. > > > Maybe we can just pass to __init_single_page() the refcount we want to have > directly? Can be a patch on top of course. Once the dust settles on this series we won't need the pgmap->type check at all because all ZONE_DEVICE pages will get an initial count of 0. I have some follow up clean-ups for after this series is applied (particularly with regards to pgmap refcounts), so if it's ok I'd rather do this as a follow-up. > Apart from that > > Acked-by: David Hildenbrand <david@xxxxxxxxxx> > > -- > Cheers, > > David / dhildenb >