Re: [PATCH v2] mm/page_alloc: Add lockdep assertion for pageblock type change

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 03.03.25 17:00, Brendan Jackman wrote:
On Mon, Mar 03, 2025 at 03:06:54PM +0100, David Hildenbrand wrote:
On 03.03.25 14:55, Brendan Jackman wrote:
On Mon, Mar 03, 2025 at 02:11:23PM +0100, David Hildenbrand wrote:
On 03.03.25 13:13, Brendan Jackman wrote:
Since the migratetype hygiene patches [0], the locking here is
a bit more formalised.

For other stuff, it's pretty obvious that it would be protected by the
zone lock. But it didn't seem totally self-evident that it should
protect the pageblock type. So it seems particularly helpful to have it
written in the code.

[...]

+
    u64 max_mem_size = U64_MAX;
    /* add this memory to iomem resource */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 579789600a3c7bfb7b0d847d51af702a9d4b139a..1ed21179676d05c66f77f9dbebf88e36bbe402e9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -417,6 +417,10 @@ void set_pfnblock_flags_mask(struct page *page, unsigned long flags,
    void set_pageblock_migratetype(struct page *page, int migratetype)
    {
+	lockdep_assert_once(system_state == SYSTEM_BOOTING ||
+		in_mem_hotplug() ||
+		lockdep_is_held(&page_zone(page)->lock));
+

I assume the call chain on the memory hotplug path is mostly

move_pfn_range_to_zone()->memmap_init_range()->set_pageblock_migratetype()

either when onlining a memory block, or from pagemap_range() while holding
the hotplug lock.

But there is also the memmap_init_zone_device()->memmap_init_compound()->__init_zone_device_page()->set_pageblock_migratetype()
one, called from pagemap_range() *without* holding the hotplug lock, and you
assertion would be missing that.

I'm not too happy about that assertion in general.

Hmm, thanks for pointing that out.

I guess if we really wanted the assertion the approach would be to
replace in_mem_hotplug() with some more fine-grained logic about the
state of the pageblock? But that seems like it would require rework
that isn't really justified.

I was wondering if we could just grab the zone lock while initializing, then
assert that we either hold that or are in boot.

Would that be because you want to avoid creating in_mem_hotplug()? Or
is it more about just simplifying the synchronization in general?

A little bit of both. The question is if lockless resizing of the zone range today is a bug or a feature :)

I'm not aware of any known side-effects of that, but if we could add locking without causing noticeable overheads, that would certainly be easiest ...


FWIW I don't think the in_mem_hotplug() is really that bad in the
assertion, it feels natural to me that memory hotplug would be an
exception to the locking rules in the same way that startup would be.
> >> In move_pfn_range_to_zone() it should likely not cause too much harm, and we
could just grab it around all zone modification stuff.

memmap_init_zone_device() might take longer and be more problematic.

But I am not sure why memmap_init_zone_device() would have to set the
migratetype at all? Because migratetype is a buddy concept, and
ZONE_DEVICE does not interact with the buddy to that degree.

The comment in __init_zone_device_page states:

"Mark the block movable so that blocks are reserved for movable at
startup. This will force kernel allocations to reserve their blocks
rather than leaking throughout the address space during boot when
many long-lived kernel allocations are made."

Uh, yeah I was pretty mystified by that. It would certainly be nice if
we can just get rid of this modification path.

But that just dates back to 966cf44f637e where we copy-pasted that code.

So I wonder if we could just

diff --git a/mm/mm_init.c b/mm/mm_init.c
index 57933683ed0d1..b95f545846e6e 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1002,19 +1002,11 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
         page->zone_device_data = NULL;
         /*
-        * Mark the block movable so that blocks are reserved for
-        * movable at startup. This will force kernel allocations
-        * to reserve their blocks rather than leaking throughout
-        * the address space during boot when many long-lived
-        * kernel allocations are made.
-        *
-        * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
-        * because this is done early in section_activate()
+        * Note that we leave pageblock migratetypes uninitialized, because
+        * they don't apply to ZONE_DEVICE.
          */
-       if (pageblock_aligned(pfn)) {
-               set_pageblock_migratetype(page, MIGRATE_MOVABLE);
+       if (pageblock_aligned(pfn))
                 cond_resched();
-       }
         /*
          * ZONE_DEVICE pages other than MEMORY_TYPE_GENERIC are released

memory-model.rst says:

Since the
page reference count never drops below 1 the page is never tracked as
free memory and the page's `struct list_head lru` space is repurposed
for back referencing to the host device / driver that mapped the memory.

That comment will be stale soon. In general, ZONE_DEVICE refcounts can drop to 0, but they will never go to the buddy, but will get intercepted on the freeing path.


And this code seems to assume that the whole pageblock is part of the
ZONE_DEVICE dance, it would certainly make sense to me...

Sorry, I didn't get your final conclusion: do you thing we don't have to initialize the migratetype, or do you think there is reason to still do it?

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux