On 4/18/2022 12:24 AM, Mike Rapoport wrote:
On Fri, Apr 15, 2022 at 02:30:52AM +0530, Sudarshan Rajagopalan wrote:
On 4/14/2022 2:18 AM, Mike Rapoport wrote:
On Tue, Apr 12, 2022 at 01:16:23PM -0700, Sudarshan Rajagopalan wrote:
Check if pfn is valid or not before moving it to freelist.
There are possible scenarios where a pageblock is partly a physical hole
and partly System RAM. This happens when a base address in the RAM
partition table is not aligned to the pageblock size.
Example:
Say we have these first two entries in the RAM partition table -
Base Addr: 0x0000000080000000 Length: 0x0000000058000000
Base Addr: 0x00000000E3930000 Length: 0x0000000020000000
I wonder what was done to memory DIMMs to get such an interesting
physical memory layout...
We have a feature where we carve out some portion of memory in RAM partition
table, hence we see such base addresses here.
Cannot the firmware align that portion at some sensible boundary?
Or at least report the carved out range as "reserved" (and maybe NOMAP)
rather than as hole?
We can have the firmware or ABL align the address to the next pageblock size
boundary. This would simply mean losing a few MBs of memory to
alignment. Same with marking them as "reserved" with "nomap".
Physical hole: 0xD8000000 - 0xE3930000
With a pageblock which has a partial physical hole at the beginning, we will
run into PFNs from the physical hole whose struct page is not initialized and
is invalid, and the system would crash as we operate on an invalid struct page
to find out whether the page is in Buddy or LRU or not
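The arithmetic behind the example can be checked with a small user-space sketch. It is not kernel code: the helper names are made up for illustration, and the 4 KiB page size is an assumption, as is whatever pageblock size the caller passes in.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Round addr down to a multiple of align (which must be a power of two). */
static uint64_t align_down(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/* Bytes of physical hole between the end of one region and the next base. */
static uint64_t hole_bytes(uint64_t lo_base, uint64_t lo_len, uint64_t hi_base)
{
	assert(lo_base + lo_len <= hi_base);
	return hi_base - (lo_base + lo_len);
}

/*
 * Pages between the start of the pageblock that contains base and base
 * itself. These are hole pages whenever the hole reaches back at least
 * to the pageblock start, as it does in the example.
 */
static uint64_t leading_hole_pages(uint64_t base, uint64_t pageblock_bytes)
{
	return (base - align_down(base, pageblock_bytes)) >> PAGE_SHIFT;
}
```

With the two entries from the example, hole_bytes(0x80000000, 0x58000000, 0xE3930000) gives 0x0B930000, matching the hole 0xD8000000 - 0xE3930000. The pageblock containing 0xE3930000 starts at 0xE3800000 for both 2 MiB and 4 MiB pageblocks, so leading_hole_pages() reports 304 hole pages at its beginning in either case.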
struct page must be initialized and valid even for holes in the physical
memory. When a pageblock spans both existing memory and a hole, the struct
pages for the "hole" part should be marked as PG_Reserved.
If you see that struct pages for memory holes exist but are invalid, we should
solve the underlying issue that causes the wrong struct page contents.
We are using 5.15 kernel, arm64 platform. For the pages belonging to the
physical hole, I don't see that pages are being initialized.
Looking into the memmap_init code, we call init_unavailable_range() to
initialize the pages that belong to holes in the zone. But again we only do
this for PFNs that are valid according to the code snippet below -
init_unavailable_range() {
	for (pfn = spfn; pfn < epfn; pfn++) {
		if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages))) {
			pfn = ALIGN_DOWN(pfn, pageblock_nr_pages)
				+ pageblock_nr_pages - 1;
			continue;
		}
		...
	}
}
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/mm/page_alloc.c?h=v5.15.34#n6668
With the arm64-specific definition of pfn_valid(), pfn_valid() returns FALSE
for a PFN which isn't present in the RAM partition table (i.e. one that
belongs to a physical hole). Hence we don't initialize any pages that belong
to a physical hole here.
Or am I missing anything in kernel that initializes pages belonging to
physical holes too? If so could you point me to that?
I agree with your analysis for 5.15, you just didn't mention that the
problem happens with 5.15.
I see that in later kernel versions, the arm64-specific definition of
pfn_valid is removed by Anshuman's patch. With that, pfn_valid would return
TRUE for PFNs in a hole and we would then initialize pages in holes as well.
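To make the difference between the two pfn_valid() behaviours concrete, here is a user-space model of the quoted loop. Everything in it is a simplification for illustration: the model_* names are hypothetical, the 4 KiB page and 4 MiB pageblock sizes are assumptions, and the two pfn_valid stand-ins only approximate the arm64-specific and generic kernel implementations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT         12      /* assumption: 4 KiB pages */
#define PAGEBLOCK_NR_PAGES 1024ULL /* assumption: 4 MiB pageblocks */
#define ALIGN_DOWN(x, a)   ((x) & ~((uint64_t)(a) - 1))

/* RAM regions from the example, as PFN ranges [start, end). */
static const struct { uint64_t start, end; } ram[] = {
	{ 0x80000000ULL >> PAGE_SHIFT, 0xD8000000ULL >> PAGE_SHIFT },
	{ 0xE3930000ULL >> PAGE_SHIFT, 0x103930000ULL >> PAGE_SHIFT },
};

/* Stand-in for the arm64-specific check: valid only if backed by RAM. */
static bool model_pfn_valid_arm64(uint64_t pfn)
{
	for (size_t i = 0; i < sizeof(ram) / sizeof(ram[0]); i++)
		if (pfn >= ram[i].start && pfn < ram[i].end)
			return true;
	return false;
}

/*
 * Stand-in for the generic check, assuming the PFN lies in a present
 * memory section: holes inside that section are reported as valid too.
 */
static bool model_pfn_valid_generic(uint64_t pfn)
{
	(void)pfn;
	return true;
}

/* Mirrors the quoted loop; returns how many struct pages get initialized. */
static uint64_t model_init_unavailable_range(uint64_t spfn, uint64_t epfn,
					     bool (*valid)(uint64_t))
{
	uint64_t pfn, pgcnt = 0;

	assert(spfn <= epfn);
	for (pfn = spfn; pfn < epfn; pfn++) {
		if (!valid(ALIGN_DOWN(pfn, PAGEBLOCK_NR_PAGES))) {
			/* Skip the rest of this pageblock. */
			pfn = ALIGN_DOWN(pfn, PAGEBLOCK_NR_PAGES)
				+ PAGEBLOCK_NR_PAGES - 1;
			continue;
		}
		pgcnt++; /* the kernel initializes and reserves the page here */
	}
	return pgcnt;
}
```

Running the model over the hole part of the straddling pageblock, PFNs 0xE3800 to 0xE3930, gives 0 initialized pages with model_pfn_valid_arm64 and 304 with model_pfn_valid_generic, which is the behaviour difference described in this thread.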
That said, your patch will not fix anything in the current kernel because
the issue should not happen there, right?
Yes, the issue seems to be fixed in the latest kernel version with the
patches to drop arm64 pfn_valid. But the core issue is present on
previous kernel versions with the scenario explained. Is there any procedure
to have this fixed on the 5.15 kernel?
But this patch was reverted by Will Deacon on 5.15 kernel.
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/arch/arm64/mm?h=v5.17.3&id=3de360c3fdb34fbdbaf6da3af94367d3fded95d3
The reason for the revert was fixed by the commit a9c38c5d267c
("dma-mapping: remove bogus test for pfn_valid from dma_map_resource").
...
Hence, avoid operating on invalid pages within the same pageblock by checking
if pfn is valid or not.
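The idea of the check can be sketched in user space as follows. This is a model written for illustration, not the patch itself: struct model_page and model_move_freepages() are hypothetical, and the valid flag stands in for what pfn_valid() would report for each PFN in the pageblock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page with only the fields needed here. */
struct model_page {
	bool valid;         /* what pfn_valid() would report for this PFN */
	bool buddy;         /* page heads a free block in the buddy allocator */
	unsigned int order; /* order of that free block, if buddy is set */
};

/*
 * Walk one pageblock and count the pages that would be moved to another
 * freelist. PFNs reported as invalid are skipped before any other field
 * of their struct page is looked at.
 */
static size_t model_move_freepages(const struct model_page *pb, size_t nr_pages)
{
	size_t i = 0, moved = 0;

	assert(pb != NULL);
	while (i < nr_pages) {
		if (!pb[i].valid) { /* the check discussed in this thread */
			i++;
			continue;
		}
		if (!pb[i].buddy) { /* allocated or on the LRU: leave alone */
			i++;
			continue;
		}
		moved += (size_t)1 << pb[i].order;
		i += (size_t)1 << pb[i].order;
	}
	return moved;
}
```

In this model, garbage left in the buddy and order fields of an invalid entry has no effect on the result, because the walk never reads them.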
Signed-off-by: Sudarshan Rajagopalan <quic_sudaraja@xxxxxxxxxxx>
Fixes: 4c7b9896621be ("mm: remove pfn_valid_within() and CONFIG_HOLES_IN_ZONE")
Cc: Mike Rapoport <rppt@xxxxxxxxxxxxx>
For now the patch looks like a band-aid for a more fundamental bug, so
NAKED-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>
This patch may look like a workaround, but yes, I think there's a
fundamental problem where the kernel treats a pageblock that is partly hole
and partly System RAM as a valid pageblock, which occurs when a base address
in the RAM partition table is not aligned to the pageblock size.
This fundamental problem needs to be fixed, and I am looking for your
suggestions.
I'd suggest backporting a9c38c5d267c ("dma-mapping: remove bogus test for
pfn_valid from dma_map_resource") and 3de360c3fdb3 ("arm64/mm: drop
HAVE_ARCH_PFN_VALID") to 5.15.
The issue is not seen with these patches backported. Not sure of the
procedure to send patches for 5.15 kernel, but can we have them
backported to 5.15?