On Wed, 6 Apr 2022, Patrice CHOTARD wrote:
On 4/6/22 08:22, Hugh Dickins wrote:
Asking Arnd and others below: should noMMU arches have a good ZERO_PAGE?
On Tue, 5 Apr 2022, Hugh Dickins wrote:
On Tue, 5 Apr 2022, Patrice CHOTARD wrote:
We found an issue with last kernel tag v5.18-rc1 on stm32f746-disco and
stm32h743-disco boards (ARMV7-M SoCs).
Kernel hangs when executing SetPageUptodate(ZERO_PAGE(0)); in mm/filemap.c.
By reverting commit 56a8c8eb1eaf ("tmpfs: do not allocate pages on read"),
kernel boots without any issue.
Sorry about that, thanks a lot for finding.
I see that arch/arm/configs/stm32_defconfig says CONFIG_MMU is not set:
please confirm that is the case here.
Yes, it looks as if NOMMU platforms are liable to have a bogus (that's my
reading, but it may be unfair) definition for ZERO_PAGE(vaddr), and I was
walking on ice to touch it without regard for !CONFIG_MMU.
CONFIG_SHMEM depends on CONFIG_MMU, so that PageUptodate is only needed
when CONFIG_MMU.
Easily fixed by an #ifdef CONFIG_MMU there in mm/filemap.c, but I'll hunt
around (again) for a better place to do it - though I won't want to touch
all the architectures for it. I'll post later today.
I could put #ifdef CONFIG_MMU around the SetPageUptodate(ZERO_PAGE(0))
added to pagecache_init(); or if that's considered distasteful, I could
skip making it potentially useful to other filesystems, revert the change
to pagecache_init(), and just do it in mm/shmem.c's CONFIG_SHMEM (hence
CONFIG_MMU) instance of shmem_init().
But I wonder if it's safe for noMMU architectures to go on without a
working ZERO_PAGE(0). It has uses scattered throughout the tree, in
drivers, fs, crypto and more, and it's not at all obvious (to me) that
they all depend on CONFIG_MMU. Some might cause (unreported) crashes,
some might use an unzeroed page in place of a pageful of zeroes.
arm noMMU and h8300 noMMU and m68k noMMU each has
#define ZERO_PAGE(vaddr) (virt_to_page(0))
which seems riskily wrong to me.
h8300 and m68k actually go to the trouble of allocating an empty_zero_page
for this, but then forget to link it up to the ZERO_PAGE(vaddr) definition,
which is what all the common code uses.
arm noMMU does not presently allocate such a page; and I do not feel
entitled to steal a page from arm noMMU platforms, for a hypothetical
case, without agreement.
But here's an unbuilt and untested patch for consideration - which of
course should be split in three if agreed (and perhaps the h8300 part
quietly forgotten if h8300 is already on its way out).
(Yes, arm uses empty_zero_page in a different way from all the other
architectures; but that's okay, and I think arm's way, with virt_to_page()
already baked in, is better than the others; but I've no wish to get into
changing them.)
Patrice, does this patch build and run for you? I have no appreciation
of arm early startup issues, and may have got it horribly wrong.
This patch is okay on my side on both boards (STM32F7 and STM32H7), boot are OK.
Thanks for your reactivity ;-)
Patrice
Just to wrap up this thread: the tentative arch/ patches below did not
go into 5.18-rc2, but 5.18-rc3 will contain
1bdec44b1eee ("tmpfs: fix regressions from wider use of ZERO_PAGE")
which fixes a further issue, and deletes the line which gave you trouble.
With arch/h8300 removed from linux-next, and arch/arm losing a page by
the patch below, I don't think it's worth my arguing for those changes.
I'd still prefer arch/m68k to expose its empty_zero_page in ZERO_PAGE(),
or else not allocate it; but I won't be pursuing this further.
Thanks for reporting!
Hugh
arch/arm/include/asm/pgtable-nommu.h | 3 ++-
arch/arm/mm/nommu.c | 16 ++++++++++++++++
arch/h8300/include/asm/pgtable.h | 6 +++++-
arch/h8300/mm/init.c | 5 +++--
arch/m68k/include/asm/pgtable_no.h | 5 ++++-
5 files changed, 30 insertions(+), 5 deletions(-)
--- a/arch/arm/include/asm/pgtable-nommu.h
+++ b/arch/arm/include/asm/pgtable-nommu.h
@@ -48,7 +48,8 @@ typedef pte_t *pte_addr_t;
* ZERO_PAGE is a global shared page that is always zero: used
* for zero-mapped memory areas etc..
*/
-#define ZERO_PAGE(vaddr) (virt_to_page(0))
+extern struct page *empty_zero_page;
+#define ZERO_PAGE(vaddr) (empty_zero_page)
/*
* Mark the prot value as uncacheable and unbufferable.
--- a/arch/arm/mm/nommu.c
+++ b/arch/arm/mm/nommu.c
@@ -24,6 +24,13 @@
#include "mm.h"
+/*
+ * empty_zero_page is a special page that is used for
+ * zero-initialized data and COW.
+ */
+struct page *empty_zero_page;
+EXPORT_SYMBOL(empty_zero_page);
+
unsigned long vectors_base;
#ifdef CONFIG_ARM_MPU
@@ -148,9 +155,18 @@ void __init adjust_lowmem_bounds(void)
*/
void __init paging_init(const struct machine_desc *mdesc)
{
+ void *zero_page;
+
early_trap_init((void *)vectors_base);
mpu_setup();
bootmem_init();
+
+ zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+ if (!zero_page)
+ panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
+ __func__, PAGE_SIZE, PAGE_SIZE);
+ empty_zero_page = virt_to_page(zero_page);
+ flush_dcache_page(empty_zero_page);
}
/*
--- a/arch/h8300/include/asm/pgtable.h
+++ b/arch/h8300/include/asm/pgtable.h
@@ -19,11 +19,15 @@ extern void paging_init(void);
static inline int pte_file(pte_t pte) { return 0; }
#define swapper_pg_dir ((pgd_t *) 0)
+
+/* zero page used for uninitialized stuff */
+extern void *empty_zero_page;
+
/*
* ZERO_PAGE is a global shared page that is always zero: used
* for zero-mapped memory areas etc..
*/
-#define ZERO_PAGE(vaddr) (virt_to_page(0))
+#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
/*
* These would be in other places but having them here reduces the diffs.
--- a/arch/h8300/mm/init.c
+++ b/arch/h8300/mm/init.c
@@ -41,7 +41,8 @@
* ZERO_PAGE is a special page that is used for zero-initialized
* data and COW.
*/
-unsigned long empty_zero_page;
+void *empty_zero_page;
+EXPORT_SYMBOL(empty_zero_page);
/*
* paging_init() continues the virtual memory environment setup which
@@ -65,7 +66,7 @@ void __init paging_init(void)
* Initialize the bad page table and bad page to point
* to a couple of allocated pages.
*/
- empty_zero_page = (unsigned long)memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+ empty_zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
if (!empty_zero_page)
panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
__func__, PAGE_SIZE, PAGE_SIZE);
--- a/arch/m68k/include/asm/pgtable_no.h
+++ b/arch/m68k/include/asm/pgtable_no.h
@@ -38,11 +38,14 @@ extern void paging_init(void);
#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) })
#define __swp_entry_to_pte(x) ((pte_t) { (x).val })
+/* zero page used for uninitialized stuff */
+extern void *empty_zero_page;
+
/*
* ZERO_PAGE is a global shared page that is always zero: used
* for zero-mapped memory areas etc..
*/
-#define ZERO_PAGE(vaddr) (virt_to_page(0))
+#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
/*
* All 32bit addresses are effectively valid for vmalloc...