On Wed, 6 Apr 2022, Patrice CHOTARD wrote: > On 4/6/22 08:22, Hugh Dickins wrote: > > Asking Arnd and others below: should noMMU arches have a good ZERO_PAGE? > > > > On Tue, 5 Apr 2022, Hugh Dickins wrote: > >> On Tue, 5 Apr 2022, Patrice CHOTARD wrote: > >>> > >>> We found an issue with last kernel tag v5.18-rc1 on stm32f746-disco and > >>> stm32h743-disco boards (ARMV7-M SoCs). > >>> > >>> Kernel hangs when executing SetPageUptodate(ZERO_PAGE(0)); in mm/filemap.c. > >>> > >>> By reverting commit 56a8c8eb1eaf ("tmpfs: do not allocate pages on read"), > >>> kernel boots without any issue. > >> > >> Sorry about that, thanks a lot for finding. > >> > >> I see that arch/arm/configs/stm32_defconfig says CONFIG_MMU is not set: > >> please confirm that is the case here. > >> > >> Yes, it looks as if NOMMU platforms are liable to have a bogus (that's my > >> reading, but it may be unfair) definition for ZERO_PAGE(vaddr), and I was > >> walking on ice to touch it without regard for !CONFIG_MMU. > >> > >> CONFIG_SHMEM depends on CONFIG_MMU, so that PageUptodate is only needed > >> when CONFIG_MMU. > >> > >> Easily fixed by an #ifdef CONFIG_MMU there in mm/filemap.c, but I'll hunt > >> around (again) for a better place to do it - though I won't want to touch > >> all the architectures for it. I'll post later today. > > > > I could put #ifdef CONFIG_MMU around the SetPageUptodate(ZERO_PAGE(0)) > > added to pagecache_init(); or if that's considered distasteful, I could > > skip making it potentially useful to other filesystems, revert the change > > to pagecache_init(), and just do it in mm/shmem.c's CONFIG_SHMEM (hence > > CONFIG_MMU) instance of shmem_init(). > > > > But I wonder if it's safe for noMMU architectures to go on without a > > working ZERO_PAGE(0). It has uses scattered throughout the tree, in > > drivers, fs, crypto and more, and it's not at all obvious (to me) that > > they all depend on CONFIG_MMU. Some might cause (unreported) crashes, > > some might use an unzeroed page in place of a pageful of zeroes. > > > > arm noMMU and h8300 noMMU and m68k noMMU each has > > #define ZERO_PAGE(vaddr) (virt_to_page(0)) > > which seems riskily wrong to me. > > > > h8300 and m68k actually go to the trouble of allocating an empty_zero_page > > for this, but then forget to link it up to the ZERO_PAGE(vaddr) definition, > > which is what all the common code uses. > > > > arm noMMU does not presently allocate such a page; and I do not feel > > entitled to steal a page from arm noMMU platforms, for a hypothetical > > case, without agreement. > > > > But here's an unbuilt and untested patch for consideration - which of > > course should be split in three if agreed (and perhaps the h8300 part > > quietly forgotten if h8300 is already on its way out). > > > > (Yes, arm uses empty_zero_page in a different way from all the other > > architectures; but that's okay, and I think arm's way, with virt_to_page() > > already baked in, is better than the others; but I've no wish to get into > > changing them.) > > > > Patrice, does this patch build and run for you? I have no appreciation > > of arm early startup issues, and may have got it horribly wrong. > > This patch is okay on my side on both boards (STM32F7 and STM32H7), boot are OK. > > Thanks for your reactivity ;-) > Patrice Just to wrap up this thread: the tentative arch/ patches below did not go into 5.18-rc2, but 5.18-rc3 will contain 1bdec44b1eee ("tmpfs: fix regressions from wider use of ZERO_PAGE") which fixes a further issue, and deletes the line which gave you trouble. With arch/h8300 removed from linux-next, and arch/arm losing a page by the patch below, I don't think it's worth my arguing for those changes. I'd still prefer arch/m68k to expose its empty_zero_page in ZERO_PAGE(), or else not allocate it; but I won't be pursuing this further. Thanks for reporting! Hugh > > > > arch/arm/include/asm/pgtable-nommu.h | 3 ++- > > arch/arm/mm/nommu.c | 16 ++++++++++++++++ > > arch/h8300/include/asm/pgtable.h | 6 +++++- > > arch/h8300/mm/init.c | 5 +++-- > > arch/m68k/include/asm/pgtable_no.h | 5 ++++- > > 5 files changed, 30 insertions(+), 5 deletions(-) > > > > --- a/arch/arm/include/asm/pgtable-nommu.h > > +++ b/arch/arm/include/asm/pgtable-nommu.h > > @@ -48,7 +48,8 @@ typedef pte_t *pte_addr_t; > > * ZERO_PAGE is a global shared page that is always zero: used > > * for zero-mapped memory areas etc.. > > */ > > -#define ZERO_PAGE(vaddr) (virt_to_page(0)) > > +extern struct page *empty_zero_page; > > +#define ZERO_PAGE(vaddr) (empty_zero_page) > > > > /* > > * Mark the prot value as uncacheable and unbufferable. > > --- a/arch/arm/mm/nommu.c > > +++ b/arch/arm/mm/nommu.c > > @@ -24,6 +24,13 @@ > > > > #include "mm.h" > > > > +/* > > + * empty_zero_page is a special page that is used for > > + * zero-initialized data and COW. > > + */ > > +struct page *empty_zero_page; > > +EXPORT_SYMBOL(empty_zero_page); > > + > > unsigned long vectors_base; > > > > #ifdef CONFIG_ARM_MPU > > @@ -148,9 +155,18 @@ void __init adjust_lowmem_bounds(void) > > */ > > void __init paging_init(const struct machine_desc *mdesc) > > { > > + void *zero_page; > > + > > early_trap_init((void *)vectors_base); > > mpu_setup(); > > bootmem_init(); > > + > > + zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE); > > + if (!zero_page) > > + panic("%s: Failed to allocate %lu bytes align=0x%lx\n", > > + __func__, PAGE_SIZE, PAGE_SIZE); > > + empty_zero_page = virt_to_page(zero_page); > > + flush_dcache_page(empty_zero_page); > > } > > > > /* > > --- a/arch/h8300/include/asm/pgtable.h > > +++ b/arch/h8300/include/asm/pgtable.h > > @@ -19,11 +19,15 @@ extern void paging_init(void); > > > > static inline int pte_file(pte_t pte) { return 0; } > > #define swapper_pg_dir ((pgd_t *) 0) > > + > > +/* zero page used for uninitialized stuff */ > > +extern void *empty_zero_page; > > + > > /* > > * ZERO_PAGE is a global shared page that is always zero: used > > * for zero-mapped memory areas etc.. > > */ > > -#define ZERO_PAGE(vaddr) (virt_to_page(0)) > > +#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page)) > > > > /* > > * These would be in other places but having them here reduces the diffs. > > --- a/arch/h8300/mm/init.c > > +++ b/arch/h8300/mm/init.c > > @@ -41,7 +41,8 @@ > > * ZERO_PAGE is a special page that is used for zero-initialized > > * data and COW. > > */ > > -unsigned long empty_zero_page; > > +void *empty_zero_page; > > +EXPORT_SYMBOL(empty_zero_page); > > > > /* > > * paging_init() continues the virtual memory environment setup which > > @@ -65,7 +66,7 @@ void __init paging_init(void) > > * Initialize the bad page table and bad page to point > > * to a couple of allocated pages. > > */ > > - empty_zero_page = (unsigned long)memblock_alloc(PAGE_SIZE, PAGE_SIZE); > > + empty_zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE); > > if (!empty_zero_page) > > panic("%s: Failed to allocate %lu bytes align=0x%lx\n", > > __func__, PAGE_SIZE, PAGE_SIZE); > > --- a/arch/m68k/include/asm/pgtable_no.h > > +++ b/arch/m68k/include/asm/pgtable_no.h > > @@ -38,11 +38,14 @@ extern void paging_init(void); > > #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) }) > > #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) > > > > +/* zero page used for uninitialized stuff */ > > +extern void *empty_zero_page; > > + > > /* > > * ZERO_PAGE is a global shared page that is always zero: used > > * for zero-mapped memory areas etc.. > > */ > > -#define ZERO_PAGE(vaddr) (virt_to_page(0)) > > +#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page)) > > > > /* > > * All 32bit addresses are effectively valid for vmalloc... >