On Wed, Jan 22, 2025 at 03:06:05PM +0100, Alexander Gordeev wrote: Hi Kevin, > On Wed, Jan 22, 2025 at 08:49:54AM +0100, Heiko Carstens wrote: > > > > static inline pgd_t *pgd_alloc(struct mm_struct *mm) > > > > { > > > > - return (pgd_t *) crst_table_alloc(mm); > > > > + unsigned long *table = crst_table_alloc(mm); > > > > + > > > > + if (!table) > > > > + return NULL; > > > > > > I do not know status of this series, but FWIW, this call is missed: > > > > > > crst_table_init(table, _REGION1_ENTRY_EMPTY); > > > > Why is that missing? > > Because the follow-up pagetable_pgd_ctor() is called against uninitialized > page table, while other pagetable_pXd_ctor() variants are called against > initialized one. I could imagine complications as result of that. > > Whether Region1 table is the right choice is a big question though, as you > noticed below. As discussed with Heiko, we do not want to add the extra crst_table_init() call at least due to performance impact. So please ignore my comment. > > A pgd table can be a Region1, Region2, or Region3 table. The only caller of > > this function is mm_init() via mm_alloc_pgd(); and right after mm_alloc_pgd() > > there is a call to init_new_context() which will initialize the pgd correctly. > > init_new_context() is in a way a constructor as well, so whole thing looks odd > to me. But I do not immedeately see a better way :( > > > I guess what really gets odd, and might be broken (haven't checked yet) is > > what happens on dynamic upgrade of page table levels (->crst_table_upgrade()). > > Hmm, that is a good point. > > > With that a pgd may become a pud, and with that we get an imbalance with > > the ctor/dtor calls for the various page table levels when they get freed > > The ctor/dtor mismatch should not be a problem, as pagetable_pgd|p4d|pud_ctor() > are the same and there is one pagetable_dtor() for all top levels as of now. > But if it ever comes to separate implementations, then we are in the world > of pain. > > > again. Plus, at first glance, it looks also broken that we have open-coded > > crst_alloc() calls instead of using the "proper" page table allocation API > > within crst_table_upgrade(), which again would cause an imbalance. > > This is a good point too. The below bits are seems to be missed. We will test it and send a patch. diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c index a4e761902093..d33f55b7ee98 100644 --- a/arch/s390/mm/pgalloc.c +++ b/arch/s390/mm/pgalloc.c @@ -88,12 +88,14 @@ int crst_table_upgrade(struct mm_struct *mm, unsigned long end) if (unlikely(!p4d)) goto err_p4d; crst_table_init(p4d, _REGION2_ENTRY_EMPTY); + pagetable_p4d_ctor(virt_to_ptdesc(p4d)); } if (end > _REGION1_SIZE) { pgd = crst_table_alloc(mm); if (unlikely(!pgd)) goto err_pgd; crst_table_init(pgd, _REGION1_ENTRY_EMPTY); + pagetable_pgd_ctor(virt_to_ptdesc(pgd)); } spin_lock_bh(&mm->page_table_lock); @@ -130,6 +132,7 @@ int crst_table_upgrade(struct mm_struct *mm, unsigned long end) return 0; err_pgd: + pagetable_dtor(virt_to_ptdesc(p4d)); crst_table_free(mm, p4d); err_p4d: return -ENOMEM; Thanks!