On Tue, 23 May 2023 18:49:14 -0700 (PDT) Hugh Dickins <hughd@xxxxxxxxxx> wrote: > On Tue, 23 May 2023, Claudio Imbrenda wrote: > > > > so if I understand the above correctly, pte_offset_map_lock will only > > fail if the whole page table has disappeared, and in that case, it will > > never reappear with zero pages, therefore we can safely skip (in that > > case just break). if we were to do a continue instead of a break, we > > would most likely fail again anyway. > > Yes, that's the most likely; and you hold mmap_write_lock() there, > and VM_NOHUGEPAGE on all vmas, so I think it's the only foreseeable > possibility. > > > > > in that case I would still like a small change in your patch: please > > write a short (2~3 lines max) comment about why it's ok to do things > > that way > > Sure. > > But I now see that I've disobeyed you, and gone to 4 lines (but in the > comment above the function, so as not to distract from the code itself): > is this good wording to you? I needed to research how they were stopped > from coming in afterwards, so wanted to put something greppable in there. > > And, unless I'm misunderstanding, that "after THP was enabled" was > always supposed to say "after THP was disabled" (because splitting a > huge zero page pmd inserts a a page table full of little zero ptes). indeed, thanks for noticing and fixing it > > Or would you prefer the comment in the commit message instead, > or down just above the pte_offset_map_lock() line? > > It would much better if I could find one place at the mm end, to > enforce its end of the contract; but cannot think how to do that. > > Hugh > > --- a/arch/s390/mm/gmap.c > +++ b/arch/s390/mm/gmap.c > @@ -2537,7 +2537,12 @@ static inline void thp_split_mm(struct mm_struct *mm) > * Remove all empty zero pages from the mapping for lazy refaulting > * - This must be called after mm->context.has_pgste is set, to avoid > * future creation of zero pages > - * - This must be called after THP was enabled > + * - This must be called after THP was disabled. > + * > + * mm contracts with s390, that even if mm were to remove a page table, > + * racing with the loop below and so causing pte_offset_map_lock() to fail, > + * it will never insert a page table containing empty zero pages once > + * mm_forbids_zeropage(mm) i.e. mm->context.has_pgste is set. > */ > static int __zap_zero_pages(pmd_t *pmd, unsigned long start, > unsigned long end, struct mm_walk *walk) looks good, thanks