On Mon, Apr 15, 2024 at 09:14:03PM +0200, David Hildenbrand wrote:
> > > +retry:
> > > +		rc = walk_page_range_vma(vma, addr, vma->vm_end,
> > > +					 &find_zeropage_ops, &addr);
> > > +		if (rc <= 0)
> > > +			continue;
> >
> > So in case an error is returned for the last vma, __s390_unshare_zeropage()
> > finishes with that error. By contrast, the error for a non-last vma would
> > be ignored?
>
> Right, it looks a bit off. walk_page_range_vma() shouldn't fail
> unless find_zeropage_pte_entry() would fail -- which would also be
> very unexpected.
>
> To handle it cleanly in case we would ever get a weird zeropage where we
> don't expect it, we should probably just exit early.
>
> Something like the following (not compiled, addressing the comment below):
>
> @@ -2618,7 +2618,8 @@ static int __s390_unshare_zeropages(struct mm_struct *mm)
>         struct vm_area_struct *vma;
>         VMA_ITERATOR(vmi, mm, 0);
>         unsigned long addr;
> -       int rc;
> +       vm_fault_t rc;
> +       int zero_page;

I would use "fault" for mm faults (just like everywhere else
handle_mm_fault() is called) and leave rc as is:

	vm_fault_t fault;
	int rc;

>         for_each_vma(vmi, vma) {
>                 /*
> @@ -2631,9 +2632,11 @@ static int __s390_unshare_zeropages(struct mm_struct *mm)
>                 addr = vma->vm_start;
>  retry:
> -               rc = walk_page_range_vma(vma, addr, vma->vm_end,
> -                                        &find_zeropage_ops, &addr);
> -               if (rc <= 0)
> +               zero_page = walk_page_range_vma(vma, addr, vma->vm_end,
> +                                               &find_zeropage_ops, &addr);
> +               if (zero_page < 0)
> +                       return zero_page;
> +               else if (!zero_page)
>                         continue;
>                 /* addr was updated by find_zeropage_pte_entry() */
> @@ -2656,7 +2659,7 @@ static int __s390_unshare_zeropages(struct mm_struct *mm)
>                 goto retry;
>         }
> -       return rc;
> +       return 0;
>  }
>  static int __s390_disable_cow_sharing(struct mm_struct *mm)

...
> > > +		/* addr was updated by find_zeropage_pte_entry() */
> > > +		rc = handle_mm_fault(vma, addr,
> > > +				     FAULT_FLAG_UNSHARE | FAULT_FLAG_REMOTE,
> > > +				     NULL);
> > > +		if (rc & VM_FAULT_OOM)
> > > +			return -ENOMEM;
> >
> > Heiko pointed out that rc type is inconsistent vs vm_fault_t returned by
>
> Right, let's use another variable for that.
>
> > handle_mm_fault(). While fixing it up, I've got concerned whether is it
> > fine to continue in case any other error is met (including possible future
> > VM_FAULT_xxxx)?
>
> Such future changes would similarly break break_ksm(). Staring at it, I do wonder
> if break_ksm() should be handling VM_FAULT_HWPOISON ... very likely we should
> handle it and fail -- we might get an MC while copying from the source page.
>
> VM_FAULT_HWPOISON on the shared zeropage would imply a lot of trouble, so
> I'm not concerned about that for the case here, but handling it in the future
> would be cleaner.
>
> Note that we always retry the lookup, so we won't just skip a zeropage on unexpected
> errors.
>
> We could piggy-back on vm_fault_to_errno(). We could use
> vm_fault_to_errno(rc, FOLL_HWPOISON), and only continue (retry) if the rc is 0 or
> -EFAULT, otherwise fail with the returned error.
>
> But I'd do that as a follow up, and also use it in break_ksm() in the same fashion.

@Christian, do you agree with this suggestion?

Thanks!
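
For reference, the vm_fault_to_errno() follow-up quoted above could look
roughly like the sketch below. It is a userspace mock for illustration only,
not kernel code and not compiled against the tree: the VM_FAULT_* and
FOLL_HWPOISON values are stand-ins, vm_fault_to_errno() is a simplified model
of the kernel helper, and unshare_fault_to_rc() is a hypothetical name that
does not exist anywhere.

```c
#include <assert.h>
#include <errno.h>

/* Userspace stand-ins; the bit values are placeholders, not taken from the kernel headers. */
typedef unsigned int vm_fault_t;
#define VM_FAULT_OOM		0x01u
#define VM_FAULT_SIGBUS		0x02u
#define VM_FAULT_HWPOISON	0x10u
#define VM_FAULT_SIGSEGV	0x40u
#define FOLL_HWPOISON		0x01

#ifndef EHWPOISON
#define EHWPOISON 133	/* fallback for non-Linux libcs; assumption */
#endif

/* Simplified model of the kernel's vm_fault_to_errno(). */
static inline int vm_fault_to_errno(vm_fault_t fault, int foll_flags)
{
	if (fault & VM_FAULT_OOM)
		return -ENOMEM;
	if (fault & VM_FAULT_HWPOISON)
		return (foll_flags & FOLL_HWPOISON) ? -EHWPOISON : -EFAULT;
	if (fault & (VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV))
		return -EFAULT;
	return 0;
}

/*
 * Hypothetical helper: map the handle_mm_fault() result to the loop's
 * decision. Returns 0 when the lookup should simply be retried (no error,
 * or -EFAULT), otherwise the negative errno to fail with.
 */
static inline int unshare_fault_to_rc(vm_fault_t fault)
{
	int rc = vm_fault_to_errno(fault, FOLL_HWPOISON);

	assert(rc <= 0);	/* the model only yields 0 or a negative errno */
	if (rc == -EFAULT)
		return 0;
	return rc;
}
```

In the loop, the idea would be to store the handle_mm_fault() result in
"fault", compute rc from it as above, return rc if it is non-zero and
"goto retry" otherwise, so that rc stays a plain errno throughout.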