On Wed, 24 Jun 2020 16:34:12 -0400
Peter Xu <peterx@xxxxxxxxxx> wrote:

> On Wed, Jun 24, 2020 at 08:49:03PM +0200, Gerald Schaefer wrote:
> > On Fri, 19 Jun 2020 12:05:13 -0400
> > Peter Xu <peterx@xxxxxxxxxx> wrote:
> >
> > [...]
> >
> > > @@ -4393,6 +4425,38 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
> > >  			mem_cgroup_oom_synchronize(false);
> > >  	}
> > >
> > > +	if (ret & VM_FAULT_RETRY)
> > > +		return ret;
> >
> > I'm wondering if this also needs a check and exit for VM_FAULT_ERROR.
> > In arch code (s390 and all others I briefly checked), the accounting
> > was skipped for VM_FAULT_ERROR case.
>
> Yes.  I didn't explicitly add the check because I thought it's still OK to count
> the error cases, especially after we've discussed about
> PERF_COUNT_SW_PAGE_FAULTS in v1.  So far, the major reason (iiuc) to have
> PERF_COUNT_SW_PAGE_FAULTS still in per-arch handlers is to also cover these
> corner cases like VM_FAULT_ERROR.  So to me it makes sense too to also count
> them in here.  But I agree it changes the old counting on most archs.

Having PERF_COUNT_SW_PAGE_FAULTS count everything, including VM_FAULT_ERROR,
is OK. Only the major/minor accounting should be restricted to successes,
IIRC from the v1 discussion.

The "new rules" also say

+	 *  - faults that never even got here (because the address
+	 *    wasn't valid). That includes arch_vma_access_permitted()
+	 *    failing above.

VM_FAULT_ERROR, and also the arch-specific VM_FAULT_BADxxx, qualify
as "address wasn't valid" I think, so they should not be counted as
major/minor.

IIRC from v1, we want to count only successes as major/minor, so maybe
the rule could also be made clearer about that, e.g. like

+	 *  - unsuccessful faults (because the address wasn't valid)
+	 *    do not count. That includes arch_vma_access_permitted()
+	 *    failing above.

>
> Again, I don't have strong opinion either on this, just like the same to
> PERF_COUNT_SW_PAGE_FAULTS...  But if no one disagree, I will change this to:
>
>   if (ret & (VM_FAULT_RETRY | VM_FAULT_ERROR))
>       return ret;
>
> So we try our best to follow the past.

Sounds good to me, and VM_FAULT_BADxxx should never show up here.

>
> Btw, note that there will still be some even more special corner cases.  E.g.,
> for ARM64 it's also not accounted for some ARM64 specific fault errors
> (VM_FAULT_BADMAP, VM_FAULT_BADACCESS).  So even if we don't count
> VM_FAULT_ERROR, we might still count these for ARM64.  We can try to redefine
> VM_FAULT_ERROR in ARM64 to cover all the arch-specific errors, however that
> seems an overkill to me sololy for fault accountings, so hopefully I can ignore
> that difference.

Hmm, arm64 already does not count the VM_FAULT_BADxxx cases, and it also
does not call handle_mm_fault() for them, so nothing changes there with
this patch. arm (and also unicore32) do count them, but also do not call
handle_mm_fault() for them, so they would lose that accounting with this
patch, IIUC.

I agree that this can probably be ignored. The code in arm64 also looks
more recent, so it's probably just a left-over in the arm/unicore32 code.

Regards,
Gerald
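
P.S. For illustration only, the rule discussed above (skip major/minor
accounting for retried and failed faults) can be sketched in plain C
outside the kernel. This is a minimal sketch, not the patch itself: the
sketch_* names, the counter struct and the bit values are all
hypothetical stand-ins, and the real vm_fault_t bits and the real
accounting in handle_mm_fault() are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the kernel's vm_fault_t bits. The values
 * are made up for this sketch and do not match the kernel headers.
 */
typedef unsigned int sketch_fault_t;

#define SKETCH_FAULT_MAJOR 0x01u /* the fault needed I/O */
#define SKETCH_FAULT_RETRY 0x02u /* the caller will retry the fault */
#define SKETCH_FAULT_ERROR 0x04u /* stands in for the VM_FAULT_ERROR mask */

/* Hypothetical per-task counters, loosely modelled on maj_flt/min_flt. */
struct sketch_counters {
	unsigned long maj_flt;
	unsigned long min_flt;
};

/*
 * Account one fault result. Returns true if it was counted as a major
 * or minor fault, false if it was skipped.
 */
static bool sketch_account_fault(struct sketch_counters *c, sketch_fault_t ret)
{
	assert(c != NULL);

	/* Retried and unsuccessful faults are not major/minor faults. */
	if (ret & (SKETCH_FAULT_RETRY | SKETCH_FAULT_ERROR))
		return false;

	/* Only completed, successful faults reach the counters. */
	if (ret & SKETCH_FAULT_MAJOR)
		c->maj_flt++;
	else
		c->min_flt++;

	return true;
}
```

Calling sketch_account_fault() with an error or retry result leaves both
counters untouched, which is the "follow the past" behaviour the arch
handlers had. Counting of PERF_COUNT_SW_PAGE_FAULTS is deliberately left
out of the sketch, since that event is meant to cover the error cases
too.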