"Yin, Fengwei" <fengwei.yin@xxxxxxxxx> writes: > On 11/13/2023 10:02 AM, Huang, Ying wrote: >>>> There are other places in the kernel where the PTE is cleared, for >>>> example, move_ptes() in mremap.c. IIUC, we need to audit all them. >>>> >>>> Another possible solution is to check PTE again with PTL held before >>>> reading in file data. This will increase the overhead of major fault >>>> path. Is it acceptable? >>> What if we check the PTE without page table lock acquired? >> The PTE is zeroed temporarily only with PTL held. So, if we acquire the >> PTL in filemap_fault() and check the PTE, the PTE which is zeroed in >> do_numa_page() will be non-zero now. So we can avoid the major fault. > Yes. > >> >> But, if we don't acquire the PTL, the PTE may still be zero. > For do_numa_page()/change_pte_range(), it does very limit thing during > PTE is cleared. Considering the code path of do_read_fault(), it's likely > the PTE is none-zero. It's possible per my understanding, although it doesn't feel good to depend on some "race" condition. > My concern to acquiring lock is that it brings extra PTL lock acquire/release > for other more common cases. Yes. It will bring some overhead to acquire the PTL. Anyway, some performance test is needed to compare the solution. -- Best Regards, Huang, Ying