Hi, Amit, Thank you very much for review! Nadav Amit <namit@xxxxxxxxxx> writes: >> On Apr 10, 2023, at 12:52 AM, Huang Ying <ying.huang@xxxxxxxxx> wrote: >> >> 0Day/LKP reported a performance regression for commit >> 7e12beb8ca2a ("migrate_pages: batch flushing TLB"). In the commit, the >> TLB flushing during page migration is batched. So, in >> try_to_migrate_one(), ptep_clear_flush() is replaced with >> set_tlb_ubc_flush_pending(). In further investigation, it is found >> that the TLB flushing can be avoided in ptep_clear_flush() if the PTE >> is inaccessible. In fact, we can optimize in similar way for the >> batched TLB flushing too to improve the performance. >> >> So in this patch, we check pte_accessible() before >> set_tlb_ubc_flush_pending() in try_to_unmap/migrate_one(). Tests show >> that the benchmark score of the anon-cow-rand-mt test case of >> vm-scalability test suite can improve up to 2.1% with the patch on a >> Intel server machine. The TLB flushing IPI can reduce up to 44.3%. > > LGTM. Thanks! > I know it’s meaningless for x86 (but perhaps ARM would use this infra > too): do we need smp_mb__after_atomic() after ptep_get_and_clear() and > before pte_accessible()? Why do we need the memory barrier? IIUC, the PTL is locked, so PTE value will not be changed under us. Anything else? > In addition, if this goes into stable (based on the Fixes tag), consider > breaking it into 2 patches, when only one would be backported. The fixed commit (7e12beb8ca2a ("migrate_pages: batch flushing TLB")) is merged by v6.3-rc1. So this patch will only be backported to v6.3 and later. Is it OK? Best Regards, Huang, Ying