Hugh Dickins <hughd@xxxxxxxxxx> writes:

> On Tue, 28 Feb 2023, Huang, Ying wrote:
>> Hugh Dickins <hughd@xxxxxxxxxx> writes:
>> > On Fri, 24 Feb 2023, Huang Ying wrote:
>> >> @@ -1247,7 +1236,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>> >>  	/* Establish migration ptes */
>> >>  	VM_BUG_ON_FOLIO(folio_test_anon(src) &&
>> >>  		       !folio_test_ksm(src) && !anon_vma, src);
>> >> -	try_to_migrate(src, TTU_BATCH_FLUSH);
>> >> +	try_to_migrate(src, mode == MIGRATE_ASYNC ? TTU_BATCH_FLUSH : 0);
>> >
>> > Why that change, I wonder?  The TTU_BATCH_FLUSH can still be useful for
>> > gathering multiple cross-CPU TLB flushes into one, even when it's only
>> > a single page in the batch.
>>
>> Firstly, I would have thought that we have no opportunities to batch the
>> TLB flushing now.  But as you pointed out, it is still possible to batch
>> if mapcount > 1.  Secondly, without TTU_BATCH_FLUSH, we may flush the
>> TLB for a single page (with the invlpg instruction); otherwise, we will
>> flush the TLB for all pages.  The former is faster and will not
>> influence other TLB entries of the process.
>>
>> Or we use TTU_BATCH_FLUSH only if mapcount > 1?
>
> I had not thought at all of the "invlpg" advantage (which I imagine
> some other architectures than x86 share) to not delaying the TLB flush
> of a single PTE.
>
> Frankly, I just don't have any feeling for the tradeoff between
> multiple remote invlpgs versus one remote batched TLB flush of all.
> Which presumably depends on number of CPUs, size of TLBs, etc etc.
>
> Your "mapcount > 1" idea might be good, but I cannot tell: I'd say
> now that there's no reason to change your "mode == MIGRATE_ASYNC ?
> TTU_BATCH_FLUSH : 0" without much more thought, or a quick insight
> from someone else.  Some other time maybe.

Yes.  I think that this is reasonable.  We can revisit this later.

Best Regards,
Huang, Ying
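
To make the "mapcount > 1" alternative discussed above concrete, a rough
sketch of how that call site might look; this is not part of the patch,
it assumes folio_mapcount() is the appropriate way to read the map count
at this point, and it leaves open the question raised in the thread of
whether the MIGRATE_ASYNC condition should also be kept:

	/*
	 * Hypothetical variant, sketched only for discussion: defer and
	 * batch the TLB flush when the folio is mapped by more than one
	 * PTE, but let a singly mapped page be flushed immediately with
	 * a single-page (invlpg-style) flush.
	 */
	try_to_migrate(src, folio_mapcount(src) > 1 ? TTU_BATCH_FLUSH : 0);

Whether this wins over always batching (or over batching only in
MIGRATE_ASYNC mode) presumably depends on the same factors noted above:
number of CPUs, TLB sizes, and how many remote flushes end up being
coalesced.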