On Fri, Aug 26, 2022 at 10:34:15AM +1000, Alistair Popple wrote: > > Peter Xu <peterx@xxxxxxxxxx> writes: > > > On Thu, Aug 25, 2022 at 11:49:05AM +1000, Alistair Popple wrote: > >> Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed > >> the way trylock_page() in migrate_vma_collect_pmd() works without > >> updating the comment. Reword the comment to be less misleading and a > >> better reflection of what happens. > >> > >> Signed-off-by: Alistair Popple <apopple@xxxxxxxxxx> > >> Reported-by: Peter Xu <peterx@xxxxxxxxxx> > >> Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") > >> --- > >> mm/migrate_device.c | 8 +++++--- > >> 1 file changed, 5 insertions(+), 3 deletions(-) > >> > >> diff --git a/mm/migrate_device.c b/mm/migrate_device.c > >> index 5052093d0262..0736f846de0b 100644 > >> --- a/mm/migrate_device.c > >> +++ b/mm/migrate_device.c > >> @@ -179,9 +179,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > >> get_page(page); > >> > >> /* > >> - * Optimize for the common case where page is only mapped once > >> - * in one process. If we can lock the page, then we can safely > >> - * set up a special migration page table entry now. > >> + * If we can't lock the page we can't migrate it. If we can it's > >> + * safe to set up a migration entry now. In the common case > >> + * where the page is mapped once in a single process setting up > >> + * the migration entry now is an optimisation to avoid walking > >> + * the rmap later with try_to_migrate(). > >> */ > > > > IMHO the last sentence can still be a bit confusing - here we 100% rely on > > the trylock() to proceed or we'll stop migration right away. IMHO that > > means this is not an optimization, since optimizations should always be > > optional but not the case here. > > We have to lock the page here, we don't have to install the migration > entries. Installing the migration entries here is optional and is the > optimisation. I see what you mean now. > > > Meanwhile it'll be great to also mention about why trylock is needed and no > > further attempt to use lock_page(). The comment in prepare() previously > > was great but unfortunately that code clip was removed. > > Will add. > > > In short, do you think something like this might be clearer? > > I think it's important to mention the optimisation, otherwise the > temptation is to remove the installation of migration entries here and > rely on try_to_migrate() to do it later. I would actually like to be > able to do that because it simplifies the code in many ways but based on > my testing the optimisation turns out to be very worth while. > > > /* > > * We rely on the trylock() to migrate the pte. If this > > * fails, we'll fail the migration of this page. IOW, the > > * migration is very much best-effort, just like we'll also > > * bail out if we found page pinned by other users after > > * page being locked. > > Honestly I think this describes what the code does rather than why and > is likely to become outdated and confusing. IMHO it's quite clear from > the code that the migration will fail here if we can't lock the page. If you see that's what I was struggling to understand previously, so not clear at least to me. :) Since normally a function like page migration should (from the gut feeling) not rely on trylock only. > > And the fact that migration can fail is already covered in the existing > documentation. See for example the extensive comment on > migrate_vma_setup() about how the whole flow works. Eg: > > * Once migrate_vma_pages() returns the caller may inspect which pages were > * successfully migrated, and which were not. Successfully migrated pages will > * have the MIGRATE_PFN_MIGRATE flag set for their src array entry. IIUC trylock failure could be detected even earlier than that. E.g. the ppc code has: ret = migrate_vma_setup(&mig); if (ret) return -1; spage = migrate_pfn_to_page(*mig.src); if (!spage || !(*mig.src & MIGRATE_PFN_MIGRATE)) goto out_finalize; So afaict it won't even reach migrate_vma_pages() by missing of the flag MIGRATE_PFN_VALID in the src mpfn. Probably because the migrate_vma_*() API is extensive enough so it's hard to summarize its usage in a short paragraph. I'm probably asking too much.. No strong opinion on the rest as long as we can add back some explanation on why lock_page() is not used. Thanks again for doing this. -- Peter Xu