Peter Xu <peterx@xxxxxxxxxx> writes: [...] >> >>> Meanwhile it'll be great to also mention about why trylock is needed and no >> >>> further attempt to use lock_page(). The comment in prepare() previously >> >>> was great but unfortunately that code clip was removed. >> >> >> >> Will add. >> >> >> >>> In short, do you think something like this might be clearer? >> >> >> >> I think it's important to mention the optimisation, otherwise the >> >> temptation is to remove the installation of migration entries here and >> >> rely on try_to_migrate() to do it later. I would actually like to be >> >> able to do that because it simplifies the code in many ways but based on >> >> my testing the optimisation turns out to be very worth while. >> >> >> >>> /* >> >>> * We rely on the trylock() to migrate the pte. If this >> >>> * fails, we'll fail the migration of this page. IOW, the >> >>> * migration is very much best-effort, just like we'll also >> >>> * bail out if we found page pinned by other users after >> >>> * page being locked. >> >> >> >> Honestly I think this describes what the code does rather than why and >> >> is likely to become outdated and confusing. IMHO it's quite clear from >> >> the code that the migration will fail here if we can't lock the page. >> > >> > If you see that's what I was struggling to understand previously, so not >> > clear at least to me. :) Since normally a function like page migration >> > should (from the gut feeling) not rely on trylock only. >> >> IIRC, ordinary page migration will also only trylock. See >> isolate_movable_page(). > > Fair enough. > > I think it depends on what we want to do with the migration. I think not > even all users of isolate_movable_page() only depend on trylock, and it can > retry with lock_page later. E.g.: > > - soft_offline_in_use_page > - isolate_page > - isolate_movable_page <----- trylock here > - (if isolate_page fails...) migrate_pages > - unmap_and_move <----- lock_page here after 2 unsuccessful rounds > > And I also agree migration may always fail (e.g. by spurious page refcounts > being taken), so feel free to ignore the "gut feeling" - that's indeed kind > of subjective. So how about the following: /* * We rely on trylock_page() to avoid deadlock between * concurrent migrations where each is waiting on the others * page lock. If we can't immediately lock the page we fail this * migration as it is only best effort anyway. * * If we can lock the page it's safe to set up a migration entry * now. In the common case where the page is mapped once in a * single process setting up the migration entry now is an * optimisation to avoid walking the rmap later with * try_to_migrate(). */ Thanks. - Alistair