On Tue 24-03-20 11:47:20, Pingfan Liu wrote: > On Mon, Mar 23, 2020 at 3:34 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote: > > > > On Sun 22-03-20 21:57:07, Pingfan Liu wrote: > > > For zone_device, migration can only happen on is_device_private_page(page). > > > Correct the logic in try_to_unmap_one(). > > > > Maybe it is just me lacking knowledge in the zone_device ZOO. But > > this really deserves a much more detailed explanation IMHO. It seems > > a5430dda8a3a ("mm/migrate: support un-addressable ZONE_DEVICE page in > > migration") deliberately made the decision to allow unmapping these > > pages? Is the check just wrong, inncomplete? Why? > I am not quite sure about zone_device, but I will try to explain it later. > > But first of all, I think the code conflicts with the logic behind it. > If try_to_unmap_one() success to unmap a page, then it should kill the > pte, and return true. But the original code return true before the > code like "ptep_clear_flush()" > > Now, I try to say about !device_private zone device. (Please pardon > and correct me if I make a mistake) > memmap_init_zone_device() raises an extra _refcount on all zone > device. And private-device should lifts the count later, otherwise it > can not migrate. But I did not find the exact place yet. > > While this extra _refcount will block migration, it is not the whole > reason if a zone device page is mapped. > > If a zone device page is mapped, then I think the original code > happen to work due to it skip the call of page_remove_rmap(), and in > try_to_unmap(){ return !page_mapcount(page) ? true : false;}. OK, you made me look more closely. The lack of documentation and therefore the expected semantic doesn't really help. So we can only deduce from the existing code which is a recipe for cargo cult programming :/ The only difference betweena rmap_one returning true and false is that the VMA walk stops for the later and done() callback is not called. Does rmap_one success means the mapping for the vma has been torn down? No. As we can see for the munlock case which just shows hot vague the semantic of this callback might be. I believe the zone device path was just copying it. Is that wrong? Well, it is less optimal than necessary because the property we are checking is not VMA specific so all other VMAs (if there are any at all) will have the same to say. So the only last remaining point is the done() callback. That one is documented as a check. There is no note about potential side effects but the current implementation is really only a check so skipping it doesn't make any real difference. > > What is the real user visible problem here? > As explained, the original code happens to work, but it conflicts with > the logic. Your changelog should be explicit about this being a pure code refinement/cleanup without any functional changes. The rmap walk and callbacks would benefit from a proper documentation. Hint... -- Michal Hocko SUSE Labs