> On Jan 17, 2018, at 10:23 AM, Christopher Lameter <cl@xxxxxxxxx> wrote:
>
> On Tue, 16 Jan 2018, Mel Gorman wrote:
>
>> My main source of discomfort is the fact that this is permanent as two
>> processes perfectly isolated but with a suitably shared COW mapping
>> will never migrate the data. A potential improvement to get the reported
>> bandwidth up in the test program would be to skip the rest of the VMA if
>> page_mapcount != 1 in a COW mapping as it would be reasonable to assume
>> the remaining pages in the VMA are also affected and the scan is wasteful.
>> There are counter-examples to this but I suspect that the full VMA being
>> shared is the common case. Whether you do that or not;
>
> Same concern here. Typically CAP_SYS_NICE will bypass the check that the
> page is only mapped to a single process and the check looks exactly like
> the ones for manual migration. Using CAP_SYS_NICE would be surprising
> here since autonuma is not triggered by the currently running process.
>
> Can we configure this somehow via sysfs?

If I understand the code correctly, CAP_SYS_NICE allows MPOL_MF_MOVE_ALL
to be set with mbind() or used with move_pages(). CAP_SYS_NICE also
causes migrate_pages() to behave as if MPOL_MF_MOVE_ALL were specified.
There are checks requiring either MPOL_MF_MOVE_ALL or
page_mapcount(page) == 1.

The normal case does not call change_prot_numa(). change_prot_numa() is
only called when MPOL_MF_LAZY is specified, and at the moment
MPOL_MF_LAZY is not recognized as a valid flag. It looks to me that as
things stand now, change_prot_numa() is only called from
task_numa_work().

If MPOL_MF_LAZY were allowed and specified, things would not work
correctly. change_pte_range() is unaware of and can’t honor the
difference between MPOL_MF_MOVE_ALL and MPOL_MF_MOVE.

For the case of auto numa balancing, it may be undesirable for shared
pages to be migrated whether they are also copy-on-write or not.
The copy-on-write test was added to restrict the effect of the patch to
the specific situation we observed. Perhaps I should remove it. I don’t
understand why it would be desirable to modify the behavior via sysfs.

Thanks,
Henry