On 12/4/19 11:01 AM, Felix Abecassis wrote:
> Hello all,
>

Hi Felix,

Thanks for writing up a very clear description of the problem.

> On kernel 5.3, when using the move_pages syscall (wrapped by libnuma) and all
> pages happen to be on the right node already, this function returns 0 but the
> "status" array is not updated. This array potentially contains garbage values
> (e.g. from malloc(3)), and I don't see a way to detect this.

The way to detect this case would be to zero the array before calling
move_pages(). Then, if move_pages() returns 0, and the array remains full of
zeroes, you can conclude that move_pages() "succeeded", and that there were
no errors for any of the pages. So the pages are where you requested them to
end up.

>
> Looking at the kernel code, we are probably exiting do_pages_move here:
> out_flush:
>     if (list_empty(&pagelist))
>         return err;

I looked at that code and the surrounding function, and it's been pretty much
unchanged for quite a while. The above was last touched in April, 2018, for
example.

Yes, we could change the kernel code to fill in the array with zeroes in that
situation, but the man page doesn't actually cover this case at all. We'd have
to also change the man page, to say something like, "if pages were not moved
because they were already in the requested location, then the status array
will contain <SOME_VALUE> for such pages".

In other words, the kernel matches the requirements (the man page) as it
stands today, at least as I'm reading it. And given that one can already
figure all this out with the existing kernel and libnuma behavior, I'm
guessing that the linux-mm folks will not see any reason to make such a
change--but maybe I'm guessing wrong. Anyone on CC want to weigh in there?

thanks,
-- 
John Hubbard
NVIDIA
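
For reference, the zero-then-check approach described above might look roughly
like the following C sketch. The helper names status_all_zero() and
move_pages_zeroed() are made up for illustration and are not part of libnuma
or the kernel. The raw syscall(2) invocation is an assumption made only to
keep the sketch free of a libnuma dependency; code that links against libnuma
would call move_pages() from <numaif.h> instead.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes syscall() in <unistd.h> */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if every entry of status[] is still 0,
 * and 0 as soon as a non-zero entry is found.
 */
int status_all_zero(const int *status, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (status[i] != 0)
			return 0;
	}
	return 1;
}

/*
 * Hypothetical wrapper: clear status[] first, so that entries the kernel
 * never writes read back as 0 rather than as leftover malloc(3) garbage,
 * then issue the move_pages system call. Returns whatever the system call
 * returns (-1 with errno set on failure).
 */
long move_pages_zeroed(int pid, unsigned long count, void **pages,
		       const int *nodes, int *status, int flags)
{
	assert(status != NULL);
	memset(status, 0, count * sizeof(*status));
	return syscall(SYS_move_pages, pid, count, pages, nodes, status, flags);
}
```

A caller would invoke move_pages_zeroed() in place of move_pages(), and treat
a return value of 0 combined with status_all_zero() returning 1 as the "no
errors for any of the pages" case discussed above.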