On Fri 12-08-16 15:21:41, Oleg Nesterov wrote: [...] > Whats really interesting is that I still fail to understand do we really > need this hack, iiuc you are not sure too, and Michael didn't bother to > explain why a bogus zero from anon memory is worse than other problems > caused by SIGKKILL from oom-kill.c. OK, so I've extended the changelog to clarify this some more, hopefully. " vhost driver relies on copy_from_user/get_user from a kernel thread. This makes it impossible to reap the memory of an oom victim which shares the mm with the vhost kernel thread because it could see a zero page unexpectedly and theoretically make an incorrect decision visible outside of the killed task context. To quote Michael S. Tsirkin: : Getting an error from __get_user and friends is handled gracefully. : Getting zero instead of a real value will cause userspace : memory corruption. The vhost kernel thread is bound to an open fd of the vhost device which is not tight to the mm owner life cycle in theory. The fd can be inherited or passed over to another process which means that we really have to be careful about unexpected memory corruption because unlike for normal oom victims the result will be visible outside of the oom victim context. Make sure that each place which can read from userspace is annotated properly and it uses copy_from_user_mm, __get_user_mm resp. copy_from_iter_mm. Each will get the target mm as an argument and it performs a pessimistic check to rule out that the oom_reaper could possibly unmap the particular page. __oom_reap_task then just needs to mark the mm as unstable before it unmaps any page. An alternative approach would require to hook into the page fault path and trigger EFAULT path from there but I do not like to add any code to all users while there is a single use_mm() consumer which suffers from this problem. This is a preparatory patch without any functional changes because the oom reaper doesn't touch mm shared with kthreads yet. " Does it help? Are there any other concerns or I can repost the series and ask Andrew to pick it for mmotm tree? -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>