On Sun, Jul 03, 2016 at 05:18:29PM +0200, Oleg Nesterov wrote:
> On 07/03, Michael S. Tsirkin wrote:
> >
> > On Sun, Jul 03, 2016 at 03:47:19PM +0200, Oleg Nesterov wrote:
> > > On 07/01, Michal Hocko wrote:
> > > >
> > > > From: Michal Hocko <mhocko@xxxxxxxx>
> > > >
> > > > The vhost driver relies on copy_from_user/get_user from a kernel
> > > > thread. This makes it impossible to reap the memory of an oom victim
> > > > which shares its mm with the vhost kernel thread, because the thread
> > > > could unexpectedly see a zero page and, in theory, make an incorrect
> > > > decision visible outside of the killed task's context.
> > >
> > > And I still can't understand how, but let me repeat that I don't
> > > understand this code at all.
> > >
> > > > To quote Michael S. Tsirkin:
> > > > : Getting an error from __get_user and friends is handled gracefully.
> > > > : Getting zero instead of a real value will cause userspace
> > > > : memory corruption.
> > >
> > > Which userspace memory corruption? We are going to kill the dev->mm
> > > owner, the task which did ioctl(VHOST_SET_OWNER), and (at first glance)
> > > the task which communicates with the callbacks fired by vhost_worker().
> > >
> > > Michael, could you please spell out why we should care?
> >
> > I am concerned that
> > - the oom victim is sharing memory with another task
> > - getting an incorrect value from a ring read makes vhost
> >   change that shared memory
>
> Well, we are going to kill all tasks which share this memory, i.e. this
> ->mm. If "sharing memory with another task" means, say, a file, then
> this memory won't be unmapped (if shared).
>
> So let me ask again... Suppose, say, QEMU does VHOST_SET_OWNER and then
> we unmap its (anonymous/non-shared) memory. Whose memory, other than
> QEMU's, can be corrupted?

As you say, I mean anyone who shares memory with QEMU through a file.
IIUC the current users that do this are all stateless, so even if they
crash it is not a big deal, but it seems wrong to assume it will stay
that way forever.
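The quoted concern (a __get_user error is handled, a silently zeroed read
is not) can be modeled with a small userspace sketch. Every name here
(sim_get_user, fetch_head, SIM_EFAULT, ...) is invented for illustration;
this is not the vhost code:

```c
/* Userspace model of the concern above; all names are made up.
 * SIM_EFAULT stands in for -EFAULT. */
#define SIM_EFAULT 14

/* Simulated guest memory: the ring head index the worker reads. */
static unsigned short guest_avail_head = 7;
static int page_reaped;          /* oom reaper unmapped the page */

/* Models __get_user(): 0 on success, -SIM_EFAULT on a fault.
 * A reaped anonymous page does not fault; it reads back as zeros. */
static int sim_get_user(unsigned short *dst, int fault_instead)
{
    if (fault_instead)
        return -SIM_EFAULT;
    *dst = page_reaped ? 0 : guest_avail_head;
    return 0;
}

/* vhost-style consumer: an error is handled gracefully, but a value
 * that arrives without an error is trusted as a real ring index. */
static int fetch_head(int fault_instead, unsigned short *head)
{
    int err = sim_get_user(head, fault_instead);
    if (err)
        return err;              /* graceful: stop processing the ring */
    return 0;                    /* *head may silently be a bogus 0 */
}
```

The point of the sketch: the fault path is distinguishable by the caller,
but the zero-filled read is not, so the worker would act on index 0 as if
the guest had actually written it.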
> Sorry, I simply do not know what vhost does, quite possibly a stupid
> question.
>
> > Having said all that, how about we just add some kind of per-mm
> > notifier list, and let vhost know that the owner is going away so
> > it should stop looking at its memory?
> >
> > That seems cleaner than looking at flags on each memory access,
> > since vhost has its own locking.
>
> Agreed... although of course I do not understand how this should work.

Add a linked list of callbacks to struct mm_struct; vhost would add
itself there. In the callback, set private_data to NULL for all vqs,
under the vq mutex.

> But looks better in any case..
>
> Or perhaps we can change oom_kill_process() to send SIGKILL to kthreads
> as well. This should not have any effect unless a kthread does
> allow_signal(SIGKILL); then we can change vhost_worker() to catch
> SIGKILL and react somehow. Not sure this is really possible.
>
> Oleg.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx. For more info on Linux MM,
see: http://www.linux-mm.org/ .
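P.S. A very rough userspace model of the per-mm callback-list idea
discussed above. Every name here (fake_mm, mm_notifier, vhost_mm_release,
mm_notify_exit, ...) is invented for this sketch; it is not a proposed
kernel API, and the real version would take vq->mutex in the callback:

```c
#include <stddef.h>

/* Sketch of the proposed per-mm "owner is exiting" notifier list. */
struct mm_notifier {
    void (*release)(struct mm_notifier *n);
    struct mm_notifier *next;
};

/* The list would hang off struct mm_struct; modeled standalone here. */
struct fake_mm {
    struct mm_notifier *notifiers;
};

static void mm_register_notifier(struct fake_mm *mm, struct mm_notifier *n)
{
    n->next = mm->notifiers;
    mm->notifiers = n;
}

/* Would be called by the oom/exit path before the mm is reaped. */
static void mm_notify_exit(struct fake_mm *mm)
{
    struct mm_notifier *n;

    for (n = mm->notifiers; n; n = n->next)
        n->release(n);
}

/* vhost side: one virtqueue for the sketch. */
struct vhost_vq {
    void *private_data;          /* non-NULL while the vq is live */
    struct mm_notifier notifier;
};

static void vhost_mm_release(struct mm_notifier *n)
{
    /* container_of() in kernel style, open-coded for the sketch */
    struct vhost_vq *vq = (struct vhost_vq *)
        ((char *)n - offsetof(struct vhost_vq, notifier));

    /* Under vq->mutex in the real thing: stop touching guest memory. */
    vq->private_data = NULL;
}
```

Registration happens at VHOST_SET_OWNER time; once the callback fires,
the worker sees private_data == NULL and never touches the mm again, so
there is no per-access flag check.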