Re: [RFC PATCH 5/6] vhost, mm: make sure that oom_reaper doesn't reap memory read by vhost

Oleg Nesterov <oleg@xxxxxxxxxx> · Sun, 3 Jul 2016 17:18:29 +0200

On 07/03, Michael S. Tsirkin wrote:
>
> On Sun, Jul 03, 2016 at 03:47:19PM +0200, Oleg Nesterov wrote:
> > On 07/01, Michal Hocko wrote:
> > >
> > > From: Michal Hocko <mhocko@xxxxxxxx>
> > >
> > > vhost driver relies on copy_from_user/get_user from a kernel thread.
> > > This makes it impossible to reap the memory of an oom victim which
> > > shares mm with the vhost kernel thread because it could see a zero
> > > page unexpectedly and theoretically make an incorrect decision visible
> > > outside of the killed task context.
> >
> > And I still can't understand how, but let me repeat that I don't understand
> > this code at all.
> >
> > > To quote Michael S. Tsirkin:
> > > : Getting an error from __get_user and friends is handled gracefully.
> > > : Getting zero instead of a real value will cause userspace
> > > : memory corruption.
> >
> > Which userspace memory corruption? We are going to kill the dev->mm owner,
> > the task which did ioctl(VHOST_SET_OWNER) and (at first glance) the task
> > who communicates with the callbacks fired by vhost_worker().
> >
> > Michael, could you please spell out why we should care?
>
> I am concerned that
> - oom victim is sharing memory with another task
> - getting incorrect value from ring read makes vhost
>   change that shared memory

Well, we are going to kill all tasks which share this memory, I mean ->mm.
If "sharing memory with another task" means, say, a file mapping, then this
memory won't be unmapped (if the mapping is shared).

So let me ask again... Suppose, say, QEMU does VHOST_SET_OWNER and then we
unmap its (anonymous/non-shared) memory. Who else's memory can be corrupted?

Sorry, I simply do not know what vhost does; quite possibly this is a stupid
question.

> Having said all that, how about we just add some kind of per-mm
> notifier list, and let vhost know that owner is going away so
> it should stop looking at memory?
>
> Seems cleaner than looking at flags at each memory access,
> since vhost has its own locking.

Agreed... although of course I do not understand how this should work. But
it looks better in any case.
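
To make the per-mm notifier idea a bit more concrete, here is a rough
userspace model, not kernel code. Every name in it (model_mm,
mm_exit_notifier, model_vhost_dev, vhost_read_ring, ...) is made up for
illustration and none of them exists in the kernel. It only shows the shape:
the device registers a callback on the owner's mm, the OOM path walks the
list before the memory goes away, and the ring read fails instead of
returning a bogus zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one callback fired when the mm owner is going away. */
struct mm_exit_notifier {
	void (*owner_dying)(struct mm_exit_notifier *n);
	struct mm_exit_notifier *next;
};

/* Hypothetical stand-in for the owner's mm, holding the notifier list. */
struct model_mm {
	struct mm_exit_notifier *notifiers;
};

/* Hypothetical stand-in for a vhost device attached to that mm. */
struct model_vhost_dev {
	bool owner_gone;
	struct mm_exit_notifier notifier;
};

void mm_register_notifier(struct model_mm *mm, struct mm_exit_notifier *n)
{
	assert(n->owner_dying != NULL);
	n->next = mm->notifiers;
	mm->notifiers = n;
}

/* Would be called by the OOM path before the memory is reaped. */
void mm_notify_owner_dying(struct model_mm *mm)
{
	for (struct mm_exit_notifier *n = mm->notifiers; n; n = n->next)
		n->owner_dying(n);
}

/* Callback: recover the device from the embedded notifier and flag it. */
void vhost_owner_dying(struct mm_exit_notifier *n)
{
	struct model_vhost_dev *dev = (struct model_vhost_dev *)
		((char *)n - offsetof(struct model_vhost_dev, notifier));

	dev->owner_gone = true;
}

void vhost_dev_init(struct model_vhost_dev *dev, struct model_mm *mm)
{
	dev->owner_gone = false;
	dev->notifier.owner_dying = vhost_owner_dying;
	mm_register_notifier(mm, &dev->notifier);
}

/* Returns 0 and stores the value, or -1 once the owner is going away. */
int vhost_read_ring(const struct model_vhost_dev *dev, const int *slot,
		    int *out)
{
	if (dev->owner_gone)
		return -1;	/* an error is handled gracefully, a zero is not */
	*out = *slot;
	return 0;
}
```

A real version would also have to serialize the callback against an
in-flight ring access under vhost's own locking, which this model ignores.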

Or perhaps we can change oom_kill_process() to send SIGKILL to kthreads as
well. This should not have any effect unless the kthread does
allow_signal(SIGKILL); then we can change vhost_worker() to catch SIGKILL and
react somehow. Not sure this is really possible.
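
Roughly what I mean, as a toy userspace model rather than kernel code. All
the model_* names are invented for this sketch; it only mimics the rule that
a signal sent to a kthread is dropped unless the thread opted in, and that an
opted-in worker can notice the pending signal and stop doing work.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_SIGKILL 9

/* Hypothetical model of a kernel thread's signal state. */
struct model_kthread {
	unsigned long allowed;	/* signals the thread opted in to */
	unsigned long pending;	/* signals queued and not yet handled */
	int work_done;		/* work items processed so far */
};

/* Models the opt-in: without it, signals sent to the thread are dropped. */
void model_allow_signal(struct model_kthread *t, int sig)
{
	assert(sig > 0 && sig < 32);
	t->allowed |= 1UL << sig;
}

/* Returns true if the signal was queued, false if it was ignored. */
bool model_send_sig(struct model_kthread *t, int sig)
{
	assert(sig > 0 && sig < 32);
	if (!(t->allowed & (1UL << sig)))
		return false;
	t->pending |= 1UL << sig;
	return true;
}

bool model_signal_pending(const struct model_kthread *t)
{
	return t->pending != 0;
}

/*
 * One iteration of the worker loop: bail out if a signal is pending,
 * otherwise process one work item. Returns true if work was done.
 */
bool model_worker_step(struct model_kthread *t)
{
	if (model_signal_pending(t))
		return false;	/* react somehow: here, just stop working */
	t->work_done++;
	return true;
}
```

What "react somehow" should actually be in vhost_worker() is exactly the part
I do not know; the model simply stops touching memory.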

Oleg.
