On Mon, Aug 05, 2019 at 12:33:45PM +0800, Jason Wang wrote:
>
> On 2019/8/2 10:03 PM, Michael S. Tsirkin wrote:
> > On Fri, Aug 02, 2019 at 05:40:07PM +0800, Jason Wang wrote:
> > > Btw, I come up another idea, that is to disable preemption when vhost thread
> > > need to access the memory. Then register preempt notifier and if vhost
> > > thread is preempted, we're sure no one will access the memory and can do the
> > > cleanup.
> > Great, more notifiers :(
> >
> > Maybe can live with
> > 1- disable preemption while using the cached pointer
> > 2- teach vhost to recover from memory access failures,
> >    by switching to regular from/to user path
>
>
> I don't get this, I believe we want to recover from regular from/to user
> path, isn't it?

That (disable copy to/from user completely) would be a nice to have
since it would reduce the attack surface of the driver, but e.g. your
code already doesn't do that.

> >
> > So if you want to try that, fine since it's a step in
> > the right direction.
> >
> > But I think fundamentally it's not what we want to do long term.
>
>
> Yes.
>
>
> >
> > It's always been a fundamental problem with this patch series that only
> > metadata is accessed through a direct pointer.
> >
> > The difference in ways you handle metadata and data is what is
> > now coming and messing everything up.
>
>
> I do propose something like this in the past:
> https://www.spinics.net/lists/linux-virtualization/msg36824.html. But looks
> like you have some concern about its locality.

Right and it doesn't go away. You'll need to come up with a test that
messes it up and triggers a worst-case scenario, so we can measure how
bad that worst case is.

> But the problem still there, GUP can do page fault, so still need to
> synchronize it with MMU notifiers.

I think the idea was, if GUP would need a pagefault, don't do a GUP and
do to/from user instead. Hopefully that will fault the page in and the
next access will go through.

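To make the fallback idea concrete, here is a rough userspace sketch. It is not vhost code: every sim_* name is made up for illustration, the "page" is just a struct with a present flag, and the slow path only models what a regular copy from user would do (fault the page in, then copy).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one page of guest memory. */
struct sim_page {
    bool present;            /* false: a direct access would need a page fault */
    unsigned char data[64];
};

/* Hypothetical counters so the path taken is observable. */
struct sim_stats {
    unsigned fast;
    unsigned slow;
};

/* Fast path: succeeds only when no fault would be needed. */
static bool sim_try_direct(const struct sim_page *pg, void *dst, size_t len)
{
    if (!pg->present || len > sizeof(pg->data))
        return false;
    memcpy(dst, pg->data, len);
    return true;
}

/* Slow path: models the regular from-user copy, which may fault the page in. */
static int sim_copy_slow(struct sim_page *pg, void *dst, size_t len)
{
    if (len > sizeof(pg->data))
        return -1;
    pg->present = true;      /* the simulated fault brought the page in */
    memcpy(dst, pg->data, len);
    return 0;
}

/* Try the direct access first; on failure fall back to the slow path. */
static int sim_read(struct sim_page *pg, void *dst, size_t len,
                    struct sim_stats *st)
{
    assert(pg && dst && st);
    if (sim_try_direct(pg, dst, len)) {
        st->fast++;
        return 0;
    }
    st->slow++;
    return sim_copy_slow(pg, dst, len);
}
```

In this model the first read of a non-present page takes the slow path once, and the following read of the same page goes through the direct path, which is the behaviour hoped for above.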
> The solution might be something like
> moving GUP to a dedicated kind of vhost work.

Right, generally GUP.

> >
> > So if continuing the direct map approach,
> > what is needed is a cache of mapped VM memory, then on a cache miss
> > we'd queue work along the lines of 1-2 above.
> >
> > That's one direction to take. Another one is to give up on that and
> > write our own version of uaccess macros.  Add a "high security" flag to
> > the vhost module and if not active use these for userspace memory
> > access.
>
>
> Or using SET_BACKEND_FEATURES?

No, I don't think it's considered best practice to allow unprivileged
userspace control over whether kernel enables security features.

> But do you mean permanent GUP as I did in
> original RFC https://lkml.org/lkml/2018/12/13/218?
>
> Thanks

Permanent GUP breaks THP and NUMA.
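The "cache of mapped VM memory, queue work on a miss" direction quoted above could look roughly like the following userspace sketch. All sim_* names, the slot count and the backing array are assumptions for illustration only; sim_run_work() merely stands in for the dedicated work item that would do the GUP, and sim_invalidate() for what an MMU notifier callback would trigger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_SLOTS 4
#define SIM_QUEUE 8
#define SIM_PAGES 16

/* Hypothetical backing store standing in for guest pages. */
static unsigned char sim_backing[SIM_PAGES][8];

struct sim_slot {
    int valid;
    uint64_t pfn;
    void *addr;
};

struct sim_cache {
    struct sim_slot slot[SIM_SLOTS];
    uint64_t pending[SIM_QUEUE];   /* page numbers waiting to be mapped */
    size_t npending;
};

/* Models the mapping step (the part that may sleep or fault). */
static void *sim_map_page(uint64_t pfn)
{
    return pfn < SIM_PAGES ? (void *)sim_backing[pfn] : NULL;
}

/*
 * Lookup never maps anything itself. On a miss it queues the page and
 * returns NULL, so the caller can use the regular from/to user path.
 */
static void *sim_lookup(struct sim_cache *c, uint64_t pfn)
{
    struct sim_slot *s;
    size_t i;

    assert(c != NULL);
    s = &c->slot[pfn % SIM_SLOTS];
    if (s->valid && s->pfn == pfn)
        return s->addr;
    for (i = 0; i < c->npending; i++)
        if (c->pending[i] == pfn)
            return NULL;           /* already queued */
    if (c->npending < SIM_QUEUE)
        c->pending[c->npending++] = pfn;
    return NULL;
}

/* Deferred work: map every queued page and fill its cache slot. */
static void sim_run_work(struct sim_cache *c)
{
    size_t i;

    for (i = 0; i < c->npending; i++) {
        uint64_t pfn = c->pending[i];
        void *addr = sim_map_page(pfn);

        if (addr) {
            struct sim_slot *s = &c->slot[pfn % SIM_SLOTS];
            s->valid = 1;
            s->pfn = pfn;
            s->addr = addr;
        }
    }
    c->npending = 0;
}

/* Invalidation: drop the cached mapping for one page. */
static void sim_invalidate(struct sim_cache *c, uint64_t pfn)
{
    struct sim_slot *s = &c->slot[pfn % SIM_SLOTS];

    if (s->valid && s->pfn == pfn)
        s->valid = 0;
}
```

The sketch leaves out the hard part, which is synchronizing lookups against invalidation (the preemption and notifier discussion above); it only shows the hit, miss and deferred-fill flow.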