On Thu, Mar 07, 2019 at 09:21:03PM -0500, Michael S. Tsirkin wrote: > On Thu, Mar 07, 2019 at 02:17:20PM -0500, Jerome Glisse wrote: > > > It's because of all these issues that I preferred just accessing > > > userspace memory and handling faults. Unfortunately there does not > > > appear to exist an API that whitelists a specific driver along the lines > > > of "I checked this code for speculative info leaks, don't add barriers > > > on data path please". > > > > Maybe it would be better to explore adding such helper then remapping > > page into kernel address space ? > > I explored it a bit (see e.g. thread around: "__get_user slower than > get_user") and I can tell you it's not trivial given the issue is around > security. So in practice it does not seem fair to keep a significant > optimization out of kernel because *maybe* we can do it differently even > better :) Maybe a slightly different approach between this patchset and other copy user API would work here. What you want really is something like a temporary mlock on a range of memory so that it is safe for the kernel to access range of userspace virtual address ie page are present and with proper permission hence there can be no page fault while you are accessing thing from kernel context. So you can have like a range structure and mmu notifier. When you lock the range you block mmu notifier to allow your code to work on the userspace VA safely. Once you are done you unlock and let the mmu notifier go on. It is pretty much exactly this patchset except that you remove all the kernel vmap code. A nice thing about that is that you do not need to worry about calling set page dirty it will already be handle by the userspace VA pte. It also use less memory than when you have kernel vmap. This idea might be defeated by security feature where the kernel is running in its own address space without the userspace address space present. Anyway just wanted to put the idea forward. Cheers, Jérôme