On 15/05/18 14:11, Matthew Wilcox wrote:
> On Tue, May 15, 2018 at 01:43:23PM +0300, Boaz Harrosh wrote:
>> On 15/05/18 03:41, Matthew Wilcox wrote:
>>> On Mon, May 14, 2018 at 10:37:38PM +0300, Boaz Harrosh wrote:
>>>> On 14/05/18 22:15, Matthew Wilcox wrote:
>>>>> On Mon, May 14, 2018 at 08:28:01PM +0300, Boaz Harrosh wrote:
>>>>>> On a call to mmap an mmap provider (like an FS) can put
>>>>>> this flag on vma->vm_flags.
>>>>>>
>>>>>> The VM_LOCAL_CPU flag tells the Kernel that the vma will be used
>>>>>> from a single-core only, and therefore invalidation (flush_tlb) of
>>>>>> PTE(s) need not be a wide CPU scheduling.
>>>>>
>>>>> I still don't get this. You're opening the kernel up to being exploited
>>>>> by any application which can persuade it to set this flag on a VMA.
>>>>>
>>>>
>>>> No No this is not an application accessible flag this can only be set
>>>> by the mmap implementor at ->mmap() time (Say same as VM_VM_MIXEDMAP).
>>>>
>>>> Please see the zuf patches for usage (Again apologise for pushing before
>>>> a user)
>>>>
>>>> The mmap provider has all the facilities to know that this can not be
>>>> abused, not even by a trusted Server.
>>>
>>> I don't think page tables work the way you think they work.
>>>
>>> + err = vm_insert_pfn_prot(zt->vma, zt_addr, pfn, prot);
>>>
>>> That doesn't just insert it into the local CPU's page table. Any CPU
>>> which directly accesses or even prefetches that address will also get
>>> the translation into its cache.
>>
>> Yes I know, but that is exactly the point of this flag. I know that this
>> address is only ever accessed from a single core. Because it is an mmap (vma)
>> of an O_TMPFILE-exclusive file created in a core-pinned thread and I allow
>> only that thread any kind of access to this vma. Both the filehandle and the
>> mmaped pointer are kept on the thread stack and have no access from outside.
>>
>> So the all point of this flag is the kernel driver telling mm that this
>> address is enforced to only be accessed from one core-pinned thread.
>
> You're still thinking about this from the wrong perspective. If you
> were writing a program to attack this facility, how would you do it?
> It's not exactly hard to leak one pointer's worth of information.
>

That would be very hard, because that program would:
- need to be root
- need to start and pretend to be the zus Server with the whole mount-thread
  thing, register a new filesystem, and grab some pmem devices
- mount the said filesystem on said pmem, create core-pinned ZT threads for
  all CPUs, and start accepting IO
- and only then could it start leaking the pointer and doing bad things.
  The bad things it can do are to the application, not to the Kernel.
  And as a full filesystem it can do those bad things to the application
  through the front door directly, without needing the mismatched TLB at all.

That said, it brings up a very important point that I wanted to talk about.
In this design the zuf (Kernel) and the zus (um Server) are part of the
distribution. I would like to have the zus module signed by the distro's
Kernel key and checked at load time. I know there is an effort by the
Red Hat guys to try and sign all /sbin/* servers and have the Kernel check
them, so this is not the first time people have thought about that.

Thanks
Boaz