On Wed, Nov 23, 2022 at 03:28:27PM +0100, Daniel Vetter wrote: > > This patch is known to be broken in so many ways. It also has a major > > security hole that it ignores the PTE flags making the page > > RO. Ignoring the special bit is somehow not surprising :( > > > > This probably doesn't work, but is the general idea of what KVM needs > > to do: > > Oh dear, when I dug around in there I entirely missed that > kvm_try_get_pfn exists, and it's very broken indeed. kvm really needs > to grow a proper mmu notifier. > > Another thing I'm wondering right now, the follow_pte(); > fixup_user_fault(); follow_pte(); approach does not make any > guarantees of actually being right. If you're sufficiently unlucky you > might race against an immediate pte invalidate between the fixup and > the 2nd follow_pte(). But you can also not loop, because that would > fail to catch permanent faults. Yes, it is pretty broken. kvm already has support for mmu notifiers and uses it for other stuff. I can't remember what exactly this code path was for, IIRC Paolo talked about having a big rework/fix for it when we last talked about the missing write protect. I also vauagely recall he had some explanation why this might be safe. > I think the iommu fault drivers have a similar pattern. Where? It shouldn't The common code for SVA just calls handle_mm_fault() and restarts the PRI. Since the page table is physically shared there is no issue with a stale copy. > What am I missing here? Or is that also just broken. gup works around > this with the slow path that takes the mmap sem and walking the vma > tree, follow_pte/fixup_user_fautl users dont. follow_pte() is just fundamentally broken, things must not use it. > Maybe mmu notifier based restarting would help with this too, if > done properly. That is called hmm_range_fault() Jason