On 08.08.22 09:32, David Hildenbrand wrote:
> Ever since the Dirty COW (CVE-2016-5195) security issue happened, we know
> that FOLL_FORCE can be possibly dangerous, especially if there are races
> that can be exploited by user space.
>
> Right now, it would be sufficient to have some code that sets a PTE of
> a R/O-mapped shared page dirty, in order for it to erroneously become
> writable by FOLL_FORCE. The implications of setting a write-protected PTE
> dirty might not be immediately obvious to everyone.
>
> And in fact ever since commit 9ae0f87d009c ("mm/shmem: unconditionally set
> pte dirty in mfill_atomic_install_pte"), we can use UFFDIO_CONTINUE to map
> a shmem page R/O while marking the pte dirty. This can be used by
> unprivileged user space to modify tmpfs/shmem file content even if the user
> does not have write permissions to the file -- Dirty COW restricted to
> tmpfs/shmem (CVE-2022-2590).
>
> To fix such security issues for good, the insight is that we really only
> need that fancy retry logic (FOLL_COW) for COW mappings that are not
> writable (!VM_WRITE). And in a COW mapping, we really only broke COW if
> we have an exclusive anonymous page mapped. If we have something else
> mapped, or the mapped anonymous page might be shared (!PageAnonExclusive),
> we have to trigger a write fault to break COW. If we don't find an
> exclusive anonymous page when we retry, we have to trigger COW breaking
> once again because something intervened.
>
> Let's move away from this mandatory-retry + dirty handling and rely on
> our PageAnonExclusive() flag for making a similar decision, to use the
> same COW logic as in other kernel parts here as well. In case we stumble
> over a PTE in a COW mapping that does not map an exclusive anonymous page,
> COW was not properly broken and we have to trigger a fake write-fault to
> break COW.
>
> Just like we do in can_change_pte_writable() added via
> commit 64fe24a3e05e ("mm/mprotect: try avoiding write faults for exclusive
> anonymous pages when changing protection") and commit 76aefad628aa
> ("mm/mprotect: fix soft-dirty check in can_change_pte_writable()"), take
> care of softdirty and uffd-wp manually.
>
> For example, a write() via /proc/self/mem to a uffd-wp-protected range has
> to fail instead of silently granting write access and bypassing the
> userspace fault handler. Note that FOLL_FORCE is not only used for debug
> access, but also triggered by applications without debug intentions, for
> example, when pinning pages via RDMA.
>
> This fixes CVE-2022-2590. Note that only x86_64 and aarch64 are
> affected, because only those support CONFIG_HAVE_ARCH_USERFAULTFD_MINOR.
>
> Fortunately, FOLL_COW is no longer required to handle FOLL_FORCE. So
> let's just get rid of it.

I have to add here:

"Thanks to Nadav Amit for pointing out that the pte_dirty() check in
FOLL_FORCE code is problematic and might be exploitable."

-- 
Thanks,

David / dhildenb
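
For illustration only, the decision described in the quoted text can be
modelled in plain user-space C. This is a sketch under stated assumptions,
not the patch itself: struct model_vma, struct model_pte and
model_can_write_without_fault() are hypothetical names made up for this
example, and the boolean fields merely stand in for the kernel state named
above (VM_WRITE, PageAnonExclusive(), softdirty, uffd-wp).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the mapping state; not a kernel type. */
struct model_vma {
	bool cow_mapping;         /* private mapping that may be COWed */
	bool writable;            /* models VM_WRITE */
	bool soft_dirty_tracking; /* soft-dirty tracking active for this mapping */
};

/* Hypothetical stand-in for a present PTE plus the page it maps. */
struct model_pte {
	bool write;          /* PTE is writable */
	bool dirty;          /* PTE dirty bit; intentionally never consulted */
	bool soft_dirty;     /* PTE soft-dirty bit */
	bool uffd_wp;        /* PTE is write-protected by userfaultfd */
	bool anon_exclusive; /* maps an exclusive anonymous page */
};

/*
 * Returns true if a write through this PTE may proceed as is.
 * Returns false if the caller first has to trigger a (fake) write fault.
 */
bool model_can_write_without_fault(const struct model_pte *pte,
				   const struct model_vma *vma,
				   bool foll_force)
{
	/* A writable PTE needs no special handling. */
	if (pte->write)
		return true;

	/* Without FOLL_FORCE, a write-protected PTE always faults. */
	if (!foll_force)
		return false;

	/* The special case only exists for COW mappings ... */
	if (!vma->cow_mapping)
		return false;

	/* ... that are not writable; writable ones take a normal write fault. */
	if (vma->writable)
		return false;

	/* COW was only broken if an exclusive anonymous page is mapped. */
	if (!pte->anon_exclusive)
		return false;

	/* softdirty and uffd-wp may still require a write fault. */
	if (vma->soft_dirty_tracking && !pte->soft_dirty)
		return false;
	if (pte->uffd_wp)
		return false;

	return true;
}
```

Note that the dirty field never enters the decision in this model, so a
dirty PTE that maps a shmem page R/O always ends up in the write-fault path.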