On Wed, Oct 07, 2020 at 06:44:24PM +0200, Daniel Vetter wrote:
> Way back it was a reasonable assumptions that iomem mappings never
> change the pfn range they point at. But this has changed:
>
> - gpu drivers dynamically manage their memory nowadays, invalidating
>   ptes with unmap_mapping_range when buffers get moved
>
> - contiguous dma allocations have moved from dedicated carvetouts to
>   cma regions. This means if we miss the unmap the pfn might contain
>   pagecache or anon memory (well anything allocated with GFP_MOVEABLE)
>
> - even /dev/mem now invalidates mappings when the kernel requests that
>   iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
>   ("/dev/mem: Revoke mappings when a driver claims the region")
>
> Accessing pfns obtained from ptes without holding all the locks is
> therefore no longer a good idea.
>
> Unfortunately there's some users where this is not fixable (like v4l
> userptr of iomem mappings) or involves a pile of work (vfio type1
> iommu). For now annotate these as unsafe and splat appropriately.
>
> This patch adds an unsafe_follow_pfn, which later patches will then
> roll out to all appropriate places.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> Cc: Jason Gunthorpe <jgg@xxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: John Hubbard <jhubbard@xxxxxxxxxx>
> Cc: Jérôme Glisse <jglisse@xxxxxxxxxx>
> Cc: Jan Kara <jack@xxxxxxx>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Cc: linux-mm@xxxxxxxxx
> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> Cc: linux-samsung-soc@xxxxxxxxxxxxxxx
> Cc: linux-media@xxxxxxxxxxxxxxx
> Cc: kvm@xxxxxxxxxxxxxxx
> ---
>  include/linux/mm.h |  2 ++
>  mm/memory.c        | 32 +++++++++++++++++++++++++++++++-
>  mm/nommu.c         | 17 +++++++++++++++++
>  security/Kconfig   | 13 +++++++++++++
>  4 files changed, 63 insertions(+), 1 deletion(-)

Makes sense to me.
I wonder if we could change the original follow_pfn to require the
ptep and then lockdep_assert_held() it against the page table lock?

> +int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address,
> +	unsigned long *pfn)
> +{
> +#ifdef CONFIG_STRICT_FOLLOW_PFN
> +	pr_info("unsafe follow_pfn usage rejected, see CONFIG_STRICT_FOLLOW_PFN\n");

Wonder if we can print something useful here, like the current
PID/process name?

> diff --git a/security/Kconfig b/security/Kconfig
> index 7561f6f99f1d..48945402e103 100644
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -230,6 +230,19 @@ config STATIC_USERMODEHELPER_PATH
>  	  If you wish for all usermode helper programs to be disabled,
>  	  specify an empty string here (i.e. "").
>
> +config STRICT_FOLLOW_PFN
> +	bool "Disable unsafe use of follow_pfn"
> +	depends on MMU

I would probably invert this CONFIG_ALLOW_UNSAFE_FOLLOW_PFN
default n

Jason
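
For concreteness, the inverted option might look roughly like the
fragment below. This is only a sketch of the suggestion: the symbol
name comes from the comment above, but the prompt string and help
text are assumed wording and are not taken from the patch.

```kconfig
# Sketch only: prompt and help text are illustrative assumptions.
config ALLOW_UNSAFE_FOLLOW_PFN
	bool "Allow unsafe use of follow_pfn"
	depends on MMU
	default n
	help
	  Assumed wording: permit callers of unsafe_follow_pfn() to look
	  up pfns without holding the locks that keep the mapping stable.

	  If unsure, say N.
```

With the sense inverted like this, the check in unsafe_follow_pfn()
would presumably become #ifndef CONFIG_ALLOW_UNSAFE_FOLLOW_PFN, so the
safe behaviour is what you get when the option is left unset.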