On Tue, May 19, 2020 at 12:03:06AM -0700, Dan Williams wrote:
> Close the hole of holding a mapping over kernel driver takeover event of
> a given address range.
>
> Commit 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
> introduced CONFIG_IO_STRICT_DEVMEM with the goal of protecting the
> kernel against scenarios where a /dev/mem user tramples memory that a
> kernel driver owns. However, this protection only prevents *new* read(),
> write() and mmap() requests. Established mappings prior to the driver
> calling request_mem_region() are left alone.
>
> Especially with persistent memory, and the core kernel metadata that is
> stored there, there are plentiful scenarios for a /dev/mem user to
> violate the expectations of the driver and cause amplified damage.
>
> Teach request_mem_region() to find and shoot down active /dev/mem
> mappings that it believes it has successfully claimed for the exclusive
> use of the driver. Effectively a driver call to request_mem_region()
> becomes a hole-punch on the /dev/mem device.
>
> The typical usage of unmap_mapping_range() is part of
> truncate_pagecache() to punch a hole in a file, but in this case the
> implementation is only doing the "first half" of a hole punch. Namely it
> is just evacuating current established mappings of the "hole", and it
> relies on the fact that /dev/mem establishes mappings in terms of
> absolute physical address offsets. Once existing mmap users are
> invalidated they can attempt to re-establish the mapping, or attempt to
> continue issuing read(2) / write(2) to the invalidated extent, but they
> will then be subject to the CONFIG_IO_STRICT_DEVMEM checking that can
> block those subsequent accesses.
>
> Cc: Arnd Bergmann <arnd@xxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Russell King <linux@xxxxxxxxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Fixes: 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> ---
> Changes since v1 [1]:
>
> - updated the changelog to describe the usage of unmap_mapping_range().
>   No other logic changes:
>
> [1]: http://lore.kernel.org/r/158662721802.1893045.12301414116114602646.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> Greg, Andrew,
>
> I have a regression test for this case now. This was found by an
> intermittent data corruption scenario on pmem from a test tool using
> /dev/mem.

Ick, why are test tools messing around in /dev/mem :)

Anyway, this seems sane to me, want me to take it through my tree?

thanks,

greg k-h
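
The revocation described in the quoted changelog can be pictured with a small userspace toy model. This is only a sketch under stated assumptions, not the patch itself: `toy_mapping`, `ranges_overlap()` and `toy_revoke_range()` are hypothetical names invented for illustration, and the model invalidates a whole mapping on any overlap, whereas the changelog describes evacuating only the established mappings of the claimed "hole" via unmap_mapping_range(). The one property it borrows from the changelog is that /dev/mem mappings are expressed as absolute physical address offsets, so a claimed range can be compared directly against each mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of one /dev/mem mapping (not a kernel structure). */
struct toy_mapping {
    uint64_t start;  /* first physical address covered by the mapping */
    uint64_t len;    /* length of the mapping in bytes */
    bool     valid;  /* cleared once the mapping has been shot down */
};

/*
 * Do the half-open ranges [a, a + alen) and [b, b + blen) overlap?
 * Assumes neither range wraps around the end of the 64-bit space.
 */
static bool ranges_overlap(uint64_t a, uint64_t alen,
                           uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

/*
 * Model of the "first half" of a hole punch: invalidate every still-valid
 * mapping that overlaps the range a driver has just claimed. Returns the
 * number of mappings invalidated by this call.
 */
static size_t toy_revoke_range(struct toy_mapping *maps, size_t n,
                               uint64_t start, uint64_t len)
{
    size_t revoked = 0;

    assert(maps != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (maps[i].valid &&
            ranges_overlap(maps[i].start, maps[i].len, start, len)) {
            maps[i].valid = false;
            revoked++;
        }
    }
    return revoked;
}
```

In this model a caller would invoke `toy_revoke_range()` once at the moment the driver claims its range. Whether a user may map the range again afterwards is a separate check, which in the real patch is left to the CONFIG_IO_STRICT_DEVMEM logic the changelog mentions.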