On Tue, May 19, 2020 at 5:11 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, May 19, 2020 at 12:03:06AM -0700, Dan Williams wrote:
> > Close the hole of holding a mapping over kernel driver takeover event of
> > a given address range.
> >
> > Commit 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
> > introduced CONFIG_IO_STRICT_DEVMEM with the goal of protecting the
> > kernel against scenarios where a /dev/mem user tramples memory that a
> > kernel driver owns. However, this protection only prevents *new* read(),
> > write() and mmap() requests. Established mappings prior to the driver
> > calling request_mem_region() are left alone.
> >
> > Especially with persistent memory, and the core kernel metadata that is
> > stored there, there are plentiful scenarios for a /dev/mem user to
> > violate the expectations of the driver and cause amplified damage.
> >
> > Teach request_mem_region() to find and shoot down active /dev/mem
> > mappings that it believes it has successfully claimed for the exclusive
> > use of the driver. Effectively a driver call to request_mem_region()
> > becomes a hole-punch on the /dev/mem device.
> >
> > The typical usage of unmap_mapping_range() is part of
> > truncate_pagecache() to punch a hole in a file, but in this case the
> > implementation is only doing the "first half" of a hole punch. Namely it
> > is just evacuating current established mappings of the "hole", and it
> > relies on the fact that /dev/mem establishes mappings in terms of
> > absolute physical address offsets. Once existing mmap users are
> > invalidated they can attempt to re-establish the mapping, or attempt to
> > continue issuing read(2) / write(2) to the invalidated extent, but they
> > will then be subject to the CONFIG_IO_STRICT_DEVMEM checking that can
> > block those subsequent accesses.
> >
> > Cc: Arnd Bergmann <arnd@xxxxxxxx>
> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> > Cc: Russell King <linux@xxxxxxxxxxxxxxxx>
> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > Fixes: 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
> > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > ---
> > Changes since v1 [1]:
> >
> > - updated the changelog to describe the usage of unmap_mapping_range().
> >   No other logic changes:
> >
> > [1]: http://lore.kernel.org/r/158662721802.1893045.12301414116114602646.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > Greg, Andrew,
> >
> > I have a regression test for this case now. This was found by an
> > intermittent data corruption scenario on pmem from a test tool using
> > /dev/mem.
>
> Ick, why are test tools messing around in /dev/mem :)

Yeah, I'm all for useful tools, just not at the expense of kernel integrity.

> Anyway, this seems sane to me, want me to take it through my tree?

Yes please, seems to belong with the driver core. Thanks!
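
For anyone skimming the thread, the behavior the changelog describes can be
sketched as a toy userspace model. This is not the patch and not kernel code:
every toy_* name below is hypothetical, the tables are fixed-size arrays, and
ranges are inclusive physical offsets. It only illustrates the ordering of
events: a claim records the range, shoots down established mappings that
overlap it, and later requests into that range are refused by the
strict-devmem style check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names exist in the kernel. */
#define TOY_MAX 8

struct toy_range {
	uint64_t start, end;	/* inclusive physical offsets */
	bool live;
};

static struct toy_range toy_claimed[TOY_MAX];	/* ranges owned by drivers */
static struct toy_range toy_mappings[TOY_MAX];	/* established /dev/mem mappings */

static bool toy_overlaps(const struct toy_range *r, uint64_t start, uint64_t end)
{
	return r->live && r->start <= end && start <= r->end;
}

/* Stand-in for the check applied to *new* read/write/mmap requests. */
static bool toy_access_allowed(uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < TOY_MAX; i++)
		if (toy_overlaps(&toy_claimed[i], start, end))
			return false;
	return true;
}

/* Stand-in for mmap() on /dev/mem: returns a slot index, or -1 if refused. */
static int toy_mmap(uint64_t start, uint64_t end)
{
	if (!toy_access_allowed(start, end))
		return -1;
	for (size_t i = 0; i < TOY_MAX; i++) {
		if (!toy_mappings[i].live) {
			toy_mappings[i] = (struct toy_range){ start, end, true };
			return (int)i;
		}
	}
	return -1;	/* table full */
}

/*
 * Stand-in for a driver claiming a range. After recording the claim,
 * shoot down every established mapping that overlaps it. Simplification:
 * the whole mapping is dropped here, whereas the changelog describes
 * evacuating only the "hole". Returns the number of mappings shot down,
 * or -1 if the claim table is full.
 */
static int toy_claim_region(uint64_t start, uint64_t end)
{
	int shot = 0;
	size_t slot;

	assert(start <= end);
	for (slot = 0; slot < TOY_MAX && toy_claimed[slot].live; slot++)
		;
	if (slot == TOY_MAX)
		return -1;
	toy_claimed[slot] = (struct toy_range){ start, end, true };

	for (size_t i = 0; i < TOY_MAX; i++) {
		if (toy_overlaps(&toy_mappings[i], start, end)) {
			toy_mappings[i].live = false;
			shot++;
		}
	}
	return shot;
}
```

In this model a mapping established before the claim is invalidated by
toy_claim_region(), and an attempt to re-establish it afterwards fails in
toy_mmap(), which mirrors the "shoot down, then let the existing checks block
re-entry" flow described in the changelog.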