On 9/18/19 4:41 PM, Kirill A. Shutemov wrote:
On Wed, Sep 18, 2019 at 02:59:08PM +0200, Thomas Hellström (VMware) wrote:
From: Thomas Hellstrom <thellstrom@xxxxxxxxxx>
Add two utilities to a) write-protect and b) clean all ptes pointing into
a range of an address space.
The utilities are intended to aid in tracking dirty pages (either
driver-allocated system memory or pci device memory).
The write-protect utility should be used in conjunction with
page_mkwrite() and pfn_mkwrite() to trigger write page-faults on page
accesses. Typically one would want to use this on sparse accesses into
large memory regions. The clean utility should be used to utilize
hardware dirtying functionality and avoid the overhead of page-faults,
typically on large accesses into small memory regions.
The added file "as_dirty_helpers.c" is initially listed as maintained by
VMware under our DRM driver. If somebody would like it elsewhere,
that's of course no problem.
After quick glance, it looks a lot as rmap code duplication. Why not
extend rmap_walk() interface instead to cover range of pages?
There appears to exist quite a few pagetable walks in the mm code. "Take
1" of this patch series modified the "apply_to_page_range" interface and
used that. But the interface modification was actually what eventually
caused Linus to reject the code. While it is entirely possible to do a
proper modification following Linus' and Christoph's guidelines, that
code doesn't allow for huge pages and populates all page table levels.
We will soon probably want to support huge pages and do not want to
populate. The number of altered code-paths itself IMO motivates yet
another pagetable walk implementation.
The walk code currently resembling the present patch the most is the
unmap_mapping_range() implementation.
The rmap_walk() is not very well suited since it operates on a struct
page and the code of this patch has no notion of struct pages.
So my thoughts on this is that the interface should in time move towards
the code in mm/pagewalk.c. If we eventually have more users of an
address-space pagewalk or want to re-implement unmap_mapping_range()
using a generic pagewalk, we should move the walk to pagewalk.c and
reuse its structures, but implement separate code for the walk since we
can't split huge pages and we can't take the mmap_sem. Meanwhile we
should keep the code separate in as_dirty_helpers.c
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Will Deacon <will.deacon@xxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxxx>
Cc: Minchan Kim <minchan@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Huang Ying <ying.huang@xxxxxxxxx>
Cc: Souptick Joarder <jrdr.linux@xxxxxxxxx>
Cc: "Jérôme Glisse" <jglisse@xxxxxxxxxx>
Cc: linux-mm@xxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
Signed-off-by: Thomas Hellstrom <thellstrom@xxxxxxxxxx>
Reviewed-by: Ralph Campbell <rcampbell@xxxxxxxxxx> #v1
---
MAINTAINERS | 1 +
include/linux/mm.h | 13 +-
mm/Kconfig | 3 +
mm/Makefile | 1 +
mm/as_dirty_helpers.c | 392 ++++++++++++++++++++++++++++++++++++++++++
5 files changed, 409 insertions(+), 1 deletion(-)
create mode 100644 mm/as_dirty_helpers.c
diff --git a/MAINTAINERS b/MAINTAINERS
index c2d975da561f..b596c7cf4a85 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5287,6 +5287,7 @@ T: git git://people.freedesktop.org/~thomash/linux
S: Supported
F: drivers/gpu/drm/vmwgfx/
F: include/uapi/drm/vmwgfx_drm.h
+F: mm/as_dirty_helpers.c
Emm.. No. Core MM functinality cannot belong to random driver.
OK. I'll put it under core MM.
/Thomas