(Commit message collected from Jason Gunthorpe)

Reduce the chance of false positives from page_maybe_dma_pinned() by
keeping track of whether the mm_struct has ever been used with
pin_user_pages().  mm_structs that have never been passed to
pin_user_pages() cannot have a positive page_maybe_dma_pinned() by
definition.  This allows cases that might drive up the page ref_count
to avoid any penalty from handling dma_pinned pages.

Due to complexities with unpinning, this trivial version is a permanent
sticky bit; future work will be needed to make this a counter.

Suggested-by: Jason Gunthorpe <jgg@xxxxxxxx>
Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
---
 include/linux/mm_types.h | 10 ++++++++++
 kernel/fork.c            |  1 +
 mm/gup.c                 |  6 ++++++
 3 files changed, 17 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 496c3ff97cce..6f291f8b74c6 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -441,6 +441,16 @@ struct mm_struct {
 #endif
 		int map_count;			/* number of VMAs */
 
+		/**
+		 * @has_pinned: Whether this mm has pinned any pages.  This can
+		 * be either replaced in the future by @pinned_vm when it
+		 * becomes stable, or grow into a counter on its own.  We're
+		 * aggressive on this bit now - even if the pinned pages were
+		 * unpinned later on, we'll still keep this bit set for the
+		 * lifecycle of this mm just for simplicity.
+		 */
+		int has_pinned;
+
 		spinlock_t page_table_lock; /* Protects page tables and some
 					     * counters
 					     */
diff --git a/kernel/fork.c b/kernel/fork.c
index 49677d668de4..7237d418e7b5 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1011,6 +1011,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	mm_pgtables_bytes_init(mm);
 	mm->map_count = 0;
 	mm->locked_vm = 0;
+	mm->has_pinned = 0;
 	atomic64_set(&mm->pinned_vm, 0);
 	memset(&mm->rss_stat, 0, sizeof(mm->rss_stat));
 	spin_lock_init(&mm->page_table_lock);
diff --git a/mm/gup.c b/mm/gup.c
index e5739a1974d5..2d9019bf1773 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1255,6 +1255,9 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 		BUG_ON(*locked != 1);
 	}
 
+	if (flags & FOLL_PIN)
+		WRITE_ONCE(mm->has_pinned, 1);
+
 	/*
 	 * FOLL_PIN and FOLL_GET are mutually exclusive. Traditional behavior
 	 * is to set FOLL_GET if the caller wants pages[] filled in (but has
@@ -2660,6 +2663,9 @@ static int internal_get_user_pages_fast(unsigned long start, int nr_pages,
 				       FOLL_FAST_ONLY)))
 		return -EINVAL;
 
+	if (gup_flags & FOLL_PIN)
+		WRITE_ONCE(current->mm->has_pinned, 1);
+
 	if (!(gup_flags & FOLL_FAST_ONLY))
 		might_lock_read(&current->mm->mmap_lock);
 
--
2.26.2
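
For illustration only (not part of this patch): a minimal sketch of how a
future consumer, e.g. the fork/COW path, might consult mm->has_pinned before
falling back to the heuristic page_maybe_dma_pinned() check.  The helper name
page_needs_cow_for_dma() and its exact shape are assumptions here, not
something this patch introduces:

	/* Hypothetical helper sketched on top of this patch. */
	static inline bool page_needs_cow_for_dma(struct mm_struct *mm,
						  struct page *page)
	{
		/*
		 * If this mm never went through pin_user_pages(), none of
		 * its pages can be DMA-pinned, so the (potentially
		 * false-positive) refcount heuristic can be skipped.
		 */
		if (!READ_ONCE(mm->has_pinned))
			return false;

		/* Heuristic: elevated pin/ref count suggests a DMA pin. */
		return page_maybe_dma_pinned(page);
	}

Because has_pinned is a sticky bit set under FOLL_PIN and never cleared, the
fast path above can only err on the side of taking the slower
page_maybe_dma_pinned() check, never on the side of missing a pinned page.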