On Mon, Sep 21, 2020 at 04:53:38PM -0700, John Hubbard wrote:
> On 9/21/20 2:17 PM, Peter Xu wrote:
> > (Commit message collected from Jason Gunthorpe)
> >
> > Reduce the chance of false positive from page_maybe_dma_pinned() by keeping
>
> Not yet, it doesn't. :)  More:
>
> > track if the mm_struct has ever been used with pin_user_pages(). mm_structs
> > that have never been passed to pin_user_pages() cannot have a positive
> > page_maybe_dma_pinned() by definition. This allows cases that might drive up
> > the page ref_count to avoid any penalty from handling dma_pinned pages.
> >
> > Due to complexities with unpinning this trivial version is a permanent sticky
> > bit, future work will be needed to make this a counter.
>
> How about this instead:
>
> Subsequent patches intend to reduce the chance of false positives from
> page_maybe_dma_pinned(), by also considering whether or not a page has
> even been part of an mm struct that has ever had pin_user_pages*()
> applied to any of its pages.
>
> In order to allow that, provide a boolean value (even though it's not
> implemented exactly as a boolean type) within the mm struct, that is
> simply set once and never cleared. This will suffice for an early, rough
> implementation that fixes a few problems.
>
> Future work is planned, to provide a more sophisticated solution, likely
> involving a counter, and *not* involving something that is set and never
> cleared.

This looks good, thanks.  Though I think Jason's version is good too (as long
as we remove the confusing sentence, that's the one starting with "mm_structs
that have never been passed... ").  Before I drop Jason's version, I think I'd
better figure out what's the major thing we missed so that maybe we can add
another paragraph.  E.g., "future work will be needed to make this a counter"
already means "involving a counter, and *not* involving something that is set
and never cleared" to me... Because otherwise it won't be called a counter..
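To make the sticky-bit versus counter distinction concrete, here is a rough
userspace sketch using C11 atomics.  It is not kernel code: struct mm_sketch
and the sticky_*()/counted_*() helpers are names made up for illustration, and
the relaxed loads and stores only approximate what a READ_ONCE()/WRITE_ONCE()
pair would do in the kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of struct mm_struct. */
struct mm_sketch {
	atomic_int has_pinned;	/* sticky flag: set once, never cleared */
	atomic_int pin_count;	/* what a future counter might look like */
};

static inline void mm_sketch_init(struct mm_sketch *mm)
{
	atomic_init(&mm->has_pinned, 0);
	atomic_init(&mm->pin_count, 0);
}

/* Sticky variant: a plain store is enough, nobody ever clears the flag. */
static inline void sticky_pin(struct mm_sketch *mm)
{
	atomic_store_explicit(&mm->has_pinned, 1, memory_order_relaxed);
}

/* Unpin deliberately does nothing, so the flag outlives the pin. */
static inline void sticky_unpin(struct mm_sketch *mm)
{
	(void)mm;
}

static inline bool sticky_maybe_pinned(struct mm_sketch *mm)
{
	return atomic_load_explicit(&mm->has_pinned, memory_order_relaxed) != 0;
}

/* Counter variant: pin and unpin must be atomic read-modify-write ops. */
static inline void counted_pin(struct mm_sketch *mm)
{
	atomic_fetch_add_explicit(&mm->pin_count, 1, memory_order_relaxed);
}

static inline void counted_unpin(struct mm_sketch *mm)
{
	atomic_fetch_sub_explicit(&mm->pin_count, 1, memory_order_relaxed);
}

static inline bool counted_maybe_pinned(struct mm_sketch *mm)
{
	return atomic_load_explicit(&mm->pin_count, memory_order_relaxed) > 0;
}
```

After one pin followed by one unpin, sticky_maybe_pinned() still reports true
while counted_maybe_pinned() goes back to false.  That is the false positive
the counter would remove, at the price of a read-modify-write on every pin and
unpin.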
>
> >
> > Suggested-by: Jason Gunthorpe <jgg@xxxxxxxx>
> > Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
> > ---
> >  include/linux/mm_types.h | 10 ++++++++++
> >  kernel/fork.c            |  1 +
> >  mm/gup.c                 |  6 ++++++
> >  3 files changed, 17 insertions(+)
> >
> > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > index 496c3ff97cce..6f291f8b74c6 100644
> > --- a/include/linux/mm_types.h
> > +++ b/include/linux/mm_types.h
> > @@ -441,6 +441,16 @@ struct mm_struct {
> >  #endif
> >  	int map_count;			/* number of VMAs */
> > +	/**
> > +	 * @has_pinned: Whether this mm has pinned any pages.  This can
> > +	 * be either replaced in the future by @pinned_vm when it
> > +	 * becomes stable, or grow into a counter on its own.  We're
> > +	 * aggressive on this bit now - even if the pinned pages were
> > +	 * unpinned later on, we'll still keep this bit set for the
> > +	 * lifecycle of this mm just for simplicity.
> > +	 */
> > +	int has_pinned;
>
> I think this would be elegant as an atomic_t, and using atomic_set() and
> atomic_read(), which seem even more self-documenting than what you have here.
>
> But it's admittedly a cosmetic point, combined with my perennial fear that
> I'm missing something when I look at a READ_ONCE()/WRITE_ONCE() pair. :)

Yeah but I hope I'm using it right.. :) I used READ_ONCE/WRITE_ONCE explicitly
because I think they're cheaper than atomic operations (which will, iiuc, lock
the bus).

>
> It's completely OK to just ignore this comment, but I didn't want to completely
> miss the opportunity to make it a tiny bit cleaner to the reader.

This can always become an atomic in the future, or am I wrong?  Actually if
we're going to the counter way I feel like it's a must.

Thanks,

-- 
Peter Xu