Re: [PATCH 1/5] mm: Introduce mm_struct.has_pinned

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 22, 2020 at 01:10:46PM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 22, 2020 at 11:17:36AM -0400, Peter Xu wrote:
> 
> > > But it's admittedly a cosmetic point, combined with my perennial fear that
> > > I'm missing something when I look at a READ_ONCE()/WRITE_ONCE() pair. :)
> > 
> > Yeah but I hope I'm using it right.. :) I used READ_ONCE/WRITE_ONCE explicitly
> > because I think they're cheaper than atomic operations, (which will, iiuc, lock
> > the bus).
> 
> It is worth thinking a bit about racing fork with
> pin_user_pages(). The desired outcome is:
> 
>   If fork wins the page is write protected, and pin_user_pages_fast()
>   will COW it.
> 
>   If pin_user_pages_fast() wins then fork must see the READ_ONCE and
>   the pin.
> 
> As get_user_pages_fast() is lockless it looks like the ordering has to
> be like this:
> 
>   pin_user_pages_fast()                   fork()
>    atomic_set(has_pinned, 1);
>    [..]
>    atomic_add(page->_refcount)
>    ordered check write protect()
>                                           ordered set write protect()
>                                           atomic_read(page->_refcount)
>                                           atomic_read(has_pinned)
> 
> Such that in all the degenerate racy cases the outcome is that both
> sides COW, never neither.
> 
> Thus I think it does have to be atomics purely from an ordering
> perspective, observing an increased _refcount requires that has_pinned
> != 0 if we are pinning.
> 
> So, to make this 100% this ordering will need to be touched up too.

Thanks for spotting this.  So something like below should work, right?

diff --git a/mm/memory.c b/mm/memory.c
index 8f3521be80ca..6591f3f33299 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -888,8 +888,8 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
                 * Because we'll need to release the locks before doing cow,
                 * pass this work to upper layer.
                 */
-               if (READ_ONCE(src_mm->has_pinned) && wp &&
-                   page_maybe_dma_pinned(page)) {
+               if (wp && page_maybe_dma_pinned(page) &&
+                   READ_ONCE(src_mm->has_pinned)) {
                        /* We've got the page already; we're safe */
                        data->cow_old_page = page;
                        data->cow_oldpte = *src_pte;

I can also add some more comment to emphasize this.

I think the WRITE_ONCE/READ_ONCE can actually be kept, because atomic ops
should contain proper memory barriers already so the memory access orders
should be guaranteed (e.g., atomic_add() will have an implicit wmb(); rmb() for
the other side).  However maybe it's even simpler to change has_pinned into
atomic as John suggested.  Thanks,

-- 
Peter Xu





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux