Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace

On Mon, 9 Jul 2012, Will Deacon wrote:
> On Mon, Jul 09, 2012 at 01:25:23PM +0100, Michal Hocko wrote:
> > On Wed 04-07-12 15:32:56, Will Deacon wrote:
> > > When allocating and returning clear huge pages to userspace as a
> > > response to a fault, we may zero and return a mapping to a previously
> > > dirtied physical region (for example, it may have been written by
> > > a private mapping which was freed as a result of an ftruncate on the
> > > backing file). On architectures with Harvard caches, this can lead to
> > > I/D inconsistency since the zeroed view may not be visible to the
> > > instruction stream.
> > > 
> > > This patch solves the problem by flushing the region after allocating
> > > and clearing a new huge page. Note that PowerPC avoids this issue by
> > > performing the flushing in their clear_user_page implementation to keep
> > > the loader happy, however this is closely tied to the semantics of the
> > > PG_arch_1 page flag which is architecture-specific.
> > > 
> > > Acked-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> > > Signed-off-by: Will Deacon <will.deacon@xxxxxxx>
> > > ---
> > >  mm/hugetlb.c |    1 +
> > >  1 files changed, 1 insertions(+), 0 deletions(-)
> > > 
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index e198831..b83d026 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -2646,6 +2646,7 @@ retry:
> > >  			goto out;
> > >  		}
> > >  		clear_huge_page(page, address, pages_per_huge_page(h));
> > > +		flush_dcache_page(page);
> > >  		__SetPageUptodate(page);
> > 
> > Does this have to be explicit in the arch independent code?
> > It seems that ia64 uses flush_dcache_page already in the clear_user_page
> 
> It would match what is done in similar situations by cow_user_page (mm/memory.c)
> and shmem_writepage (mm/shmem.c). Other subsystems also have explicit page
> flushing (DMA bounce, ksm) so I think this is the right place for it.

I am not at all sure if you are right or not:
please let's consult linux-arch about this - now Cc'ed.

If this hugetlb_no_page() were solely mapping the hugepage into the
faulting process's userspace, I would say you are wrong.  It's the job
of clear_huge_page()
to take the mapped address into account, and pass it down to the
architecture-specific implementation, to do whatever flushing is
needed - you should be providing that in your architecture.

In particular, notice how clear_huge_page() goes round a loop of
clear_user_highpage()s: in your patch, you're expecting the implementation
of flush_dcache_page() to notice whether or not this is a hugepage, and
flush the appropriate size.

Perhaps yours is the only architecture to need this on huge, and your
flush_dcache_page() implements it correctly; but it does seem surprising.
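
To illustrate the size concern, here is a rough userspace model, not
kernel code.  Every name in it (model_page, model_flush_base_page and so
on) is made up for the illustration, and it assumes a flush routine that
only knows about a single base page.  Under that assumption, one call on
the head page covers one subpage, and covering the whole hugepage needs a
loop like the one clear_huge_page() already runs:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel's definition. */
struct model_page {
	int dcache_flushed;	/* 1 once this base page has been flushed */
};

/* Models a flush routine that is only aware of one base page. */
static void model_flush_base_page(struct model_page *page)
{
	page->dcache_flushed = 1;
}

/* Models flushing a whole hugepage by walking every subpage in turn. */
static void model_flush_huge_page(struct model_page *head, size_t nr_subpages)
{
	assert(nr_subpages > 0);
	for (size_t i = 0; i < nr_subpages; i++)
		model_flush_base_page(&head[i]);
}

/* Counts how many subpages of the hugepage have been flushed so far. */
static size_t model_count_flushed(const struct model_page *head,
				  size_t nr_subpages)
{
	size_t n = 0;

	for (size_t i = 0; i < nr_subpages; i++)
		n += (size_t)head[i].dcache_flushed;
	return n;
}
```

If an architecture's real flush_dcache_page() does detect a hugepage and
flush its full extent, the single call is enough and this model does not
apply to it.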

If I start to grep the architectures for non-empty flush_dcache_page(),
I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing
if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where
the naming suggests that I'm right, it's the architecture's responsibility
to arrange whatever flushing is needed in its copy and clear page functions.

But... this hugetlb_no_page() has a VM_MAYSHARE case below, which puts
the new page into page cache, making it accessible by other processes:
that may indeed be a reason for flush_dcache_page() there - or a loop of
flush_dcache_page()s.  But I worry then that in the !VM_MAYSHARE case
you would be duplicating expensive flushes: perhaps they should be
restricted to the VM_MAYSHARE block.
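
As a sketch of what restricting the flushes would mean, again a userspace
model rather than a patch: MODEL_VM_MAYSHARE, model_fault_stats and
model_no_page are invented names, and the model simply assumes that the
private case can rely on the architecture's clear page functions, which
is exactly the point that needs confirming with linux-arch.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_VM_MAYSHARE 0x1u	/* hypothetical flag value, not the kernel's */

/* Hypothetical bookkeeping so the effect of each path can be observed. */
struct model_fault_stats {
	unsigned int flushes;		/* explicit per-subpage flushes issued */
	unsigned int in_page_cache;	/* 1 if the page became visible to others */
};

/*
 * Models the tail of a no-page fault after the hugepage has been cleared.
 * Only the shared case pays for explicit flushes; the private case is
 * assumed (an assumption of this sketch) to be handled by the arch code.
 */
static void model_no_page(unsigned int vm_flags, unsigned int nr_subpages,
			  struct model_fault_stats *st)
{
	assert(st != NULL);

	if (vm_flags & MODEL_VM_MAYSHARE) {
		st->flushes += nr_subpages;	/* one flush per subpage */
		st->in_page_cache = 1;		/* page now reachable by others */
	}
	/* Private mapping: no extra flushing is modelled here. */
}
```

With this shape a private fault issues no additional flushes, while a
shared fault on a hugepage of 512 subpages issues 512 of them.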

Hugh