Re: [RFC PATCH v2 0/4] mm: reclaim zbud pages on migration and compaction

Minchan Kim <minchan@xxxxxxxxxx> · Mon, 12 Aug 2013 12:49:50 +0900

Hello Benjamin,

On Sun, Aug 11, 2013 at 11:16:47PM -0400, Benjamin LaHaise wrote:
> Hello Minchan,
> 
> On Mon, Aug 12, 2013 at 11:25:35AM +0900, Minchan Kim wrote:
> > Hello,
> > 
> > On Fri, Aug 09, 2013 at 12:22:16PM +0200, Krzysztof Kozlowski wrote:
> > > Hi,
> > > 
> > > Currently zbud pages are not movable and they cannot be allocated from CMA
> > > region. These patches try to address the problem by:
> > 
> > The zcache, zram and GUP pages for memory-hotplug and/or CMA are
> > same situation.
> > 
> > > 1. Adding a new form of reclaim of zbud pages.
> > > 2. Reclaiming zbud pages during migration and compaction.
> > > 3. Allocating zbud pages with __GFP_RECLAIMABLE flag.
> > 
> > So I'd like to solve it with general approach.
> > 
> > Each subsystem or GUP caller who want to pin pages long time should
> > create own migration handler and register the page into pin-page
> > control subsystem like this.
> > 
> > driver/foo.c
> > 
> > int foo_migrate(struct page *page, void *private);
> > 
> > static struct pin_page_owner foo_migrate = {
> >         .migrate = foo_migrate;
> > };
> > 
> > int foo_allocate()
> > {
> >         struct page *newpage = alloc_pages();
> >         set_pinned_page(newpage, &foo_migrate);
> > }
> > 
> > And in compaction.c or somewhere where want to move/reclaim the page,
> > general VM can ask to owner if it founds it's pinned page.
> > 
> > mm/compaction.c
> > 
> >         if (PagePinned(page)) {
> >                 struct pin_page_info *info = get_page_pin_info(page);
> >                 info->migrate(page);
> >                 
> >         }
> > 
> > Only hurdle for that is that we should introduce a new page flag and
> > I believe if we all agree this approch, we can find a solution at last.
> > 
> > What do you think?
> 
> I don't like this approach.  There will be too many collisions in the 
> hash that's been implemented (read: I don't think you can get away with 

Yeb. That's why I'd like to change it with radix tree of pfn as
I mentioned as comment(just used hash for fast prototyping without big
considering).

> a naive implementation for core infrastructure that has to suite all 
> users), you've got a global spin lock, and it doesn't take into account 

I think batching-drain of pinned page would be sufficient for avoiding
global spinlock problem because we have been used it with page-allocator
which is one of most critical hotpath.

> NUMA issues.  The address space migratepage method doesn't have those 

NUMA issues? Could you elaborate it a bit?

> issues (at least where it is usable as in aio's use-case).
> 
> If you're going to go down this path, you'll have to decide if *all* users 
> of pinned pages are going to have to subscribe to supporting the un-pinning 
> of pages, and that means taking a real hard look at how O_DIRECT pins pages.  
> Once you start thinking about that, you'll find that addressing the 
> performance concerns is going to be an essential part of any design work to 
> be done in this area.

True. The patch I included just shows the cocnept so I didn't consider any
performance critical part but if we all agree this arpproch does make sense
and we can implement little overhead, I will step into next phase to enhance
performance.

Thanks for the input, Ben!

> 
> 		-ben
> -- 
> "Thought is the essence of where you are now."
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>