On Fri, Sep 03, 2010 at 07:29:43PM +0900, KAMEZAWA Hiroyuki wrote:
> On Thu, 2 Sep 2010 17:54:24 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
>
> > Here is a rough code for this.
>
> here is a _tested_ one.
> If I tested correctly, I allocated 40MB of contiguous pages with the new function.
> I'm glad this can be some hint for people.

Great! I didn't look into the details, but the concept seems good.
If someone doesn't need anything more intelligent (e.g. shared, private,
[first|best] fit, buddy), this is enough. So I think this will be useful
regardless of CMA.

I will look into this in more detail and think about ideas to improve it.

Thanks, Kame. :)

>
> Thanks,
> -Kame
> ==
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
>
> This patch adds a memory allocator for contiguous memory larger than MAX_ORDER.
>
> 	alloc_contig_pages(hint, size, list);
>
> This function allocates 'size' of contiguous pages, whose physical address
> is higher than 'hint'. size is specified in bytes.

size is in bytes, but hint is a pfn?

> Allocated pages are all linked into the list and all of their page_count()
> are set to 1. Return value is the top page.
>
> free_contig_pages(list)
> 	returns all pages in the list.
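
Just to illustrate the interface as described above, a minimal usage sketch
(the exact prototype of free_contig_pages() is not visible in this excerpt,
so the call below is an assumption; example_alloc_40mb() is a made-up caller
and the 40MB size only mirrors the test mentioned at the top of the mail):

#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/mm.h>

static int example_alloc_40mb(void)
{
	LIST_HEAD(pages);
	struct page *first;

	/* 40MB of physically contiguous memory, no placement constraint (hint = 0) */
	first = alloc_contig_pages(0, 40 << 20, &pages);
	if (!first)
		return -ENOMEM;

	pr_info("contiguous range starts at pfn %lu\n", page_to_pfn(first));

	/*
	 * The allocation is rounded up to MAX_ORDER alignment, so the caller
	 * may get (and must eventually give back) more than it asked for.
	 */
	free_contig_pages(&pages);
	return 0;
}
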
>
> This patch does
> 	- find an area which can be ISOLATED.
> 	- migrate remaining pages in the area.

Migrate from there to where?

> 	- steal the chunk of pages from the allocator.
>
> Limitations are:
> 	- returned pages will be aligned to MAX_ORDER.
> 	- the returned length of pages will be aligned to MAX_ORDER.
> 	  (so, the caller may have to return the tail pages by itself.)

What do you mean by "tail"?

> 	- may allocate contiguous pages which overlap nodes/zones.

Hmm.. Do we really need this?

>
> This is fully experimental and written as an example.
> (Maybe more patches are needed to make this complete.)

Yes. But my first impression of this patch is good.

>
> This patch moves some amount of code from memory_hotplug.c to
> page_isolation.c and is based on the page-offline technique used by
> memory_hotplug.c.
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> ---
>  include/linux/page-isolation.h |   10 +
>  mm/memory_hotplug.c            |   84 --------------
>  mm/page_alloc.c                |   32 +++++
>  mm/page_isolation.c            |  244 +++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 287 insertions(+), 83 deletions(-)
>
> Index: mmotm-0827/mm/page_isolation.c
> ===================================================================
> --- mmotm-0827.orig/mm/page_isolation.c
> +++ mmotm-0827/mm/page_isolation.c
> @@ -3,8 +3,11 @@
>   */
>
> #include <linux/mm.h>
> +#include <linux/swap.h>
> #include <linux/page-isolation.h>
> #include <linux/pageblock-flags.h>
> +#include <linux/mm_inline.h>
> +#include <linux/migrate.h>
> #include "internal.h"
>
> static inline struct page *
> @@ -140,3 +143,244 @@ int test_pages_isolated(unsigned long st
> 	spin_unlock_irqrestore(&zone->lock, flags);
> 	return ret ? 0 : -EBUSY;
> }
> +
> +#define CONTIG_ALLOC_MIGRATION_RETRY	(5)
> +
> +/*
> + * Scanning pfns is much easier than scanning the LRU list.
> + * Scan pfns from start to end and find an LRU page.
> + */
> +unsigned long scan_lru_pages(unsigned long start, unsigned long end)
> +{
> +	unsigned long pfn;
> +	struct page *page;
> +
> +	for (pfn = start; pfn < end; pfn++) {
> +		if (pfn_valid(pfn)) {
> +			page = pfn_to_page(pfn);
> +			if (PageLRU(page))
> +				return pfn;
> +		}
> +	}
> +	return 0;
> +}
> +
> +/* Migrate all LRU pages in the range to somewhere else */
> +static struct page *
> +hotremove_migrate_alloc(struct page *page, unsigned long private, int **x)
> +{
> +	/* This should be improooooved!! */

Yeb.

> +	return alloc_page(GFP_HIGHUSER_MOVABLE);
> +}

<snip>

> +struct page *alloc_contig_pages(unsigned long long hint,
> +			unsigned long size, struct list_head *list)
> +{
> +	unsigned long base, found, end, pages, start;
> +	struct page *ret = NULL;
> +	int nid, retry;
> +
> +	if (hint)
> +		hint = ALIGN(hint, MAX_ORDER_NR_PAGES);
> +	/* request size should be aligned to pageblock */
> +	size >>= PAGE_SHIFT;
> +	pages = ALIGN(size, MAX_ORDER_NR_PAGES);
> +	found = 0;
> +retry:
> +	for_each_node_state(nid, N_HIGH_MEMORY) {
> +		unsigned long node_end;
> +		pg_data_t *node = NODE_DATA(nid);
> +
> +		node_end = node->node_start_pfn + node->node_spanned_pages;
> +		/* does this node have a proper range of memory ? */
> +		if (node_end < hint + pages)
> +			continue;
> +		base = hint;
> +		if (base < node->node_start_pfn)
> +			base = node->node_start_pfn;
> +
> +		base = ALIGN(base, MAX_ORDER_NR_PAGES);
> +		found = 0;
> +		end = node_end & ~(MAX_ORDER_NR_PAGES - 1);
> +		/* Maybe we can use this Node */
> +		if (base + pages < end)
> +			found = __find_contig_block(base, end, pages);
> +		if (found) /* Found ? */
> +			break;
> +		base = hint;
> +	}
> +	if (!found)
> +		goto out;
> +	/*
> +	 * Ok, here, we have a contiguous pageblock marked as "isolated".
> +	 * Try migration.
> +	 */
> +	retry = CONTIG_ALLOC_MIGRATION_RETRY;
> +	end = found + pages;

Hmm.. I can't understand the loop below. Maybe it needs refactoring
(a possible restructuring is sketched at the end of this mail).

> +	for (start = scan_lru_pages(found, end); start < end;) {
> +
> +		if (do_migrate_range(found, end)) {
> +			/* migration failure ... */
> +			if (retry-- < 0)
> +				break;
> +			/* take a rest and synchronize LRU etc. */
> +			lru_add_drain_all();
> +			flush_scheduled_work();
> +			cond_resched();
> +			drain_all_pages();
> +		}
> +		start = scan_lru_pages(start, end);
> +		if (!start)
> +			break;
> +	}

<snip>

> +void alloc_contig_freed_pages(unsigned long pfn, unsigned long end,
> +			struct list_head *list)
> +{
> +	struct page *page;
> +	struct zone *zone;
> +	int i, order;
> +
> +	zone = page_zone(pfn_to_page(pfn));
> +	spin_lock_irq(&zone->lock);
> +	while (pfn < end) {
> +		VM_BUG_ON(!pfn_valid(pfn));
> +		page = pfn_to_page(pfn);
> +		VM_BUG_ON(page_count(page));
> +		VM_BUG_ON(!PageBuddy(page));
> +		list_del(&page->lru);
> +		order = page_order(page);
> +		zone->free_area[order].nr_free--;
> +		rmv_page_order(page);
> +		__mod_zone_page_state(zone, NR_FREE_PAGES, -(1UL << order));
> +		for (i = 0; i < (1 << order); i++) {
> +			struct page *x = page + i;
> +			list_add(&x->lru, list);
> +		}
> +		page += 1 << order;

Should this be pfn? page is recomputed from pfn at the top of each
iteration, and the loop condition tests pfn.

> +	}
> +	spin_unlock_irq(&zone->lock);
> +
> +	/* After this, pages on the list can be freed one by one */
> +	list_for_each_entry(page, list, lru)
> +		prep_new_page(page, 0, 0);
> +}
>
> #ifdef CONFIG_MEMORY_HOTREMOVE
> /*
>  * All pages in the range must be isolated before calling this.

--
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
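
Regarding the migration loop questioned above, here is a minimal sketch of
one possible restructuring. It only illustrates the intent (retry migration
of the isolated range until no LRU page remains, counting only failed
attempts against the retry budget); it is not a replacement patch. It
assumes the helpers introduced by the patch (scan_lru_pages(),
do_migrate_range(), CONTIG_ALLOC_MIGRATION_RETRY), and the function name
migrate_range_with_retry() is made up for illustration:

static int migrate_range_with_retry(unsigned long start_pfn,
				    unsigned long end_pfn)
{
	int retry = CONTIG_ALLOC_MIGRATION_RETRY;

	/* Keep going while some LRU page remains inside the range. */
	while (scan_lru_pages(start_pfn, end_pfn)) {
		if (!do_migrate_range(start_pfn, end_pfn))
			continue;	/* made progress, rescan the range */

		/* Migration failed: let the system settle, then retry. */
		if (retry-- <= 0)
			return -EBUSY;
		lru_add_drain_all();
		flush_scheduled_work();
		cond_resched();
		drain_all_pages();
	}
	return 0;
}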