On 04/01/2015 12:11 AM, Sasha Levin wrote:
> Freeing pages has become a rather costly operation, especially when
> multiple debug options are enabled. This causes hangs when an attempt
> is made to free a large number of 0-order pages. Two examples are
> vfree()ing a large block of memory, and punching a hole in a shmem
> filesystem. To avoid that, move any free operations that involve
> batching pages into a list to a workqueue handler, where they can be
> freed later.
Is there a risk of creating a situation where memory appears to be missing because the work item hasn't been processed yet, leading to allocation failures, needless reclaim, spurious OOMs, etc.? If so, such situations should probably wait for the work to complete first?
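Something like the below could then be called from the allocation slow path before concluding that memory is really short. Untested sketch; drain_deferred_free_pages() is a name I'm inventing here, and it relies on the free_page_work item from the patch:

/*
 * Untested sketch: push any batched-up pages back into the page
 * allocator before treating an allocation failure as real.
 * free_page_work is the work item introduced by the patch.
 */
static void drain_deferred_free_pages(void)
{
	/* Cheap fast path; flush_work() below does the real waiting. */
	if (work_pending(&free_page_work))
		flush_work(&free_page_work);
}

flush_work() can sleep, though, so this would only help the sleepable parts of the slow path; atomic allocations would still see the pages as missing.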
And maybe it shouldn't be used everywhere (as patch 2/2 does) but only where it makes sense. Process exits, maybe?
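Roughly along these lines, perhaps. Untested sketch again; the _deferred name is made up, and it reuses the lists, lock, and work item from the patch below:

/* Keep the common entry point synchronous, as it is today. */
void free_hot_cold_page_list(struct list_head *list, bool cold)
{
	__free_hot_cold_page_list(list, cold);
}

/*
 * Explicitly deferred variant for the few callers that batch up huge
 * lists (process exit, truncate?). Everyone else keeps freeing
 * synchronously.
 */
void free_hot_cold_page_list_deferred(struct list_head *list, bool cold)
{
	unsigned long flags;

	if (unlikely(!keventd_up())) {
		__free_hot_cold_page_list(list, cold);
		return;
	}

	spin_lock_irqsave(&free_page_lock, flags);
	/* The _init variant leaves the caller's list head reinitialized. */
	if (cold)
		list_splice_tail_init(list, &free_cold_page_list);
	else
		list_splice_tail_init(list, &free_hot_page_list);
	spin_unlock_irqrestore(&free_page_lock, flags);

	schedule_work(&free_page_work);
}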
> Signed-off-by: Sasha Levin <sasha.levin@xxxxxxxxxx>
> ---
>  mm/page_alloc.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 46 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5bd9711..812ca75 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1586,10 +1586,11 @@ out:
>  	local_irq_restore(flags);
>  }
> 
> -/*
> - * Free a list of 0-order pages
> - */
> -void free_hot_cold_page_list(struct list_head *list, bool cold)
> +static LIST_HEAD(free_hot_page_list);
> +static LIST_HEAD(free_cold_page_list);
> +static DEFINE_SPINLOCK(free_page_lock);
> +
> +static void __free_hot_cold_page_list(struct list_head *list, bool cold)
>  {
>  	struct page *page, *next;
> 
> @@ -1599,6 +1600,47 @@ void free_hot_cold_page_list(struct list_head *list, bool cold)
>  	}
>  }
> 
> +static void free_page_lists_work(struct work_struct *work)
> +{
> +	LIST_HEAD(hot_pages);
> +	LIST_HEAD(cold_pages);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&free_page_lock, flags);
> +	list_cut_position(&hot_pages, &free_hot_page_list,
> +			free_hot_page_list.prev);
> +	list_cut_position(&cold_pages, &free_cold_page_list,
> +			free_cold_page_list.prev);
> +	spin_unlock_irqrestore(&free_page_lock, flags);
> +
> +	__free_hot_cold_page_list(&hot_pages, false);
> +	__free_hot_cold_page_list(&cold_pages, true);
> +}
> +
> +static DECLARE_WORK(free_page_work, free_page_lists_work);
> +
> +/*
> + * Free a list of 0-order pages
> + */
> +void free_hot_cold_page_list(struct list_head *list, bool cold)
> +{
> +	unsigned long flags;
> +
> +	if (unlikely(!keventd_up())) {
> +		__free_hot_cold_page_list(list, cold);
> +		return;
> +	}
> +
> +	spin_lock_irqsave(&free_page_lock, flags);
> +	if (cold)
> +		list_splice_tail(list, &free_cold_page_list);
> +	else
> +		list_splice_tail(list, &free_hot_page_list);
> +	spin_unlock_irqrestore(&free_page_lock, flags);
> +
> +	schedule_work(&free_page_work);
> +}
> +
>  /*
>   * split_page takes a non-compound higher-order page, and splits it into
>   * n (1<<order) sub-pages: page[0..n]