On 10/13/2016 10:08 AM, js1304@xxxxxxxxx wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
>
> Currently, a freed page can stay longer in the buddy list if the
> next-higher-order page is already in the buddy list, in order to help
> coalescing. However, this doesn't work for the simplest sequential
> free case. For example, consider 8 consecutive pages freed in
> sequential order:
>
> page 0: attached at the head of the order-0 list
> page 1: merged with page 0, attached at the head of the order-1 list
> page 2: attached at the tail of the order-0 list
> page 3: merged with page 2 and then with page 0, attached at the head
>         of the order-2 list
> page 4: attached at the head of the order-0 list
> page 5: merged with page 4, attached at the tail of the order-1 list
> page 6: attached at the tail of the order-0 list
> page 7: merged with page 6, then with page 4, and lastly with page 0,
>         giving an order-3 free page
>
> Excluding page 0, there are three cases in this example where the
> freed page is attached at the head of the buddy list. If a matching
> allocation request of the right order arrives at that moment, the page
> that would have become part of a higher-order page is handed out
> instead, and we fail to form the order-3 free page. Allocation usually
> happens in sequential order, and so does freeing.
Are you sure this is true except right after the system is freshly booted? Once memory becomes fragmented, a stream of order-0 allocations will likely grab pages randomly from all over the place, and it's unlikely to recover except for small orders.
> So it would be important to detect such a situation and give the page
> some chance to be coalesced. I think a simple and effective heuristic
> for this case is to attach the freed page at the tail of the buddy
> list unconditionally. If the freed page isn't merged within one
> rotation through the list, it is genuine fragmentation and we don't
> need to care about it for coalescing purposes.
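For reference, the head/tail decisions in the quoted trace can be reproduced with a minimal user-space sketch of the current placement logic (my own illustration, not kernel code: the free lists are modeled as plain boolean arrays over a 16-page toy zone, and whether a block goes to the head or tail is only printed, not stored):

/*
 * buddy_sim.c - user-space sketch (not kernel code) of the placement
 * heuristic currently in __free_one_page(), replaying the 8-page
 * sequential free from the changelog.
 */
#include <stdbool.h>
#include <stdio.h>

#define MAX_ORDER 11    /* kernel default */
#define NPAGES    16    /* toy zone size, arbitrary */

/* free_block[o][i]: a free block of 2^o pages starts at page i */
static bool free_block[MAX_ORDER][NPAGES];

static void free_one_page(unsigned long idx)
{
        unsigned long orig = idx;
        unsigned int order = 0;
        bool tail = false;

        /* Merge with the buddy as long as the buddy is already free. */
        while (order < MAX_ORDER - 1) {
                unsigned long buddy = idx ^ (1UL << order);

                if (buddy >= NPAGES || !free_block[order][buddy])
                        break;
                free_block[order][buddy] = false;  /* take buddy off its list */
                idx &= buddy;                      /* combined block index */
                order++;
        }

        /*
         * Current heuristic: if the buddy of the next-higher order is
         * already free, queue this block at the tail so it survives
         * long enough to be merged into that higher-order block.
         */
        if (order < MAX_ORDER - 2) {
                unsigned long buddy = idx ^ (1UL << order);
                unsigned long combined = idx & buddy;
                unsigned long higher_buddy = combined ^ (1UL << (order + 1));

                if (higher_buddy < NPAGES && free_block[order + 1][higher_buddy])
                        tail = true;
        }

        free_block[order][idx] = true;
        printf("page %lu: block %lu added at %s of order-%u list\n",
               orig, idx, tail ? "tail" : "head", order);
}

int main(void)
{
        for (unsigned long pfn = 0; pfn < 8; pfn++)
                free_one_page(pfn);
        return 0;
}

Running this prints the same eight placements as the changelog. With the patch applied, every insertion becomes a tail insertion, so a racing order-0 allocation, which takes from the head of the free list, would pick up an older page first instead of breaking up one of these about-to-merge blocks.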
I'm not against removing this heuristic, but not without some benchmarks. The disadvantage of putting freed pages at the tail of the list is that they become cache-cold before they are allocated again. We should check how large that problem is.
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> ---
>  mm/page_alloc.c | 25 ++-----------------------
>  1 file changed, 2 insertions(+), 23 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1790391..c4f7d05 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -858,29 +858,8 @@ static inline void __free_one_page(struct page *page,
>  done_merging:
>  	set_page_order(page, order);
>  
> -	/*
> -	 * If this is not the largest possible page, check if the buddy
> -	 * of the next-highest order is free. If it is, it's possible
> -	 * that pages are being freed that will coalesce soon. In case,
> -	 * that is happening, add the free page to the tail of the list
> -	 * so it's less likely to be used soon and more likely to be merged
> -	 * as a higher order page
> -	 */
> -	if ((order < MAX_ORDER-2) && pfn_valid_within(page_to_pfn(buddy))) {
> -		struct page *higher_page, *higher_buddy;
> -		combined_idx = buddy_idx & page_idx;
> -		higher_page = page + (combined_idx - page_idx);
> -		buddy_idx = __find_buddy_index(combined_idx, order + 1);
> -		higher_buddy = higher_page + (buddy_idx - combined_idx);
> -		if (page_is_buddy(higher_page, higher_buddy, order + 1)) {
> -			list_add_tail(&page->lru,
> -				&zone->free_area[order].free_list[migratetype]);
> -			goto out;
> -		}
> -	}
> -
> -	list_add(&page->lru, &zone->free_area[order].free_list[migratetype]);
> -out:
> +	list_add_tail(&page->lru,
> +			&zone->free_area[order].free_list[migratetype]);
>  	zone->free_area[order].nr_free++;
>  }
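As an aside, for anyone skimming the removed hunk: the index arithmetic it relies on boils down to an XOR of the order bit. Here is a small stand-alone illustration (my own sketch, not kernel code; find_buddy_index() just mirrors what __find_buddy_index() computes), using page 6 from the quoted trace:

/*
 * buddy_index.c - stand-alone illustration (not kernel code) of the
 * index arithmetic used by the removed hunk. The buddy of a block
 * differs only in bit 'order'; the merged block starts at the lower
 * of the two indices.
 */
#include <stdio.h>

static unsigned long find_buddy_index(unsigned long idx, unsigned int order)
{
        return idx ^ (1UL << order);    /* same XOR as __find_buddy_index() */
}

int main(void)
{
        unsigned long page_idx = 6;     /* page 6 from the changelog example */
        unsigned int order = 0;

        unsigned long buddy_idx = find_buddy_index(page_idx, order);            /* 7 */
        unsigned long combined_idx = buddy_idx & page_idx;                      /* 6 */
        unsigned long higher_buddy = find_buddy_index(combined_idx, order + 1); /* 4 */

        printf("buddy %lu, combined %lu, order-%u buddy of combined block %lu\n",
               buddy_idx, combined_idx, order + 1, higher_buddy);
        return 0;
}

The order-1 block starting at page 4 is already free at that point in the trace, which is why the current code queues page 6 at the tail; the patch simply makes tail placement the behaviour for every free.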