On Tue, Sep 20, 2022 at 05:38:30PM +0800, Zhenhua Huang wrote:
> > > > Also this patch doesn't really explain why it should work and honestly
> > > > it doesn't really make much sense to me either.
> > > Sorry, my fault. IMO, The reason it should work is, say for this case of
> > > order 3 allocation: we can perform direct reclaim more times as we have only
> > > order 2 pages(which *lowered* by this change) in free_list(8214*16kB (UEC)).
> > > The order requirement which I have lowered is should_reclaim_retry ->
> > > __zone_watermark_ok:
> > >         for (o = order; o < MAX_ORDER; o++) {
> > >                 struct free_area *area = &z->free_area[o];
> > > ...
> > >                 for (mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
> > >                         if (!free_area_empty(area, mt))
> > >                                 return true;
> > >                 }
> > >
> > > Order 2 pages can be more easily met, hence VM has more chance to return
> > > true from should_reclaim_retry.
> >
> > This is a wrong approach to the problem because there is no real
> > guarantee the reclaim round will do anything useful. You should be
> > really looking at the compaction side of the thing.
>
> Thanks Michal for the advice, I'll look at from compaction side also. But I
> have one further question, IMO reclaim(~2GB LRU pages can be reclaimed)
> should be more feasible compared to compaction(already tried with highest
> prio and failed) in this case? Could you please elaborate more...it seems I
> still not fully understand why it's a wrong approach to check from reclaim
> phase.
>

Because it risks major slowdowns due to excessive reclaim. Early support
used "lumpy reclaim" instead of compaction and resulted in major stalls
when trying to allocate THP, resulting in THP often being disabled. The
success rates were great but systems could become unusable for several
minutes and ultimately this resulted in compaction and the current backoff
logic of reclaim. Your scenario is similar: you want to aggressively try
to shrink slabs in case an order-3 block of pages gets freed.
It might succeed but the system grinds to a halt with excessive re-reading
of information from the disk for other use cases.

Your focus likely should be on reclaim and compaction aborting prematurely
because free CMA pages are available at the correct order but the calling
context cannot use CMA pages.

It's strange to hear of a driver that has a strict need for order-3 pages
being available at all times due to a lack of an IOMMU because that is
going to be fragile. One point of CMA was to carve out a region for such
drivers so they could get the contiguous regions they needed. I believe
phone cameras were an early example. If your driver has strict requirements
for high-order page availability then CMA probably should be configured
and the driver should use CMA.

-- 
Mel Gorman
SUSE Labs