On 12/05/2017 01:50 PM, zhong jiang wrote:
>>> yes, limited by my knowledge and english. Maybe Vlastimil can address it in detail.
>> Hi, on a fresh look, I believe this patch doesn't improve anything in
>> practice. It potentially makes init_pages_in_zone() catch more early
>> allocations, if a hole happens to be placed in the beginning of
>> MAX_ORDER block, and the following pageblock within the block was early
>> allocated.
> Hi, Vlastimil
>
> I have a stupid question about holes
>
> because a hole is possible to have within a MAX_ORDER_NR_PAGES, it indeed
> exist in first pfn. it that is true, why we must skip the whole MAX_ORDER block?
> Any limit ? I can not find the answer.

It's not that we "must skip". If I understand it correctly, on kernels
without CONFIG_HOLES_IN_ZONE, we can skip a MAX_ORDER block if *any* pfn
(including the first pfn) is invalid, because we know that the whole block
is then invalid. With CONFIG_HOLES_IN_ZONE, there is no such guarantee. So
if we see that the first pfn is valid, we continue with the block, but use
pfn_valid_within() (which is defined as pfn_valid() with
CONFIG_HOLES_IN_ZONE and hardcoded to "true" otherwise) to validate each
pfn. This is slow, but that is the price the arches pay for
CONFIG_HOLES_IN_ZONE.

If we see that the first pfn is invalid, it is safe to skip the MAX_ORDER
block when CONFIG_HOLES_IN_ZONE=n, and we know we won't miss anything. With
CONFIG_HOLES_IN_ZONE=y we might miss something, so to be sure that we
don't, we would have to validate each pfn. The potential price there is
probably worse, because we might be validating arbitrarily large holes that
are not limited by the physical amount of RAM. So e.g. compaction doesn't
pay this price, and MAX_ORDER blocks that have a hole at the beginning and
end (with valid pages in the middle) are skipped. page_owner, on the other
hand, is a debugging feature that is not normally enabled and has
significant overhead, so paying the price there might not be an issue.
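
To make the skipping behaviour described above concrete, here is a toy
userspace model, not the kernel code. The names toy_pfn_valid(),
toy_count_visited() and TOY_BLOCK_NR_PAGES are made up for illustration, the
validity map stands in for pfn_valid(), and the holes_in_zone flag stands in
for the compile-time CONFIG_HOLES_IN_ZONE option:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative block size only; it stands in for MAX_ORDER_NR_PAGES. */
#define TOY_BLOCK_NR_PAGES 8UL

/* Hypothetical stand-in for pfn_valid(): look the pfn up in a toy map. */
static bool toy_pfn_valid(const bool *valid_map, unsigned long pfn)
{
	return valid_map[pfn];
}

/*
 * Count the pfns a scanner would actually look at in [start_pfn, end_pfn).
 * holes_in_zone models CONFIG_HOLES_IN_ZONE as a runtime flag so that both
 * behaviours can be exercised from the same binary.
 */
static unsigned long toy_count_visited(const bool *valid_map,
				       unsigned long start_pfn,
				       unsigned long end_pfn,
				       bool holes_in_zone)
{
	unsigned long visited = 0;
	unsigned long pfn = start_pfn;

	assert(start_pfn <= end_pfn);

	while (pfn < end_pfn) {
		/* End of the block that contains the current pfn. */
		unsigned long block_end =
			(pfn / TOY_BLOCK_NR_PAGES + 1) * TOY_BLOCK_NR_PAGES;

		if (block_end > end_pfn)
			block_end = end_pfn;

		/* Invalid pfn at this point: skip the rest of the block. */
		if (!toy_pfn_valid(valid_map, pfn)) {
			pfn = block_end;
			continue;
		}

		for (; pfn < block_end; pfn++) {
			/* The per-pfn check only exists with holes_in_zone. */
			if (holes_in_zone && !toy_pfn_valid(valid_map, pfn))
				continue;
			visited++;
		}
	}

	return visited;
}
```

With a map whose first block starts with a hole but has valid pages after
it, the model skips those valid pages entirely, which is the "we might miss
something" case above.
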
But it means rewriting both init_pages_in_zone() and read_page_owner() to
not skip the MAX_ORDER block (nor pageblock_order) when
CONFIG_HOLES_IN_ZONE=y. I don't think there's a simple wrapper similar to
pfn_valid_within() for that, but one could be created (input: current pfn;
output: start pfn of the next MAX_ORDER block if CONFIG_HOLES_IN_ZONE=n,
pfn+1 if CONFIG_HOLES_IN_ZONE=y).

> Thanks
> zhongjiang
>> However, read_page_owner() skips whole MAX_ORDER block as well in this
>> situation, so we won't be able to read the info anyway...
>>
>> Also the problem is not as simple as documenting MAX_ORDER_NR_PAGES vs
>> pageblock_nr_pages. We discussed it year ago when this patch was first
>> posted, how skipping over holes would have to be made more robust, and
>> how architectures should define hole granularity to avoid checking each
>> individual pfn in what appears to be a hole, to see if the hole has ended.
>>
>>> Thanks
>>> zhongjiang
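
The wrapper suggested in the reply (current pfn in, next pfn to look at
out) might look roughly like the sketch below. This is only an illustration
under assumptions: the name sketch_next_pfn_after_invalid() and the
SKETCH_* macros are hypothetical, the block size is an example value, and
the config option is modelled as a runtime flag, whereas a real helper would
presumably be selected at compile time with #ifdef CONFIG_HOLES_IN_ZONE.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Example value only; must be a power of two for SKETCH_ALIGN() to work. */
#define SKETCH_MAX_ORDER_NR_PAGES 1024UL

/* Round x up to the next multiple of a, where a is a power of two. */
#define SKETCH_ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/*
 * Given an invalid pfn, return the next pfn worth checking.
 *
 * holes_in_zone == false: the whole MAX_ORDER block is known to be invalid,
 * so jump to the start of the next block.
 * holes_in_zone == true: no such guarantee, so advance by a single pfn.
 */
static unsigned long sketch_next_pfn_after_invalid(unsigned long pfn,
						   bool holes_in_zone)
{
	/* Guard against wrap-around in the arithmetic below. */
	assert(pfn < ULONG_MAX - SKETCH_MAX_ORDER_NR_PAGES);

	if (holes_in_zone)
		return pfn + 1;

	return SKETCH_ALIGN(pfn + 1, SKETCH_MAX_ORDER_NR_PAGES);
}
```

A scanner in the style of init_pages_in_zone() or read_page_owner() would
then call this in place of its unconditional jump to the next MAX_ORDER
block whenever it hits an invalid pfn.
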