Re: [PATCH v14 07/14] mm: multi-gen LRU: exploit locality in rmap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Aug 15, 2022, at 12:13 AM, Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
> 
> Searching the rmap for PTEs mapping each page on an LRU list (to test
> and clear the accessed bit) can be expensive because pages from
> different VMAs (PA space) are not cache friendly to the rmap (VA
> space). For workloads mostly using mapped pages, searching the rmap
> can incur the highest CPU cost in the reclaim path.

Impressive work. Sorry if my feedback is not timely.

Just one minor point for thought, that can be left for a later cleanup.

> 
> +	for (i = 0, addr = start; addr != end; i++, addr += PAGE_SIZE) {
> +		unsigned long pfn;
> +
> +		pfn = get_pte_pfn(pte[i], pvmw->vma, addr);
> +		if (pfn == -1)
> +			continue;
> +
> +		if (!pte_young(pte[i]))
> +			continue;
> +
> +		folio = get_pfn_folio(pfn, memcg, pgdat);
> +		if (!folio)
> +			continue;
> +
> +		if (!ptep_test_and_clear_young(pvmw->vma, addr, pte + i))
> +			continue;
> +

You have already checked that the PTE is old (not young), so this check
seems redundant. I do not see a way in which the access-bit can be cleared
since you hold the ptl. IOW, there is no need for the “if" and “continue".

Makes me also wonder whether having a separate ptep_clear_young() can
slightly help, since anyhow the access-bit is more of an estimation,
and having a separate ptep_clear_young() can enable optimizations.

On x86, for instance, if the PTE is dirty, we may be able to clear the
access-bit without an atomic operation, which should be faster.






[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux