Re: [PATCH v1 9/9] mm/memory: optimize unmap/zap with PTE-mapped THP

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 31.01.24 03:30, Yin Fengwei wrote:


On 1/29/24 22:32, David Hildenbrand wrote:
+static inline pte_t get_and_clear_full_ptes(struct mm_struct *mm,
+		unsigned long addr, pte_t *ptep, unsigned int nr, int full)
+{
+	pte_t pte, tmp_pte;
+
+	pte = ptep_get_and_clear_full(mm, addr, ptep, full);
+	while (--nr) {
+		ptep++;
+		addr += PAGE_SIZE;
+		tmp_pte = ptep_get_and_clear_full(mm, addr, ptep, full);
+		if (pte_dirty(tmp_pte))
+			pte = pte_mkdirty(pte);
+		if (pte_young(tmp_pte))
+			pte = pte_mkyoung(pte);
I am wondering whether it's worthy to move the pte_mkdirty() and pte_mkyoung()
out of the loop and just do it one time if needed. The worst case is that they
are called nr - 1 time. Or it's just too micro?

I also thought about just indicating "any_accessed" or "any_dirty" using flags to the caller, to avoid the PTE modifications completely. Felt a bit micro-optimized.

Regarding your proposal: I thought about that as well, but my assumption was that dirty+young are "cheap" to be set.

On x86, pte_mkyoung() is setting _PAGE_ACCESSED.
pte_mkdirty() is setting _PAGE_DIRTY | _PAGE_SOFT_DIRTY, but it also has to handle the saveddirty handling, using some bit trickery.

So at least for pte_mkyoung() there would be no real benefit as far as I can see (might be even worse). For pte_mkdirty() there might be a small benefit.

Is it going to be measurable? Likely not.

Am I missing something?

Thanks!

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux