On Tue, Jan 7, 2025 at 3:40 AM Lance Yang <ioworker0@xxxxxxxxx> wrote:
>
> On Mon, Jan 6, 2025 at 5:34 PM Baolin Wang
> <baolin.wang@xxxxxxxxxxxxxxxxx> wrote:
> >
> >
> > On 2025/1/6 17:03, Barry Song wrote:
> > > On Mon, Jan 6, 2025 at 7:40 PM Baolin Wang
> > > <baolin.wang@xxxxxxxxxxxxxxxxx> wrote:
> > >>
> > >>
> > >> On 2025/1/6 11:17, Barry Song wrote:
> > >>> From: Barry Song <v-songbaohua@xxxxxxxx>
> > >>>
> > >>> The refcount may be temporarily or long-term increased, but this does
> > >>> not change the fundamental nature of the folio already being lazy-
> > >>> freed. Therefore, we only reset 'swapbacked' when we are certain the
> > >>> folio is dirty and not droppable.
> > >>>
> > >>> Suggested-by: David Hildenbrand <david@xxxxxxxxxx>
> > >>> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
> > >>
> > >> The changes look good to me. While we are at it, could you also change
> > >> the __discard_anon_folio_pmd_locked() to follow the same strategy for
> > >> lazy-freed PMD-sized folio?
> > >
> > > it seems you mean __discard_anon_folio_pmd_locked() is lacking
> > > folio_set_swapbacked(folio) for dirty pmd-mapped folios?
>
> Good catch!
>
> Hmm... I don't recall why we don't call folio_set_swapbacked for dirty
> THPs in __discard_anon_folio_pmd_locked() - possibly to align with
> previous behavior ;)
>
> If a dirty PMD-mapped THP cannot be discarded, we just split it and
> restart the page walk to process the PTE-mapped THP. After that, we
> will only mark each folio within the THP as swap-backed individually.
>
> It seems like we could cut the work by calling folio_set_swapbacked()
> for dirty THPs directly in __discard_anon_folio_pmd_locked(), skipping
> the restart of the page walk after splitting the THP, IMHO ;)

Yes, the existing code for PMD-mapped THPs seems quite inefficient. It
splits the PMD-mapped THP into smaller folios, marks each split PTE as
dirty, and then iterates over each PTE.
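
To make that flow concrete, here is a minimal userspace sketch in C. It is
a toy model under stated assumptions, not kernel code: toy_pte,
toy_split_pmd(), toy_walk_ptes() and TOY_PMD_NR are hypothetical names
invented for illustration, and the "dirty and not droppable" test is taken
from the wording of the commit message quoted earlier, not from the actual
implementation. How folios map onto PTE slots is deliberately not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are kernel types or kernel APIs. */
#define TOY_PMD_NR 8	/* hypothetical; the real HPAGE_PMD_NR is far larger */

struct toy_pte {
	bool present;
	bool dirty;
};

/*
 * Model of the split step: build one template entry from the PMD state
 * and install it in every slot, so a PTE is dirty only if the PMD was.
 */
static void toy_split_pmd(bool pmd_dirty, struct toy_pte ptes[TOY_PMD_NR])
{
	struct toy_pte entry = { .present = true, .dirty = pmd_dirty };

	for (size_t i = 0; i < TOY_PMD_NR; i++) {
		assert(!ptes[i].present);	/* every slot must start empty */
		ptes[i] = entry;
	}
}

/*
 * Model of the restarted walk: visit each PTE and set the swap-backed
 * flag for slot i only when the entry is dirty and the VMA is not
 * droppable. Returns how many slots were marked.
 */
static size_t toy_walk_ptes(const struct toy_pte ptes[TOY_PMD_NR],
			    bool swapbacked[TOY_PMD_NR],
			    bool vma_droppable)
{
	size_t marked = 0;

	for (size_t i = 0; i < TOY_PMD_NR; i++) {
		if (ptes[i].present && ptes[i].dirty && !vma_droppable) {
			swapbacked[i] = true;
			marked++;
		}
	}
	return marked;
}
```

In this toy, the outcome of all TOY_PMD_NR per-PTE checks is fully
determined by the PMD's dirty state and the VMA flag, which is the
intuition behind making the decision once at the PMD level instead.
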
I'm not sure why it's designed this way. Could there be a specific
reason behind this approach?

However, it does appear to handle folio_set_swapbacked() correctly, as
only a dirty PMD will result in dirty PTEs being generated in
__split_huge_pmd_locked():

	} else {
		pte_t entry;

		entry = mk_pte(page, READ_ONCE(vma->vm_page_prot));
		if (write)
			entry = pte_mkwrite(entry, vma);
		if (!young)
			entry = pte_mkold(entry);
		/* NOTE: this may set soft-dirty too on some archs */
		if (dirty)
			entry = pte_mkdirty(entry);
		if (soft_dirty)
			entry = pte_mksoft_dirty(entry);
		if (uffd_wp)
			entry = pte_mkuffd_wp(entry);

		for (i = 0; i < HPAGE_PMD_NR; i++)
			VM_WARN_ON(!pte_none(ptep_get(pte + i)));

		set_ptes(mm, haddr, pte, entry, HPAGE_PMD_NR);
	}

>
> Thanks,
> Lance
>
> > > and it seems !(vma->vm_flags & VM_DROPPABLE) is also not
> > > handled properly?
> >
> >
> > Right.

Thanks
Barry