On Tue, Jan 7, 2025 at 9:52 AM Barry Song <21cnbao@xxxxxxxxx> wrote:
>
> On Tue, Jan 7, 2025 at 3:40 AM Lance Yang <ioworker0@xxxxxxxxx> wrote:
> >
> > On Mon, Jan 6, 2025 at 5:34 PM Baolin Wang
> > <baolin.wang@xxxxxxxxxxxxxxxxx> wrote:
> > >
> > >
> > > On 2025/1/6 17:03, Barry Song wrote:
> > > > On Mon, Jan 6, 2025 at 7:40 PM Baolin Wang
> > > > <baolin.wang@xxxxxxxxxxxxxxxxx> wrote:
> > > >>
> > > >>
> > > >> On 2025/1/6 11:17, Barry Song wrote:
> > > >>> From: Barry Song <v-songbaohua@xxxxxxxx>
> > > >>>
> > > >>> The refcount may be temporarily or long-term increased, but this does
> > > >>> not change the fundamental nature of the folio already being lazy-
> > > >>> freed. Therefore, we only reset 'swapbacked' when we are certain the
> > > >>> folio is dirty and not droppable.
> > > >>>
> > > >>> Suggested-by: David Hildenbrand <david@xxxxxxxxxx>
> > > >>> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
> > > >>
> > > >> The changes look good to me. While we are at it, could you also change
> > > >> the __discard_anon_folio_pmd_locked() to follow the same strategy for
> > > >> lazy-freed PMD-sized folio?
> > > >
> > > > it seems you mean __discard_anon_folio_pmd_locked() is lacking
> > > > folio_set_swapbacked(folio) for dirty pmd-mapped folios?
> >
> > Good catch!
> >
> > Hmm... I don't recall why we don't call folio_set_swapbacked for dirty
> > THPs in __discard_anon_folio_pmd_locked() - possibly to align with
> > previous behavior ;)
> >
> > If a dirty PMD-mapped THP cannot be discarded, we just split it and
> > restart the page walk to process the PTE-mapped THP. After that, we
> > will only mark each folio within the THP as swap-backed individually.
> >
> > It seems like we could cut the work by calling folio_set_swapbacked()
> > for dirty THPs directly in __discard_anon_folio_pmd_locked(), skipping
> > the restart of the page walk after splitting the THP, IMHO ;)
>
> Yes, the existing code for PMD-mapped THPs seems quite inefficient. It splits
> the PMD-mapped THP into smaller folios, marks each split PTE as dirty, and

Apologies for the typo, I meant splitting a PMD-mapped THP into a PTE-mapped THP.

> then iterates over each PTE. I’m not sure why it’s designed this way - could
> there be a specific reason behind this approach?
>
> However, it does appear to handle folio_set_swapbacked() correctly, as only
> a dirty PMD will result in dirty PTEs being generated in
> __split_huge_pmd_locked():
>
>         } else {
>                 pte_t entry;
>
>                 entry = mk_pte(page, READ_ONCE(vma->vm_page_prot));
>                 if (write)
>                         entry = pte_mkwrite(entry, vma);
>
>                 if (!young)
>                         entry = pte_mkold(entry);
>
>                 /* NOTE: this may set soft-dirty too on some archs */
>                 if (dirty)
>                         entry = pte_mkdirty(entry);
>
>                 if (soft_dirty)
>                         entry = pte_mksoft_dirty(entry);
>
>                 if (uffd_wp)
>                         entry = pte_mkuffd_wp(entry);
>
>                 for (i = 0; i < HPAGE_PMD_NR; i++)
>                         VM_WARN_ON(!pte_none(ptep_get(pte + i)));
>
>                 set_ptes(mm, haddr, pte, entry, HPAGE_PMD_NR);
>         }
>
> >
> > Thanks,
> > Lance
> >
> > > >
> > > > and it seems !(vma->vm_flags & VM_DROPPABLE) is also not
> > > > handled properly?
> > > >
> > >
> > > Right.
>
> Thanks
> Barry
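For reference, the direction Lance suggests might look roughly like the
sketch below. This is untested, and the body of
__discard_anon_folio_pmd_locked() here is paraphrased rather than copied
from the tree - only the dirty-handling branch is the point. It also folds
in the !(vma->vm_flags & VM_DROPPABLE) check mentioned above.

static bool __discard_anon_folio_pmd_locked(struct vm_area_struct *vma,
                                            unsigned long addr, pmd_t *pmdp,
                                            struct folio *folio)
{
        pmd_t orig_pmd = *pmdp;

        if (folio_test_dirty(folio) || pmd_dirty(orig_pmd)) {
                /*
                 * The lazyfree folio turned out to be dirty, so it cannot
                 * be discarded. Unless the VMA is VM_DROPPABLE, restore its
                 * swap-backed state right here, instead of splitting the
                 * PMD and marking the folio swap-backed again from the
                 * restarted, PTE-level page walk.
                 */
                if (!(vma->vm_flags & VM_DROPPABLE))
                        folio_set_swapbacked(folio);
                return false;
        }

        /* ... the rest of the discard path stays as it is today ... */
        return true;
}

Whether the page-walk restart after a failed discard can then be dropped
as well would still need to be checked against the caller in
try_to_unmap_one(), so please take the above as a starting point only.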