On Mon, Jan 20, 2025 at 05:39:43PM +0100, Alexander Gordeev wrote:
> Despite the commit 75edd345e8ed claim above I still observed that a read
> access to shmem page could end up in creation of a dirty PTE:
>
> handle_pte_fault() -> do_pte_missing() -> do_fault() ->
>   do_read_fault() -> finish_fault() -> set_pte_range() -> mk_pte()
>
> Our internal discussion (among other things) ended up in a need to really
> understand where the faults are coming from.
>
> Further, a cursory LTP test was showing ~18K page faults increase, which
> I did not confirm. That is the first thing I will re-do.
>
> Whether this change is a pre-requisite for something or what is your aim
> wrt to this patch?

One of the things that needs to happen for the splitting of struct page
and struct folio is the removal of all '&folio->page'.  Most are in
compatibility code or code that is being transitioned and can safely be
ignored.  One of the places that needs to be changed is:

	entry = mk_pte(&folio->page, vma->vm_page_prot);

(there are a few places like this.)  I believe we need a folio_mk_pte(),
and I don't want to define it for each architecture.  I don't even want
to have a default implementation and allow arches to override.

So the question becomes whether to:

(a) Get rid of the conditional pte_mkdirty() entirely (this trial balloon)
(b) Put it in folio_mk_pte() for everybody, not just s390
(c) Put it in set_pte_range() as David suggested.

It's feeling like (c) is the best idea.
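
For concreteness, here is a minimal sketch of what such a generic
folio_mk_pte() could look like, assuming mk_pte(page, pgprot) is
equivalent to pfn_pte(page_to_pfn(page), pgprot) as it is on most
architectures; this is illustrative, not a patch from this thread:

	/*
	 * Hypothetical generic folio_mk_pte(): built on pfn_pte() so
	 * no per-architecture definition is needed.  Deliberately does
	 * not reproduce s390's conditional pte_mkdirty().
	 */
	static inline pte_t folio_mk_pte(struct folio *folio, pgprot_t pgprot)
	{
		return pfn_pte(folio_pfn(folio), pgprot);
	}

Callers would then do, e.g.:

	entry = folio_mk_pte(folio, vma->vm_page_prot);

and under option (c) the write-fault path in set_pte_range() would be
the place to apply pte_mkdirty() for writable mappings, rather than
hiding that inside an architecture's mk_pte().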