Re: [PATCH] s390: Remove PageDirty check inside mk_pte()

Alexander Gordeev <agordeev@xxxxxxxxxxxxx> · Mon, 3 Feb 2025 14:03:02 +0100

On Thu, Jan 30, 2025 at 10:12:48PM +0000, Matthew Wilcox wrote:
> On Mon, Jan 20, 2025 at 05:39:43PM +0100, Alexander Gordeev wrote:
> > Despite the commit 75edd345e8ed claim above I still observed that a read
> > access to shmem page could end up in creation of a dirty PTE:
> > 
> > 	handle_pte_fault() -> do_pte_missing() -> do_fault() ->
> > 	do_read_fault() -> finish_fault() -> set_pte_range() -> mk_pte()
> > 
> > Our internal discussion (among other things) ended up in a need to really
> > understand where the faults are coming from.
> > 
> > Further, a cursory LTP test was showing ~18K page faults increase, which
> > I did not confirm. That is the first thing I will re-do.
> > 
> > Whether this change is a pre-requisite for something or what is your aim
> > wrt to this patch?
> 
> One of the things that needs to happen for the splitting of struct page
> and struct folio is the removal of all '&folio->page'.  Most are in
> compatibility code or code that is being transitioned and can safely be
> ignored.  One of the places that needs to be changed is:
> 
> 	entry = mk_pte(&folio->page, vma->vm_page_prot);
> (there's a few places like this).
> 
> I believe we need a folio_mk_pte(), and I don't want to define it for
> each architecture.  I don't even want to have a default implementation
> and allow arches to override.

I guess, folio_mk_pte() would then still call arch-specific mk_pte()?

> So the question becomes whether to:
> 
> (a) Get rid of the conditional pte_mkdirty() entirely (this trial
>     balloon)
> (b) Put it in folio_mk_pte() for everybody, not just s390
> (c) Put it in set_pte_range() as David suggested.
> 
> It's feeling like (c) is the best idea.

I will check option (c)

Thanks!