On Wed, Mar 30, 2022 at 09:00:46AM -0400, Johannes Weiner wrote: > On Wed, Mar 30, 2022 at 08:34:08AM +0000, CGEL wrote: > > On Wed, Mar 23, 2022 at 06:11:03AM +0000, CGEL wrote: > > > On Tue, Mar 22, 2022 at 09:27:58AM -0400, Johannes Weiner wrote: > > > > On Tue, Mar 22, 2022 at 02:47:42AM +0000, CGEL wrote: > > > > > On Mon, Mar 21, 2022 at 10:33:20AM -0400, Johannes Weiner wrote: > > > > > > On Wed, Mar 16, 2022 at 06:39:28AM +0000, cgel.zte@xxxxxxxxx wrote: > > > > > > > From: Yang Yang <yang.yang29@xxxxxxxxxx> > > > > > > > > > > > > > > psi tracks the time spent on submitting the IO of refaulting file pages > > > > > > > and anonymous pages[1]. But after we tracks refaulting anonymous pages > > > > > > > in swap_readpage[2][3], there is no need to track refaulting anonymous > > > > > > > pages in submit_bio. > > > > > > > > > > > > > > So this patch can reduce redundant calling of psi_memstall_enter. And > > > > > > > make it easier to track refaulting file pages and anonymous pages > > > > > > > separately. > > > > > > > > > > > > I don't think this is an improvement. > > > > > > > > > > > > psi_memstall_enter() will check current->in_memstall once, detect the > > > > > > nested call, and bail. Your patch checks PageSwapBacked for every page > > > > > > being added. It's more branches for less robust code. > > > > > > > > > > We are also working for a new patch to classify different reasons cause > > > > > psi_memstall_enter(): reclaim, thrashing, compact, etc. This will help > > > > > user to tuning sysctl, for example, if user see high compact delay, he > > > > > may try do adjust THP sysctl to reduce the compact delay. > > > > > > > > > > To support that, we should distinguish what's the reason cause psi in > > > > > submit_io(), this patch does the job. > > > > > > > > Please submit these patches together then. On its own, this patch > > > > isn't desirable. > > > I think this patch has it's independent value, I try to make a better > > > explain. > > You missed the point about it complicating semantics. > > Right now, the bio layer annotates stalls from queue contention. This > is very simple. The swap code has relied on it in the past. It doesn't > now, but that doesn't change what the concept is at the bio layer. > > You patch explicitly codifies that the MM handles swap IOs, and the > lower bio layer handles files. This asymmetry is ugly and error prone. > Yes this asymmetry is bad, we also don't think this patch is perfect. But just as you said below, page cache creating is spread out multiple filesystems, if we move psi_memstall_enter to the upper layer it would maybe horrible repetition and also error prone. I think the primary question is page cache code is not centralized, before it's be solved (not likely?), we may have to make a compromise, and I think this patch is a simple one. Thanks. > If you want type distinction, we should move it all into MM code, like > Christoph is saying. Were swap code handles anon refaults and the page > cache code handles file refaults. This would be my preferred layering, > and my original patch did that: https://lkml.org/lkml/2019/7/22/1070. > > But it was NAKed, and I had to agree with the argument. The page cache > code is not very centralized, and the place where we deal with > individual pages (and detect refaults) and where we submit bios (where > the stalls occur) is spread out into multiple filesystems. There are > 180 submit_bio() calls in fs/; you'd have to audit which ones are for > page cache submissions, and then add stall annotations or use a new > submit_bio_cache() wrapper that handles it. Changes in the filesystem > could easily miss this protocol and silently break pressure detection. > > I would prefer the annotations to be at this level, I just don't see > how to do it cleanly/robustly. Maybe Christoph has an idea, he > understands the fs side much better than I do.