On Sun, Dec 24, 2023 at 2:07 PM Chris Li <chrisl@xxxxxxxxxx> wrote:
>
> On Sun, Dec 24, 2023 at 1:13 PM David Rientjes <rientjes@xxxxxxxxxx> wrote:
> >
> > On Sun, 24 Dec 2023, Chris Li wrote:
> >
> > > On Sat, Dec 23, 2023 at 7:01 PM David Rientjes <rientjes@xxxxxxxxxx> wrote:
> > > >
> > > > On Sat, 23 Dec 2023, Chris Li wrote:
> > > >
> > > > > > How do you quantify the impact of the delayed swap_entry_free()?
> > > > > >
> > > > > > Since the free and memcg uncharge are now delayed, is there not the
> > > > > > possibility that we stay under memory pressure for longer? (Assuming at
> > > > > > least some users are swapping because of memory pressure.)
> > > > > >
> > > > > > I would assume that since the free and uncharge itself is delayed that in
> > > > > > the pathological case we'd actually be swapping *more* until the async
> > > > > > worker can run.
> > > > >
> > > > > Thanks for raising this interesting question.
> > > > >
> > > > > First of all, swap_entry_free() does not impact "memory.current".
> > > > > It reduces "memory.swap.current". Technically it is swap pressure,
> > > > > not memory pressure, that suffers the extra delay.
> > > > >
> > > > > Secondly, we are talking about delaying up to 64 swap entries for a
> > > > > few microseconds.
> > > >
> > > > What guarantees that the async freeing happens within a few microseconds?
> > >
> > > The Linux kernel typically doesn't provide RT scheduling guarantees. You
> > > can change microseconds to milliseconds; my reasoning below still holds.
> >
> > What guarantees that the async freeing happens even within 10s? Your
> > responses imply that there is some deadline by which this freeing
> > absolutely must happen (us or ms), but I don't know of any strong
> > guarantees.
>
> I think we are in agreement there: there are no such strong guarantees
> in Linux scheduling. However, when there are free CPU resources, the
> job will get scheduled to execute in a reasonable time frame. If it
> does not, I consider it a bug that the pending jobs are unable to run
> for a long time while the CPU has idle resources. The existing code
> doesn't provide such a guarantee either; see my point below. I don't
> know why you are asking for such a guarantee.
>
> > If there are no absolute guarantees about when the freeing may now occur,
> > I'm asking how the impact of the delayed swap_entry_free() can be
> > quantified.
>
> Presumably each application has its own SLO metrics for monitoring its
> behavior. I am happy to take a look if any app has new SLO violations
> caused by this change. If you have a metric in mind, please name it so
> we can look at it together. In my current experiment and the Chromebook
> benchmark, I haven't seen such ill effects show up as statistically
> significant drops in the other metrics. That is not the same as saying
> such drops don't exist at all; I just haven't noticed any, and the SLO
> monitoring system hasn't caught any.
>
> > The benefit to the current implementation is that there *are* strong
> > guarantees about when the freeing will occur and cannot grow exponentially
> > before the async worker can do the freeing.
>
> I don't understand your point; please help me. In the current code, a
> previous swapin fault releases its swap slot into the swap slot cache.
> Say that slot remains in the cache for X seconds, until the Nth
> (N < 64) swapin page fault later fills the cache and all 64 cached
> swap slots are finally freed.
> Are you suggesting there is some kind of guarantee that X is less than
> some fixed bound in seconds? What is that bound, then? 10 seconds?
> 1 minute?
>
> BTW, there will be no exponential growth; that is guaranteed. Until the
> 64-entry cache is freed, the swapin code will take the direct free path
> for the swap slot currently in hand. The direct free path existed
> before my change.

FWIW, it's 64 * the number of CPUs.
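To make the behavior being debated concrete, here is a minimal
user-space sketch of that free path, modeled loosely on
free_swap_slot() in mm/swap_slots.c. This is a simulation under
simplifying assumptions, not the kernel code: the per-CPU handling,
locking, and the swapcache_free_entries() internals are elided, and
direct_free()/batch_free() are illustrative stand-ins.

#include <stdbool.h>
#include <stdio.h>

#define SWAP_SLOTS_CACHE_SIZE 64	/* one batch, as in the kernel */

struct slots_cache {
	unsigned long slots_ret[SWAP_SLOTS_CACHE_SIZE];
	int n_ret;	/* entries currently parked in the cache */
	bool usable;	/* stand-in for the trylock/init checks */
};

/* Stand-in for freeing one entry immediately (the direct free path). */
static void direct_free(unsigned long entry)
{
	printf("direct free of entry %lu\n", entry);
}

/* Stand-in for releasing a whole batch of entries at once. */
static void batch_free(unsigned long *entries, int n)
{
	printf("batch free of %d entries (first: %lu)\n", n, entries[0]);
}

/*
 * Sketch of the free path: a freed slot normally parks in the cache
 * until the 64th free fills it, and only then is the whole batch
 * released.  An entry can therefore wait an unbounded time X for
 * later swapins to fill the cache.  If the cache is unusable, the
 * entry takes the direct free path instead.
 */
static void free_swap_slot(struct slots_cache *cache, unsigned long entry)
{
	if (!cache->usable) {
		direct_free(entry);
		return;
	}
	cache->slots_ret[cache->n_ret++] = entry;
	if (cache->n_ret >= SWAP_SLOTS_CACHE_SIZE) {
		batch_free(cache->slots_ret, cache->n_ret);
		cache->n_ret = 0;
	}
}

int main(void)
{
	struct slots_cache cache = { .usable = true };
	unsigned long entry;

	/* 130 frees: two full batches of 64, 2 entries left parked. */
	for (entry = 0; entry < 130; entry++)
		free_swap_slot(&cache, entry);
	printf("%d entries still pending in the cache\n", cache.n_ret);
	return 0;
}

Since the cache is per CPU in the kernel, at most 64 entries can be
parked on each CPU, which is where the 64 * nr_cpus system-wide bound
comes from; the open question in this thread is only how long an
individual entry can sit in a partially filled cache.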