On Wed, Sep 08, 2021 at 11:35:40AM +0800, Zhaoyang Huang wrote:
> On Tue, Sep 7, 2021 at 9:24 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> >
> > On Tue, Sep 07, 2021 at 08:15:30PM +0800, Zhaoyang Huang wrote:
> > > On Tue, Sep 7, 2021 at 8:03 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
> > > >
> > > > On 9/7/21 13:59, Huangzhaoyang wrote:
> > > > > From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> > > > >
> > > > > It doesn't make sense to count IO time into psi memstall. Bail out after
> > > > > bio submitted.
> > > >
> > > > Isn't that the point if psi, to observe real stalls, which include IO?
> >
> > Yes, correct.
> >
> > > IO stalls could be observed within blk_io_schedule. The time cost of
> > > the data from block device to RAM is counted here.
> >
> > Yes, that is on purpose. The time a thread waits for swap read IO is
> > time in which the thread is not productive due to a lack of memory.
> >
> > For async-submitted IO, this happens in lock_page() called from
> > do_swap_page(). If the submitting thread directly waits after the
> > submit_bio(), then that should be accounted too.
> IMO, memstall counting should be terminated by bio submitted. blk
> driver fetching request and the operation on the real device shouldn't
> be counted in. It especially doesn't make sense in a virtualization
> system like XEN etc, where the blk driver is implemented via
> backend-frontend way that introduce memory irrelevant latency

Yes but the entire IO operation and all the associated latency only
happens due to a shortage of memory in the first place. The thread is
incurring these delays due to a lack of memory.

What is a memstall if not the latencies and wait times incurred in the
process of reloading pages that were evicted prematurely?