On Tue, Apr 2, 2019 at 3:45 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Tue 02-04-19 15:38:02, Yafang Shao wrote:
> > On Tue, Apr 2, 2019 at 3:23 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > >
> > > On Tue 02-04-19 14:15:20, Yafang Shao wrote:
> > > > We found that some latency spikes were caused by page cache
> > > > misses on our database server.
> > > > So we decided to measure page cache misses.
> > > > Currently the kernel lacks a facility for measuring them.
> > >
> > > What are you going to use this information for?
> > >
> >
> > With this counter, we can monitor pgcachemiss per second, and this
> > can tell us whether a database performance issue is related to
> > pgcachemiss.
> > For example, if this value increases suddenly, it always causes a
> > latency spike.
> >
> > What's more, I also want to measure how much latency a page cache
> > miss may cause, but this seems more complex to implement.
>
> Aren't tracepoints a better fit for this use case? You not only get the
> count of misses but also the latency. Btw. latency might also be caused
> by a minor fault when you hit lock contention.
>

I have thought about tracepoints before; the reason I didn't choose them
is that the implementation is a little more complex.
I will rethink it.

> > > > This patch introduces a new vm counter PGCACHEMISS for this purpose.
> > > > This counter will be incremented in the scenarios below:
> > > > - page cache miss in generic file read routine
> > > > - read access page cache miss in mmap
> > > > - read access page cache miss in swapin
> > > >
> > > > NB, the readahead routine is not counted because it won't stall
> > > > the application directly.
> > >
> > > Doesn't this partially open the side channel we have closed for mincore
> > > just recently?
> > >
> >
> > Seems I missed this discussion.
> > Could you please give a reference to it?
>
> The long thread starts here
> http://lkml.kernel.org/r/nycvar.YFH.7.76.1901051817390.16954@xxxxxxxxxxxxx
> --
> Michal Hocko
> SUSE Labs
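
For reference, a minimal sketch of the "pgcachemiss per second" monitoring
described above, done from userspace by sampling /proc/vmstat. It assumes
the proposed counter would show up there under the name "pgcachemiss";
that name comes from this patch and is not present on kernels without it,
in which case the rate is reported as None. The helper names
(parse_vmstat, rate_per_second, monitor) are made up for illustration.

```python
import time


def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def rate_per_second(prev, curr, name, interval):
    """Per-second increase of counter `name`, or None if it is absent."""
    if name not in prev or name not in curr or interval <= 0:
        return None
    return (curr[name] - prev[name]) / interval


def monitor(name="pgcachemiss", interval=1.0, path="/proc/vmstat"):
    """Print the per-second rate of `name` until interrupted.

    "pgcachemiss" is an assumption taken from the proposed patch.
    """
    with open(path) as f:
        prev = parse_vmstat(f.read())
    while True:
        time.sleep(interval)
        with open(path) as f:
            curr = parse_vmstat(f.read())
        print(name, rate_per_second(prev, curr, name, interval))
        prev = curr
```

Calling monitor("pgmajfault") exercises the same loop with a counter that
already exists in /proc/vmstat. Note that this only yields a count; it says
nothing about how long each miss stalls the application, which is the part
tracepoints would cover.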