Re: Detecting page cache trashing state

"Ruslan Ruslichenko -X (rruslich - GLOBALLOGIC INC at Cisco)" <rruslich@xxxxxxxxx> · Fri, 27 Oct 2017 23:19:02 +0300

Hi Johannes,

On 10/25/2017 08:54 PM, Johannes Weiner wrote:
Hi Ruslan,

sorry about the delayed response, I missed the new activity in this
older thread.

On Thu, Sep 28, 2017 at 06:49:07PM +0300, Ruslan Ruslichenko -X (rruslich - GLOBALLOGIC INC at Cisco) wrote:
Hi Johannes,

I rebased the patch on top of v4.9.26 (the latest version we support
right now), hopefully correctly, and tested it a bit.
The overall idea definitely looks promising, although I have one
question on usage.
Will it be able to account for the time that processes spend handling
major page faults (including fs and iowait time) on refaulting pages?

That's the main thing it should measure! :)

The lock_page() and wait_on_page_locked() calls are where iowaits
happen on a cache miss. If those are refaults, they'll be counted.

We have one big application whose code occupies a large amount of the
page cache. When the system is under heavy memory pressure and reclaims
some of it, the application starts thrashing constantly. Since its code
is placed on squashfs, it spends all of its CPU time decompressing the
pages, and it seems the memdelay counters are not detecting this
situation.
Here are some counters to indicate this:

19:02:44        CPU     %user     %nice   %system   %iowait    %steal     %idle
19:02:45        all      0.00      0.00    100.00      0.00      0.00      0.00

19:02:44     pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
19:02:45     15284.00      0.00    428.00    352.00  19990.00      0.00      0.00  15802.00      0.00
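
One rough way to read the sar sample above is to relate the counters to each other. The Python sketch below only does that arithmetic. The dictionary keys mirror the sar column names, the reading of pgpgin/s as kilobytes per second follows the sar man page, and the two helper names are made up for illustration. The derived ratios are not something the memdelay patch reports.

```python
# Values copied from the 19:02:45 sar sample above.
sample = {
    "fault/s": 428.0,     # all page faults per second
    "majflt/s": 352.0,    # major faults (needed I/O) per second
    "pgpgin/s": 15284.0,  # KB paged in from disk per second, per sar(1)
}


def major_fault_share(s):
    """Fraction of all page faults that were major faults."""
    return s["majflt/s"] / s["fault/s"] if s["fault/s"] else 0.0


def kb_read_per_major_fault(s):
    """Average KB paged in per major fault; 0.0 if there were none."""
    return s["pgpgin/s"] / s["majflt/s"] if s["majflt/s"] else 0.0


if __name__ == "__main__":
    print(f"major fault share: {major_fault_share(sample):.1%}")
    print(f"KB read per major fault: {kb_read_per_major_fault(sample):.1f}")
```

For this sample the share of major faults comes out at about 82.2%, with roughly 43.4 KB read per major fault.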

And since nobody is actively allocating memory anymore, it looks like
the memdelay counters are not actively incremented:

[:~]$ cat /proc/memdelay
268035776
6.13 5.43 3.58
1.90 1.89 1.26

How does it correlate with /proc/vmstat::workingset_activate during
that time? It only counts thrashing time of refaults it can actively
detect.

The workingset counters are growing quite actively too. Here are
some numbers per second:

workingset_refault   8201
workingset_activate   389
workingset_restore   187
workingset_nodereclaim   313
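
Per-second numbers like these can be obtained by reading /proc/vmstat twice and dividing the difference by the interval. Below is a minimal Python sketch of that approach, assuming the counters appear as "name value" lines. The helper names (parse_vmstat, rates, sample) are made up for illustration, and the exact set of workingset_* counters depends on the kernel version and the patches applied.

```python
import time


def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def rates(before, after, seconds, prefix="workingset_"):
    """Per-second increase of every counter whose name starts with prefix."""
    return {
        name: (after[name] - before[name]) / seconds
        for name in sorted(after)
        if name.startswith(prefix) and name in before
    }


def sample(path="/proc/vmstat", interval=1.0):
    """Read the file twice, `interval` seconds apart, and return the rates.

    The nominal interval is used as the divisor, so the result is approximate.
    """
    with open(path) as f:
        before = parse_vmstat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_vmstat(f.read())
    return rates(before, after, interval)


if __name__ == "__main__":
    for name, per_sec in sample().items():
        print(f"{name:28s} {per_sec:10.0f}")
```

Running it in a loop while the application is thrashing gives the same kind of per-second view as the numbers above.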

Btw, how many CPUs does this system have? There is a bug in this
version on how idle time is aggregated across multiple CPUs. The error
compounds with the number of CPUs in the system.

The system has 2 CPU cores.

I'm attaching 3 bugfixes that go on top of what you have. There might
be some conflicts, but they should be minor variable naming issues.

I will test with your patches and get back to you.

Thanks,
Ruslan
