Lock overhead in shrink_inactive_list / Slow page reclamation

Hello,

We have a performance issue with the page cache. One of our workloads
spends more than 50% of its time waiting on the lru_lock taken by
shrink_inactive_list in mm/vmscan.c.

The workload is simple but stresses the page cache a lot: a big file
is mmapped and multiple threads stream chunks of it; the chunk sizes
range from a few KB to a few MB. The file is about 1TB and is stored
on a very fast SSD (2.6GB/s bandwidth). Our machine has 64GB of RAM.
We rely on the page cache to cache data, but obviously pages have to
be reclaimed quite often to make room for new data. The workload is
*read only*, so we would expect page reclamation to be fast, but it's
not: in some runs the page cache only reclaims pages at 500-600MB/s.
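
To make the access pattern concrete, here is a minimal sketch of what
a single reader thread does. The file path, chunk offset and chunk
size are placeholders for this sketch; the real workload has many
threads, each streaming chunks of a few KB to a few MB:

#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Touch every page of one chunk of the mapping; this is what fills
 * the page cache and eventually forces reclaim. */
static void stream_chunk(const uint8_t *base, size_t off, size_t len)
{
        volatile uint8_t sink = 0;
        size_t i;

        for (i = 0; i < len; i += 4096)
                sink ^= base[off + i];
        (void)sink;
}

int main(void)
{
        int fd = open("/data/bigfile", O_RDONLY);   /* ~1TB file */
        struct stat st;
        uint8_t *base;

        if (fd < 0 || fstat(fd, &st) < 0)
                return 1;

        base = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (base == MAP_FAILED)
                return 1;

        /* In the real program this runs in a loop across many threads. */
        stream_chunk(base, 0, 1 << 20);

        munmap(base, st.st_size);
        close(fd);
        return 0;
}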

We have tried playing with fadvise to speed up page reclamation
(e.g., using POSIX_FADV_DONTNEED), but that didn't help.
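
Roughly, what we tried looks like the following (drop_chunk is just a
name for this sketch; fd, offset and len are whatever chunk a thread
has just finished streaming):

#define _POSIX_C_SOURCE 200112L
#include <fcntl.h>

/* Hint that a chunk we just finished is no longer needed, so its
 * pages can be dropped from the page cache.  In our tests this did
 * not make reclaim noticeably faster. */
static void drop_chunk(int fd, off_t offset, off_t len)
{
        posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
}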

Increasing the value of SWAP_CLUSTER_MAX to 256UL helped (as
suggested here: https://lkml.org/lkml/2015/7/6/440), but we are still
spending most of our time waiting for the page cache to reclaim
pages. Increasing the value beyond 256 doesn't help:
shrink_inactive_list never reclaims more than a few hundred pages at
a time. (I don't know why, and I'm not sure how to profile this, but
I'm willing to spend time debugging the issue if you have ideas.)
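
For reference, the change we applied is essentially the following
(against the stock definition in include/linux/swap.h, which is 32UL
in our tree):

/* include/linux/swap.h */
#define SWAP_CLUSTER_MAX 256UL          /* was 32UL */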

Any ideas about anything else we could try to speed up page
reclamation?

Thanks,
Baptiste.



