On 12/11/19 2:04 PM, Johannes Weiner wrote:
> On Wed, Dec 11, 2019 at 12:18:38PM -0800, Linus Torvalds wrote:
>> On Wed, Dec 11, 2019 at 12:08 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>>
>>> $ cat /proc/meminfo | grep -i active
>>> Active:           134136 kB
>>> Inactive:       28683916 kB
>>> Active(anon):      97064 kB
>>> Inactive(anon):        4 kB
>>> Active(file):      37072 kB
>>> Inactive(file): 28683912 kB
>>
>> Yeah, that should not put pressure on some swap activity. We have 28
>> GB of basically free inactive file data, and the VM is doing something
>> very very bad if it then doesn't just quickly free it with no real
>> drama.
>
> I was looking at this with Jens offline last week. One thing to note
> is the rate of IO that Jens is working with: combined with the low
> cache hit rate, it was pushing upwards of half a million pages through
> the page cache each second.
>
> There isn't anything obvious sticking out in the kswapd profile: it's
> dominated by cache tree deletions (or rather replacing pages with
> shadow entries, hence the misleading xas_store()), tree lock
> contention, etc. - all work that a direct reclaimer would have to do
> as well, with one exception: RWC_UNCACHED doesn't need to go through
> the LRU list, and 8-9% of kswapd cycles alone are going into
> physically getting pages off the list. (And I suspect part of that is
> also contention over the LRU lock as kswapd gets overwhelmed and
> direct reclaim kicks in).
>
> Jens, how much throughput difference does kswapd vs RWC_UNCACHED make?

It's not huge, like 5-10%. The CPU usage difference is the most noticeable,
particularly at the higher IO rates.

-- 
Jens Axboe