Tejun reported stuttering and latency spikes on a system where random tasks would enter direct reclaim and get stuck on dirty pages. Around 50% of memory was occupied by tmpfs backed by an SSD, and another (rotating) disk was reading and writing at max speed to shrink a partition.

Analysis:

When calculating the amount of dirtyable memory, the VM considers all free memory and all file and anon pages as the baseline to which the dirty limits are applied. This implies that, given memory pressure from dirtied cache, the VM would actually start swapping to make room. But alas, this is not really the case: page reclaim tries very hard not to swap as long as there is any used-once cache available.

The dirty limit may have been 10-15% of main memory, but page cache was less than 50% of main memory, which means that roughly a third of the pages that the reclaimers actually looked at were dirty. Kswapd stopped making progress, and in turn allocators were forced into direct reclaim, only to get stuck on dirty/writeback congestion.

These two patches fix the dirtyable memory calculation to acknowledge the fact that the VM does not really replace anon with dirty cache. As such, anon memory can no longer be considered "dirtyable."

Longer term we probably want to look into reducing some of the bias towards cache. The problematic workload in particular was not even using any of the anon pages; one swap burst could have resolved it.
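
To make the arithmetic in the analysis concrete, here is a minimal sketch of the two baselines. It is not the kernel's code: the struct, the function names and the page counts are all hypothetical, and for the purpose of the illustration the tmpfs pages are assumed to be accounted as anon (swap-backed) memory.

```c
/*
 * Illustrative sketch only; none of these names exist in the kernel.
 * All quantities are page counts.
 */
struct mem_snapshot {
	unsigned long free_pages;
	unsigned long file_pages;	/* regular page cache */
	unsigned long anon_pages;	/* assumed to include tmpfs here */
};

/* Baseline as described in the analysis: free + file + anon. */
static unsigned long dirtyable_with_anon(const struct mem_snapshot *m)
{
	return m->free_pages + m->file_pages + m->anon_pages;
}

/* Baseline once anon is no longer considered dirtyable: free + file. */
static unsigned long dirtyable_without_anon(const struct mem_snapshot *m)
{
	return m->free_pages + m->file_pages;
}

/* Apply a percentage dirty limit to a dirtyable baseline. */
static unsigned long dirty_limit(unsigned long dirtyable, unsigned int pct)
{
	return dirtyable * pct / 100;
}

/* Share of the file cache (in percent, rounded down) that may be dirty. */
static unsigned long dirty_share_of_cache(unsigned long limit,
					  const struct mem_snapshot *m)
{
	return limit * 100 / m->file_pages;
}
```

With a made-up snapshot of 100000 free, 400000 file and 500000 anon pages and a 15% limit, the old baseline allows 150000 dirty pages, which is 37% of the file cache that reclaim actually scans. Dropping anon from the baseline halves that to 75000 pages, or 18% of the cache.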