Usama Arif <usamaarif642@xxxxxxxxx> writes:
>
> This patch-series is an attempt to mitigate the issue of running out of
> memory when THP is always enabled. During runtime whenever a THP is being
> faulted in or collapsed by khugepaged, the THP is added to a list.
> Whenever memory reclaim happens, the kernel runs the deferred_split
> shrinker which goes through the list and checks if the THP was underutilized,
> i.e. how many of the base 4K pages of the entire THP were zero-filled.

Sometimes when writing a benchmark I fill things with zero explicitly
to avoid faults later. For example, if you want to measure memory read
bandwidth you need to fault the pages first, but that fault pattern
may well be zero.

With your patch, if there is memory pressure there are two effects:

- If things are remapped to the zero page, the benchmark
  reading memory may give unrealistically good results because
  what it thinks is a big memory area is actually only backed
  by a single page.

- If I expect to write, I may end up with an unexpected
  zeropage->real memory fault if the pages got remapped.

I expect such patterns can happen without benchmarking too.
I could see it being a problem for latency sensitive applications.

Now you could argue that this all should only happen under memory
pressure, and when that happens things may be slow anyways and your
patch will still be an improvement.

Maybe that's true, but there might still be corner cases
which are negatively impacted by this. I don't have a good solution
other than a tunable, but I expect it will cause problems for someone.

The other problem I have with your patch is that it may cause the kernel
to pollute CPU caches in the background, which again will cause noise in
the system. Instead of plain memchr_inv, you should probably use some
primitive to bypass caches, or use an NTA prefetch hint at least.

-Andi
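
To illustrate the NTA prefetch suggestion, here is a minimal userspace
sketch. It is not kernel code and not what the patch does. It scans each 4K
base page of a THP-sized buffer for non-zero bytes and issues a prefetch
with locality 0 ahead of the reads, which GCC and Clang typically emit as
PREFETCHNTA on x86. The helper names range_is_zero_nta and count_zero_pages
and the 64-byte cache line size are assumptions made for this example.
The hint is only advisory, so how much cache pollution it avoids depends
on the CPU.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed cache line size, not queried from the CPU */

/*
 * Hypothetical helper: return true if buf[0..len) contains only zero bytes.
 * Each cache line is hinted as non-temporal (locality 0) one line before
 * it is read. A prefetch never faults, so hinting an arbitrary address
 * is safe.
 */
static bool range_is_zero_nta(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i = 0;

	__builtin_prefetch(p, 0, 0);		/* hint the first line */

	while (i < len) {
		size_t chunk = len - i < CACHE_LINE ? len - i : CACHE_LINE;

		/* hint the following line before scanning the current one */
		if (i + CACHE_LINE < len)
			__builtin_prefetch(p + i + CACHE_LINE, 0, 0);

		for (size_t j = 0; j < chunk; j++)
			if (p[i + j])
				return false;	/* stop at the first non-zero byte */

		i += chunk;
	}
	return true;
}

/*
 * Hypothetical helper: count how many page_size-sized base pages of a
 * region are entirely zero-filled.
 */
static size_t count_zero_pages(const void *region, size_t region_size,
			       size_t page_size)
{
	size_t zero_pages = 0;

	for (size_t off = 0; off + page_size <= region_size; off += page_size)
		if (range_is_zero_nta((const unsigned char *)region + off,
				      page_size))
			zero_pages++;

	return zero_pages;
}
```

For a 2M region with 4K base pages, count_zero_pages(region, 2 << 20, 4096)
would return a value between 0 and 512. True cache-bypassing loads would
need architecture-specific primitives, which this sketch does not attempt.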