On Thu, Jan 19, 2017 at 01:52:38PM +0100, Vlastimil Babka wrote:
> On 01/13/2017 08:14 AM, js1304@xxxxxxxxx wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> >
> > We have a statistic about memory fragmentation, but it fluctuates
> > a lot within a very short time, so it's hard to accurately measure
> > the system's fragmentation state while a workload is actively
> > running. Without a stable statistic, it's not possible to determine
> > whether the system is fragmented or not.
> >
> > Meanwhile, there have recently been a lot of reports about
> > fragmentation problems, and we tried some changes. However, since
> > there is no way to measure the fragmentation ratio stably, we cannot
> > be sure how much these changes help with fragmentation.
> >
> > There are some methods to measure fragmentation, but I think they
> > have some problems.
> >
> > 1. buddyinfo: it fluctuates a lot within a very short time
> > 2. tracepoint: it shows how stealing happens between buddy lists of
> >    different migratetypes. It indicates fragmentation indirectly but
> >    would not be accurate.
> > 3. pageowner: it shows the number of mixed pageblocks, but it is not
> >    suitable for production systems since it requires some additional
> >    memory.
> >
> > Therefore, this patch tries to calculate an exponential moving
> > average of the unusable free index. Since it is a moving average, it
> > is quite stable even if the fragmentation state of memory fluctuates
> > a lot.
>
> I suspect that the fluctuation of the underlying unusable free index
> isn't so much because the number of high-order free blocks would
> fluctuate, but because of allocation vs reclaim changing the total
> number of free blocks, which is used in the equation. Reclaim uses LRU
> which I expect to have low correlation with pfn, so the freed pages tend
> towards order-0. And the allocation side tries not to split large pages
> so it also consumes mostly order-0.

I introduced this metric because I observed fluctuation of the unusable
free index.
:)

> So I would expect just plain free_blocks_order from contig_page_info to
> be a good metric without need for averaging, at least for costly orders
> and when we have enough free memory - if we are below e.g. the high
> (order-0) watermark, then we should let kswapd do its job first anyway
> before considering proactive compaction.

Maybe plain free_blocks_order would be stable for order 7 or more, but
it's better to have a metric that works well for all orders.

Thanks.
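
To make the two quantities discussed above concrete, here is a minimal
sketch in Python (not kernel code). It assumes the usual definition of the
unusable free index as the share of free pages sitting in blocks smaller
than the requested order, expressed here as a fraction between 0 and 1.
The function names, the smoothing factor `alpha`, and the sample numbers
are all hypothetical; the weighting the patch actually uses is not shown in
this thread.

```python
def unusable_free_index(free_counts, order):
    """Fraction of free pages that cannot serve an allocation of `order`.

    free_counts[o] is the number of free blocks of order o (as in buddyinfo).
    """
    free_pages = sum(n << o for o, n in enumerate(free_counts))
    if free_pages == 0:
        # Convention chosen for this sketch: no free memory means
        # all of it is unusable.
        return 1.0
    # Pages held in free blocks that are large enough for the request.
    usable_pages = sum(n << o for o, n in enumerate(free_counts) if o >= order)
    return (free_pages - usable_pages) / free_pages


def ema_update(prev, sample, alpha):
    """One step of an exponential moving average (alpha is hypothetical)."""
    return prev + alpha * (sample - prev)


def smooth(samples, alpha):
    """Return the running EMA of `samples`, seeded with the first sample."""
    ema = samples[0]
    out = []
    for s in samples:
        ema = ema_update(ema, s, alpha)
        out.append(ema)
    return out


# Made-up workload: the number of free order-3 blocks stays at 4, while
# allocation and reclaim swing the number of free order-0 pages.
order0_history = [100, 2000, 100, 2000, 100, 2000]
raw = [unusable_free_index([n, 0, 0, 4], 3) for n in order0_history]
smoothed = smooth(raw, alpha=0.1)

if __name__ == "__main__":
    for r, s in zip(raw, smoothed):
        print(f"raw={r:.3f}  ema={s:.3f}")
```

In this toy run the raw index swings even though the count of free order-3
blocks never changes, which is the effect described in the quoted reply,
while the averaged value moves far less between samples.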